langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-01-23 13:19:22 +00:00

Author	SHA1	Message	Date
Zeeland	ad7089a6d0	fix: change ddg to DDGS (#6480 ) This commit updates the duckduckgo search utility by using a more accurate name in the import statement.	2023-06-20 10:15:05 -07:00
Davis Chase	8cd5f65a6f	release 207 (#6488 )	2023-06-20 10:14:29 -07:00
zhaoshengbo	ab44c24333	Add Alibaba Cloud OpenSearch as a new vector store (#6154 ) Hello Folks, Thanks for creating and maintaining this great project. I'm excited to submit this PR to add Alibaba Cloud OpenSearch as a new vector store. OpenSearch is a one-stop platform to develop intelligent search services. OpenSearch was built based on the large-scale distributed search engine developed by Alibaba. OpenSearch serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. OpenSearch helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises. OpenSearch provides the vector search feature. In specific scenarios, especially test question search and image search scenarios, you can use the vector search feature together with the multimodal search feature to improve the accuracy of search results. This PR includes: A AlibabaCloudOpenSearch class that can connect to the Alibaba Cloud OpenSearch instance. add embedings and metadata into a opensearch datasource. querying by squared euclidean and metadata. integration tests. ipython notebook and docs. I have read your contributing guidelines. And I have passed the tests below - [x] make format - [x] make lint - [x] make coverage - [x] make test --------- Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-06-20 10:07:40 -07:00
Davis Chase	b7ad4c4c30	fix openai qa chain (#6487 )	2023-06-20 10:01:13 -07:00
thehunmonkgroup	10adec5f1b	add FunctionMessage support to `_convert_dict_to_message()` in OpenAI chat model (#6382 ) Already supported in the reverse operation in `_convert_message_to_dict()`, this just provides parity. @hwchase17 @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-20 08:25:55 -07:00
Harrison Chase	7414e9d196	bump version to 206 (#6465 )	2023-06-19 23:05:09 -07:00
Hubert	22601b0b63	fix neo4j schema query (#6381 ) Fix issue #6380 <!-- Remove if not applicable --> Fixes #6380 (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: HubertKl <HubertKl>	2023-06-19 22:48:35 -07:00
Gavin	b0d80c4b3e	Update serpapi.py Support baidu list type answer_box (#6386 ) Support baidu list type answer_box From [this document](https://serpapi.com/baidu-answer-box), we can know that the answer_box attribute returned by the Baidu interface is a list, and the list contains only one Object, but an error will occur when the current code is executed. So when answer_box is a list, we reset res["answer_box"] so that the code can execute successfully.	2023-06-19 22:48:18 -07:00
Bryce Drennan	384fa43fc3	fix: llm caching for replicate (#6396 ) Caching wasn't accounting for which model was used so a result for the first executed model would return for the same prompt on a different model. This was because `Replicate._identifying_params` did not include the `model` parameter. FYI - @cbh123 - @hwchase17 - @agola11	2023-06-19 22:47:59 -07:00
Zeeland	8a604b93ab	feat: use latest duckduckgo_search API to call (#6409 ) # Provider the latest duckduckgo_search API The Git commit contents involve two files related to some DuckDuckGo query operations, and an upgrade of the DuckDuckGo module to version 3.8.3. A suitable commit message could be "Upgrade DuckDuckGo module to version 3.8.3, including query operations". Specifically, in the duckduckgo_search.py file, a DDGS() class instance is newly added to replace the previous ddg() function, and the time parameter name in the get_snippets() and results() methods is changed from "time" to "timelimit" to accommodate recent changes. In the pyproject.toml file, the duckduckgo-search module is upgraded to version 3.8.3. [duckduckgo_search readme attention](https://github.com/deedy5/duckduckgo_search): Versions before v2.9.4 no longer work as of May 12, 2023 ## Who can review? @vowelparrot --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:47:39 -07:00
Harrison Chase	9eec7c3206	Harrison/unstructured page number (#6464 ) Co-authored-by: Reza Sanaie <reza@sanaie.ca>	2023-06-19 22:31:43 -07:00
Alonso Silva Allende	b82ddf9cfb	Improve error message (#6275 ) Trying to use OpenAI models like 'text-davinci-002' or 'text-davinci-003' the agent doesn't work and the message is 'Only supported with OpenAI models.' The error message should be 'Only supported with ChatOpenAI models.' My Twitter handle is @alonsosilva <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> Co-authored-by: SILVA Alonso <alonso.silva@nokia-bell-labs.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:21:01 -07:00
zengbo	7e5f5ebf86	Fix the issue where ANTHROPIC_API_URL set in environment is not takin… (#6400 ) I apologize for the error: the 'ANTHROPIC_API_URL' environment variable doesn't take effect if the 'anthropic_api_url' parameter has a default value. #### Who can review? Models - @hwchase17 - @agola11	2023-06-19 22:20:36 -07:00
Grayson Adkins	9f5f747dc3	Fix broken links in autonomous agents docs (#6398 ) Fixes broken links here: https://python.langchain.com/docs/use_cases/autonomous_agents.html #### Who can review? Tag maintainers/contributors who might be interested: Agents / Tools / Toolkits - @hwchase17	2023-06-19 22:20:00 -07:00
volodymyr-memsql	d2e9b621ab	Update SinglStoreDB vectorstore (#6423 ) 1. Introduced new distance strategies support: DOT_PRODUCT and EUCLIDEAN_DISTANCE for enhanced flexibility. 2. Implemented a feature to filter results based on metadata fields. 3. Incorporated connection attributes specifying "langchain python sdk" usage for enhanced traceability and debugging. 4. Expanded the suite of integration tests for improved code reliability. 5. Updated the existing notebook with the usage example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:08:58 -07:00
Avinash Raj	6efd5fa2b9	Fix for #6431 - chatprompt template with partial variables giing validation error (#6456 ) W.r.t recent changes, ChatPromptTemplate does not accepting partial variables. This PR should fix that issue. Fixes #6431 #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:08:15 -07:00
Harrison Chase	02c0a1e77e	Harrison/functions in retrieval (#6463 )	2023-06-19 22:07:58 -07:00
Swapnil Sharma	dc4ffa8d9b	Incorrect argument count handling (#5543 ) Throwing ToolException when incorrect arguments are passed to tools so that that agent can course correct them. # Incorrect argument count handling I was facing an error where the agent passed incorrect arguments to tools. As per the discussions going around, I started throwing ToolException to allow the model to course correct. ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:06:20 -07:00
kYLe	3a58c4c3a0	Fixed a link typo /-/route -> /-/routes. and change endpoint format (#6186 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes a link typo from `/-/route` to `/-/routes`. and change endpoint format from `f"{self.anyscale_service_url}/{self.anyscale_service_route}"` to `f"{self.anyscale_service_url}{self.anyscale_service_route}"` Also adding documentation about the format of the endpoint #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:05:54 -07:00
Leonid Ganeline	03b16ed2b1	docs `retrievers` fixes (#6299 ) Fixed several inconsistencies: - file names and notebook titles should be similar otherwise ToC on the [retrievers page](https://python.langchain.com/en/latest/modules/indexes/retrievers.html) and on the left ToC tab are different. For example, now, `Self-querying with Chroma` is not correctly alphabetically sorted because its file named `chroma_self_query.ipynb` - `Stringing compressors and document transformers...` demoted from `#` to `##`. Otherwise, it appears in Toc. - several formatting problems #### Who can review? @hwchase17 @dev2049 Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:04:35 -07:00
M. Tolga Cangöz	bccee85c8f	Update introduction.mdx (#6425 ) Fix typo	2023-06-19 22:04:09 -07:00
Nir Gazit	95b77a5215	Fix Custom LLM Agent example (#6429 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> The `CustomOutputParser` needs to throw `OutputParserException` when it fails to parse the response from the agent, so that the executor can [catch it and retry](`be9371ca8f/langchain/agents/agent.py (L767)`) when `handle_parsing_errors=True`. <!-- Remove if not applicable --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-19 22:03:58 -07:00
ykerus	b697bbb5b5	Remove backticks without clear purpose from docs (#6442 ) #### Description - Removed two backticks surrounding the phrase "chat messages as" - This phrase stood out among other formatted words/phrases such as `prompt`, `role`, `PromptTemplate`, etc., which all seem to have a clear function. - `chat messages as`, formatted as such, confused me while reading, leading me to believe the backticks were misplaced. #### Who can review? @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-19 22:03:38 -07:00
Dhruvil Shah	9494623869	Update web_base.ipynb (#6430 ) Minor new line character in the markdown. Also, this option is not yet in the latest version of LangChain (0.0.190) from Conda. Maybe in the next update. @eyurtsev @hwchase17	2023-06-19 21:43:35 -07:00
Wenchen Li	76ae9da9db	Add `_similarity_search_with_relevance_scores` in `Pinecone` (#6446 ) Just so it is consistent with other `VectorStore` classes. This is a follow-up of #6056 which also discussed the potential of adding `similarity_search_by_vector_returning_embeddings` that we will continue the discussion here. potentially related: #6286 #### Who can review? Tag maintainers/contributors who might be interested: @rlancemartin <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-19 21:36:40 -07:00
Ismail Pelaseyed	d4e8e0f5ab	Add example for question answering over documents with OpenAI Function Agent (#6448 ) This PR adds an example of doing question answering over documents using OpenAI Function Agents. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 21:35:45 -07:00
Andrey Avtomonov	68a675cc68	Remove extra word in the introduction documentation (#6450 ) Removed an extra word in the introduction documentation, a simple typo	2023-06-19 21:31:17 -07:00
Ankush Gola	a9246333fd	fix anthropic chat model mutating input list (#6457 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes: ChatAnthropic was mutating the input message list during formatting which isn't ideal bc you could be changing the behavior for other chat models when using the same input #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested:	2023-06-19 21:30:52 -07:00
Zander Chase	bc0af67aaf	Add Trajectory Eval RunEvaluator (#6449 )	2023-06-19 21:11:50 -07:00
Hakan Tekgul	6a157cf8bb	Update arize_callback.py (#6433 ) Arize released a new Generative LLM Model Type, adjusting the callback function to new logging. Added arize imports, please delete if not necessary. Specifically, this change makes sure that the prompt and response pairs from LangChain agents are logged into Arize as a Generative LLM model, instead of our previous categorical model. In order to do this, the callback functions collects the necessary data and passes the data into Arize using Python Pandas SDK. Arize library, specifically pandas.logger is an additional dependency. Notebook For Test: https://docs.arize.com/arize/resources/integrations/langchain Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead Tracing / Callbacks @agola11	2023-06-19 18:33:49 -07:00
Zander Chase	00f276d23f	Run eval in eval mode (#6447 ) For the `run_on_dataset` sessions	2023-06-19 18:31:38 -07:00
Harrison Chase	1300a4bc8c	expose docs chains (#6453 )	2023-06-19 17:18:54 -07:00
Harrison Chase	286452c7f0	remove mongo	2023-06-19 10:04:14 -07:00
David Duong	be9371ca8f	Include placeholder value for all secrets, not just kwargs (#6421 ) Mirror PR for https://github.com/hwchase17/langchainjs/pull/1696 Secrets passed via environment variables should be present in the serialised chain	2023-06-19 15:41:45 +01:00
Harrison Chase	df40cd233f	bump version to 205 (#6410 )	2023-06-18 23:21:26 -07:00
Harrison Chase	e9c2b280db	Harrison/refactor functions (#6408 )	2023-06-18 23:13:42 -07:00
Harrison Chase	6a4a950a3c	changes to llm chain (#6328 ) - return raw and full output (but keep run shortcut method functional) - change output parser to take in generations (good for working with messages) - add output parser to base class, always run (default to same as current) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-06-18 22:49:47 -07:00
Davis Chase	d3c2eab0b3	Docs nit (#6350 )	2023-06-18 20:58:12 -07:00
Davis Chase	af96de6552	fix prod docs build (#6402 )	2023-06-18 20:56:12 -07:00
Fei Wang	50556f3b35	support memory for functions (#6165 ) #### Before submitting Add memory support for `OpenAIFunctionsAgent` like `StructuredChatAgent`. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 19:00:40 -07:00
Dhruvil Shah	b2b9ded12f	Update web_base.py _fetch() method For SiteMapLoader (#6256 ) A must-include for SiteMap Loader to avoid the SSL verification error. Setting the 'verify' to False by ``` sitemap_loader.requests_kwargs = {"verify": False}``` does not bypass the SSL verification in some websites. There are websites (https:// researchadmin.asu.edu/ sitemap.xml) where setting "verify" to False as shown below would not work: sitemap_loader.requests_kwargs = {"verify": False} We need this merge to tell the Session to use a connector with a specific argument about SSL: \# For SiteMap SSL verification if not self.request_kwargs['verify']: connector = aiohttp.TCPConnector(ssl=False) else: connector = None <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #5483 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 18:34:18 -07:00
Harrison Chase	10bff4ecc4	Harrison/chroma fix (#6390 ) Co-authored-by: Junu Moon(Fran) <francomoon7@gmail.com>	2023-06-18 18:33:26 -07:00
Harrison Chase	5c1fa3e70e	Harrison/typesense fix (#6391 ) Co-authored-by: Gaurav Chauhan <2796gaurav@gmail.com> Co-authored-by: gaurav <gaurav.chauhan1@rksv.in>	2023-06-18 18:33:15 -07:00
Harrison Chase	5ccebce777	rm pandas from arize (#6392 )	2023-06-18 18:33:04 -07:00
matias-biatoz	3b7c4c51d5	Added gpt-3.5-turbo 0613 16k and 16k-0613 pricing (#6287 ) @agola11 Issue #6193 I added the new pricing for the new models. Also, now gpt-3.5-turbo got split into "input" and "output" pricing. It currently does not support that.	2023-06-18 18:32:20 -07:00
Ly Nguyen	1e0af59f69	- Fix pass system_message argument in new feature openai_functions_agent (#6297 ) can't pass system_message argument, the prompt always show default message "System: You are a helpful AI assistant." ``` system_message = SystemMessage( content="You are an AI that provides information to Human regarding documentation." ) agent = initialize_agent( tools, llm=openai_llm_chat, agent=AgentType.OPENAI_FUNCTIONS, system_message=system_message, agent_kwargs={ "system_message": system_message, }, verbose=False, ) ``` #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:54:00 -07:00
georgian	e64bafed3a	Fixes typo in Vectara.similarity_search (#6277 ) Fixes a simple typo. @hwchase17 @dev2049 Co-authored-by: Georgian Sarghi <georgian.sarghi@gmail.com>	2023-06-18 17:48:54 -07:00
Ted	112695e4da	Iterate through filtered file types instead of all listed files (#6258 ) # Iterate through filtered file types instead of all listed files Fixes https://github.com/hwchase17/langchain/issues/6257 https://github.com/hwchase17/langchain/pull/4926 originally added the functionality to filter by file type, storing the filtered files in `_files` https://github.com/hwchase17/langchain/pull/5220 removed the functionality when adding code to filter trashed files by using the `files` variables instead of the `_files` variable. This PR simply adds the functionality back by using `_files` again. #### Who can review? @hwchase17 - project lead @eyurtsev	2023-06-18 17:47:58 -07:00
Dhruvil Shah	ba90e3c990	Update web_base.ipynb for guiding purposes (#6248 ) To bypass SSL verification errors during fetching, you can include the `verify=False` parameter. This markdown proves useful, especially for beginners in the field of web scraping. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #6079 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:10 -07:00
Dhruvil Shah	92f05a67a4	Add markdown to specify important arguments (#6246 ) To bypass SSL verification errors during web scraping, you can include the ssl_verify=False parameter along with the headers parameter. This combination of arguments proves useful, especially for beginners in the field of web scraping. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #1829 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:00 -07:00
ikebo	ca7a44d024	add max_context_size property in BaseOpenAI (#6239 ) Hi, I make a small improvement for BaseOpenAI. I added a max_context_size attribute to BaseOpenAI so that we can get the max context size directly instead of only getting the maximum token size of the prompt through the max_tokens_for_prompt method. Who can review? @hwchase17 @agola11 I followed the [Common Tasks](`c7db9febb0/.github/CONTRIBUTING.md`), the test is all passed. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:46:35 -07:00
Jan Pawellek	3e3ed8c5c9	Fix LLM types so that they can be loaded from config dicts (#6235 ) LLM configurations can be loaded from a Python dict (or JSON file deserialized as dict) using the [load_llm_from_config](`8e1a7a8646/langchain/llms/loading.py (L12)`) function. However, the type string in the `type_to_cls_dict` lookup dict differs from the type string defined in some LLM classes. This means that the LLM object can be saved, but not loaded again, because the type strings differ.	2023-06-18 17:46:22 -07:00
Shu	46782ad79b	Fixed an unhandled error that was raised when DynamoDB did not have any chat history. (#6141 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> The current version of chat history with DynamoDB doesn't handle the case correctly when a table has no chat history. This change solves this error handling. <!-- Remove if not applicable --> Fixes https://github.com/hwchase17/langchain/issues/6088 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:39:19 -07:00
Cameron Vetter	2286204354	Correct AzureSearch Vector Store not applying search_kwargs when searching (#6132 ) Fixes #6131 Simply passes kwargs forward from similarity_search to helper functions so that search_kwargs are applied to search as originally intended. See bug for repro steps. #### Who can review? @hwchase17 @dev2049 Twitter: poshporcupine	2023-06-18 17:39:06 -07:00
Pierre Dulac	395a2a3724	Fix typo in the CAI critique prompt (#6123 ) Very small typo in the Constitutional AI critique default prompt. The negation "If there is no material critique of ..." is used two times, should be used only on the first one. Cheers, Pierre	2023-06-18 17:38:56 -07:00
Hao Chen	38057f0d2e	Fix latest clickhouse vector schema change (#6385 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes https://github.com/hwchase17/langchain/issues/6208 <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> VectorStores / Retrievers / Memory - @dev2049	2023-06-18 17:34:53 -07:00
Davit Buniatyan	1ab9dc8293	[hotfix] Deep Lake fails on newer version due to hardcode (#6383 ) Hot Fixes for Deep Lake [would highly appreciate expedited review] * deeplake version was hardcoded and since deeplake upgraded the integration fails with confusing error * an additional integration test fixed due to embedding function * Additionally fixed docs for code understanding links after docs upgraded * notebook removal of public parameter to make sure code understanding notebook works #### Who can review? @hwchase17 @dev2049 --------- Co-authored-by: Davit Buniatyan <d@activeloop.ai>	2023-06-18 17:33:49 -07:00
hp0404	6aa7b04f79	Fix integration tests for Faiss vector store (#6281 ) Fixes #5807 (issue) #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:25:49 -07:00
Chakib Benziane	ddd518a161	searx_search: updated tools and doc (#6276 ) - Allows using the same wrapper to create multiple tools ```python wrapper = SearxSearchWrapper(searx_host="**") github_tool = SearxSearchResults(name="Github", wrapper=wrapper, kwargs = { "engines": ["github"], }) arxiv_tool = SearxSearchResults(name="Arxiv", wrapper=wrapper, kwargs = { "engines": ["arxiv"] }) ``` - Updated link to searx documentation Agents / Tools / Toolkits - @hwchase17	2023-06-18 17:23:12 -07:00
ju-bezdek	e2f36ee608	OpenAI functions dont work with async streaming... #6225 (#6226 ) Related to this https://github.com/hwchase17/langchain/issues/6225 Just copied the implementation from `generate` function to `agenerate` and tested it. Didn't run any official tests thought <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #6225 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17, @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:05:16 -07:00
Jan Pawellek	ea6a5b03e0	Fix output final text for HuggingFaceTextGenInference when streaming (#6211 ) The LLM integration [HuggingFaceTextGenInference](https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_text_gen_inference.py) already has streaming support. However, when streaming is enabled, it always returns an empty string as the final output text when the LLM is finished. This is because `text` is instantiated with an empty string and never updated. This PR fixes the collection of the final output text by concatenating new tokens.	2023-06-18 17:01:15 -07:00
Tomaz Bratanic	b3bccabc66	Add option to save/load graph cypher QA (#6219 ) Similar as https://github.com/hwchase17/langchain/pull/5818 Added the functionality to save/load Graph Cypher QA Chain due to a user reporting the following error > raise NotImplementedError("Saving not supported for this chain type.")\nNotImplementedError: Saving not supported for this chain type.\n'	2023-06-18 17:00:27 -07:00
Harrison Chase	495128ba95	Harrison/functions docs improvements (#6389 ) Co-authored-by: Sumanth Donthula <46747610+sumanthdonthula@users.noreply.github.com>	2023-06-18 16:57:33 -07:00
Leonid Ganeline	c7ca350cd3	Fix class promotion (#6187 ) In LangChain, all module classes are enumerated in the `__init__.py` file of the correspondent module. But some classes were missed and were not included in the module `__init__.py` This PR: - added the missed classes to the module `__init__.py` files - `__init__.py:__all_` variable value (a list of the class names) was sorted - `langchain.tools.sql_database.tool.QueryCheckerTool` was renamed into the `QuerySQLCheckerTool` because it conflicted with `langchain.tools.spark_sql.tool.QueryCheckerTool` - changes to `pyproject.toml`: - added `pgvector` to `pyproject.toml:extended_testing` - added `pandas` to `pyproject.toml:[tool.poetry.group.test.dependencies]` - commented out the `streamlit` from `collbacks/__init__.py`, It is because now the `streamlit` requires Python >=3.7, !=3.9.7 - fixed duplicate names in `tools` - fixed correspondent ut-s #### Who can review? @hwchase17 @dev2049	2023-06-18 16:55:18 -07:00
Harrison Chase	c0c2fd0782	Harrison/zep mem (#6388 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-06-18 16:53:35 -07:00
Harrison Chase	b7159c15cc	Harrison/metaphor search fix (#6387 ) Co-authored-by: jeffzwang <jeffreyzhiyuanwang@gmail.com>	2023-06-18 16:53:24 -07:00
Harrison Chase	9bf5b0defa	Harrison/myscale self query (#6376 ) Co-authored-by: Fangrui Liu <fangruil@moqi.ai> Co-authored-by: 刘方瑞 <fangrui.liu@outlook.com> Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>	2023-06-18 16:53:10 -07:00
Harrison Chase	bd8d418a95	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-18 16:45:49 -07:00
Harrison Chase	3a75d59c3d	searx - docs	2023-06-18 16:45:42 -07:00
MIDORIBIN	5be465bd86	Fixed PermissionError on windows (#6170 ) Fixed PermissionError that occurred when downloading PDF files via http in BasePDFLoader on windows. When downloading PDF files via http in BasePDFLoader, NamedTemporaryFile is used. This function cannot open the file again on Windows.[Python Doc](https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile) So, we created a temporary directory with TemporaryDirectory and placed the downloaded file there. temporary directory is deleted in the deconstruct. Fixes #2698 #### Who can review? Tag maintainers/contributors who might be interested: - @eyurtsev - @hwchase17	2023-06-18 16:39:57 -07:00
xleven	4fc7939848	fix link of callbacks on modules page (#6323 ) Since [Callbacks](https://python.langchain.com/docs/modules/callbacks/getting_started/) on [Modules](https://python.langchain.com/docs/modules/) went to a "Page Not Found".	2023-06-18 15:08:12 -07:00
Vijay	2b3b4e0f60	Add the ability to run the map_reduce chains process results step as async (#6181 ) This will add the ability to add an AsyncCallbackManager (handler) for the reducer chain, which would be able to stream the tokens via the `async def on_llm_new_token` callback method Fixes # (issue) [5532](https://github.com/hwchase17/langchain/issues/5532) @hwchase17 @agola11 The following code snippet explains how this change would be used to enable `reduce_llm` with streaming support in a `map_reduce` chain I have tested this change and it works for the streaming use-case of reducer responses. I am happy to share more information if this makes solution sense. ``` AsyncHandler .......................... class StreamingLLMCallbackHandler(AsyncCallbackHandler): """Callback handler for streaming LLM responses.""" def __init__(self, websocket): self.websocket = websocket # This callback method is to be executed in async async def on_llm_new_token(self, token: str, **kwargs: Any) -> None: resp = ChatResponse(sender="bot", message=token, type="stream") await self.websocket.send_json(resp.dict()) Chain .......... stream_handler = StreamingLLMCallbackHandler(websocket) stream_manager = AsyncCallbackManager([stream_handler]) streaming_llm = ChatOpenAI( streaming=True, callback_manager=stream_manager, verbose=False, temperature=0, ) main_llm = OpenAI( temperature=0, verbose=False, ) doc_chain = load_qa_chain( llm=main_llm, reduce_llm=streaming_llm, chain_type="map_reduce", callback_manager=manager ) qa_chain = ConversationalRetrievalChain( retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator, callback_manager=manager, ) # Here `acall` will trigger `acombine_docs` on `map_reduce` which should then call `_aprocess_result` which in turn will call `self.combine_document_chain.arun` hence async callback will be awaited result = await qa_chain.acall( {"question": question, "chat_history": chat_history} ) ```	2023-06-18 13:19:56 -07:00
Alvaro Bartolome	e0dea577ee	Extend `ArgillaCallbackHandler` support (#6153 ) Hi again @agola11! 🤗 ## What's in this PR? After playing around with different chains we noticed that some chains were using different `output_key`s and we were just handling some, so we've extended the support to any output, either if it's a Python list or a string. Kudos to @dvsrepo for spotting this! --------- Co-authored-by: Daniel Vila Suero <daniel@argilla.io>	2023-06-18 11:18:33 -07:00
Harrison Chase	a8cb9ee013	Harrison/gdrive enhancements (#6375 ) Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>	2023-06-18 11:07:23 -07:00
rafael	ebfffaa38f	Guardrails output parser: Pass LLM api for reasking (#6089 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes https://github.com/ShreyaR/guardrails/issues/155 Enables guardrails reasking by specifying an LLM api in the output parser.	2023-06-18 10:50:20 -07:00
Davis Chase	ec850e607f	bump 203 (#6372 )	2023-06-18 09:20:47 -07:00
Lance Martin	370becdfc2	Add self query retriever example with MD header splitting (#6359 ) Flesh out the notebook example for `MarkdownHeaderTextSplitter`	2023-06-17 21:40:20 -07:00
Lance Martin	2c97fbabbd	Update MD header text splitter notebook (#6339 ) Highlight use case for maintaining header groups when splitting.	2023-06-17 13:19:27 -07:00
Harrison Chase	a2bbe3dda4	Harrison/mmr support for opensearch (#6349 ) Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>	2023-06-17 12:22:37 -07:00
Davis Chase	2eea5d4cb4	Add ignore vercel preview script (#6320 ) skip building preview of docs for anything branch that doesn't start with `__docs__`. will eventually update to look at code diff directories but patching for now	2023-06-17 11:17:08 -07:00
Harrison Chase	7a48d9ee82	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-17 11:16:19 -07:00
Kenny	e30fdffd1e	Add new openai 0613 model costs (#6110 ) Added costs for gpt-4-32k-0613, gpt-4-0613, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, and gpt-3.5-turbo-16k-0613 to openai_info callback based on this [OpenAI post](https://openai.com/blog/function-calling-and-other-api-updates) @agola11	2023-06-17 11:11:47 -07:00
Dhruvil Shah	2eec687474	update web_base.py to have verify option (#6107 ) We propose an enhancement to the web-based loader initialize method by introducing a "verify" option. This enhancement addresses the issue of SSL verification errors encountered on certain web pages. By providing users with the option to set the verify parameter to False, we offer greater flexibility and control. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> ### Fixes #6079 #### Who can review? @eyurtsev @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 11:10:48 -07:00
Harrison Chase	680d6bbbf8	fix titles in documentation	2023-06-17 11:09:11 -07:00
Nuno Campos	e194dc5306	Make lckwargs private (#6344 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-17 19:08:25 +01:00
Harrison Chase	8cfb52ddbb	fix spelling	2023-06-17 11:06:54 -07:00
zengbo	5d5298087f	Custom Anthropic API URL (#6221 ) [Feature] User can custom the Anthropic API URL #### Who can review? Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11	2023-06-17 11:01:29 -07:00
Harrison Chase	61e4a1adf9	Harrison/faiss score (#6341 ) Co-authored-by: Frank Stein <16441059+simonfromla@users.noreply.github.com> Co-authored-by: Sims Juju <sims@Ju.lan>	2023-06-17 11:00:47 -07:00
Harrison Chase	42a28ac1ba	Harrison/error zero tools (#6340 ) Co-authored-by: Juhee Kim <46583939+juppytt@users.noreply.github.com>	2023-06-17 11:00:35 -07:00
Slawomir Gonet	eef62bf4e9	qdrant: search by vector (#6043 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Added support to `search_by_vector` to Qdrant Vector store. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ### Who can review VectorStores / Retrievers / Memory - @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 -->	2023-06-17 09:44:28 -07:00
Mark	b7ba7e8a7b	Allow GoogleDrive to authenticate via application default credentials on Cloud Run/GCE etc without service key (#6035 ) @eyurtsev The existing GoogleDrive implementation always needs a service account to be available at the credentials location. When running on GCP services such as Cloud Run, a service account already exists in the metadata of the service, so no physical key is necessary. This change adds a check to see if it is running in such an environment, and uses that authentication instead. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 09:44:17 -07:00
lonestriker	6f36f0f930	Add oobabooga/text-generation-webui support as a llm (#5997 ) Add oobabooga/text-generation-webui support as an LLM. Currently, supports using text-generation-webui's non-streaming API interface. Allows users who already have text-gen running to use the same models with langchain. #### Before submitting Simple usage, similar to existing LLM supported: ``` from langchain.llms import TextGen llm = TextGen(model_url = "http://localhost:5000") ``` #### Who can review? @hwchase17 - project lead --------- Co-authored-by: Hien Ngo <Hien.Ngo@adia.ae>	2023-06-17 09:42:15 -07:00
Richy Wang	444ca3f669	Improve AnalyticDB Vector Store implementation without affecting user (#6086 ) Hi there: As I implement the AnalyticDB VectorStore use two table to store the document before. It seems just use one table is a better way. So this commit is try to improve AnalyticDB VectorStore implementation without affecting user behavior: 1. Streamline the `post_init `behavior by creating a single table with vector indexing. 2. Update the `add_texts` API for document insertion. 3. Optimize `similarity_search_with_score_by_vector` to retrieve results directly from the table. 4. Implement `_similarity_search_with_relevance_scores`. 5. Add `embedding_dimension` parameter to support different dimension embedding functions. Users can continue using the API as before. Test cases added before is enough to meet this commit.	2023-06-17 09:36:31 -07:00
Ja-sonYun	cdd1d78bf2	make modelname_to_contextsize as a staticmethod (#6040 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes ##6039 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17　@agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-17 09:13:08 -07:00
Saba Sturua	427551eabf	DocArray as a Retriever (#6031 ) ## DocArray as a Retriever [DocArray](https://github.com/docarray/docarray) is an open-source tool for managing your multi-modal data. It offers flexibility to store and search through your data using various document index backends. This PR introduces `DocArrayRetriever` - which works with any available backend and serves as a retriever for Langchain apps. Also, I added 2 notebooks: DocArray Backends - intro to all 5 currently supported backends, how to initialize, index, and use them as a retriever DocArray Usage - showcasing what additional search parameters you can pass to create versatile retrievers Example: ```python from docarray.index import InMemoryExactNNIndex from docarray import BaseDoc, DocList from docarray.typing import NdArray from langchain.embeddings.openai import OpenAIEmbeddings from langchain.retrievers import DocArrayRetriever # define document schema class MyDoc(BaseDoc): description: str description_embedding: NdArray[1536] embeddings = OpenAIEmbeddings() # create documents descriptions = ["description 1", "description 2"] desc_embeddings = embeddings.embed_documents(texts=descriptions) docs = DocList[MyDoc]( [ MyDoc(description=desc, description_embedding=embedding) for desc, embedding in zip(descriptions, desc_embeddings) ] ) # initialize document index with data db = InMemoryExactNNIndex[MyDoc](docs) # create a retriever retriever = DocArrayRetriever( index=db, embeddings=embeddings, search_field="description_embedding", content_field="description", ) # find the relevant document doc = retriever.get_relevant_documents("action movies") print(doc) ``` #### Who can review? @dev2049 --------- Signed-off-by: jupyterjazz <saba.sturua@jina.ai>	2023-06-17 09:09:33 -07:00
Masafumi Mori	7bb437146d	fix links to prompt templates and example selectors (#6332 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # links to prompt templates and example selectors on the [Prompts](https://python.langchain.com/docs/modules/model_io/prompts/) page are invalid. #### Before submitting Just a small note that I tried to run `make docs_clean` and other related commands before PR written [here](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md#build-documentation-locally), it gives me an error: ```bash langchain % make docs_clean Traceback (most recent call last): File "/Users/masafumi/Downloads/langchain/.venv/bin/make", line 5, in <module> from scripts.proto import main ModuleNotFoundError: No module named 'scripts' make: *** [docs_clean] Error 1 # Poetry (version 1.5.1) # Python 3.9.13 ``` I couldn't figure out how to fix this, so I didn't run those command. But links should work. #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 Similar issue #6323 Co-authored-by: masafumimori <m.masafumimori@outlook.com>	2023-06-17 09:07:14 -07:00
Francisco Ingham	83eea230f3	changed height in the nb example (#6327 ) changed height in the example to a more reasonable number (from 9 feet to 6 feet)	2023-06-17 00:05:48 -07:00
James O'Dwyer	0475d015fe	Handle Managed Motorhead Data Key (#6169 ) # Handle Managed Motorhead Data Key Managed motorhead will return a payload with a `data` key. we need to handle this to properly access messages from the server.	2023-06-16 20:36:18 -07:00
Luke Stanley	364f8e7b5d	Better Entity Memory code documentation (#6318 ) Just adds some comments and docstring improvements. There was some behaviour that was quite unclear to me at first like: - "when do things get updated?" - "why are there only entity names and no summaries?" - "why do the entity names disappear?" Now it can be much more obvious to many. I am lukestanley on Twitter.	2023-06-16 18:08:44 -07:00
Harrison Chase	af18413d97	Harrison/deeplake new features (#6263 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-16 17:53:55 -07:00
Davis Chase	6640293087	fix eval guide links (#6319 )	2023-06-16 17:53:46 -07:00
ljeagle	ad324a39ae	Improve the performance of add_texts interface and upgrade the AwaDB from 0.3.2 to 0.3.3 (#6316 ) 1. Changed the implementation of add_texts interface for the AwaDB vector store in order to improve the performance 2. Upgrade the AwaDB from 0.3.2 to 0.3.3 --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-16 16:50:01 -07:00
Davis Chase	24b2af5218	nit (#6305 )	2023-06-16 16:21:27 -07:00
Pierre Alexandre SCHEMBRI	9ca11c06b7	Fixes #6282 (#6283 ) Fixes #6282 1 liner to fix default http headers not passed by `LLMRequestsChain`	2023-06-16 16:21:01 -07:00
Davis Chase	23cdebddc4	Del linkcheck readme (#6317 )	2023-06-16 16:18:45 -07:00
Brigit Murtaugh	ccd916babe	Update dev container (#6189 ) Fixes https://github.com/hwchase17/langchain/issues/6172 As described in https://github.com/hwchase17/langchain/issues/6172, I'd love to help update the dev container in this project. Summary of changes: - Dev container now builds (the current container in this repo won't build for me) - Dockerfile updates - Update image to our [currently-maintained Python image](https://github.com/devcontainers/images/tree/main/src/python/.devcontainer) (`mcr.microsoft.com/devcontainers/python`) rather than the deprecated image from vscode-dev-containers - Move Dockerfile to root of repo - in order for `COPY` to work properly, it needs the files (in this case, `pyproject.toml` and `poetry.toml`) in the same directory - devcontainer.json updates - Removed `customizations` and `remoteUser` since they should be covered by the updated image in the Dockerfile - Update comments - Update docker-compose.yaml to properly point to updated Dockerfile - Add a .gitattributes to avoid line ending conversions, which can result in hundreds of pending changes ([info](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files)) - Add a README in the .devcontainer folder and info on the dev container in the contributing.md Outstanding questions: - Is it expected for `poetry install` to take some time? It takes about 30 minutes for this dev container to finish building in a Codespace, but a user should only have to experience this once. Through some online investigation, this doesn't seem unusual - Versions of poetry newer than 1.3.2 failed every time - based on some of the guidance in contributing.md and other online resources, it seemed changing poetry versions might be a good solution. 1.3.2 is from Jan 2023 --------- Co-authored-by: bamurtaugh <brmurtau@microsoft.com> Co-authored-by: Samruddhi Khandale <samruddhikhandale@github.com>	2023-06-16 15:42:14 -07:00
Davis Chase	03b5891cf7	more redirect (#6314 )	2023-06-16 14:43:59 -07:00
Davis Chase	eaee492dbc	basic redirect (#6309 )	2023-06-16 13:39:58 -07:00
Davis Chase	d2243757a3	update readme (#6304 )	2023-06-16 12:27:16 -07:00
Davis Chase	2f47e5c766	update api link (#6303 )	2023-06-16 12:18:17 -07:00
Davis Chase	d558bcfad8	rm ignore_vercel (#6302 )	2023-06-16 12:06:58 -07:00
Davis Chase	87e502c6bc	Doc refactor (#6300 ) Co-authored-by: jacoblee93 <jacoblee93@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-16 11:52:56 -07:00
Harrison Chase	94c82a189d	bump to 202 (#6262 )	2023-06-16 06:52:36 -07:00
hp0404	b01cf0dd54	ArxivAPIWrapper - doc_content_chars_max (#6063 ) This PR refactors the ArxivAPIWrapper class making `doc_content_chars_max` parameter optional. Additionally, tests have been added to ensure the functionality of the doc_content_chars_max parameter. Fixes #6027 (issue)	2023-06-15 22:16:42 -07:00
Daniel King	a9b97aa6f4	Update output format of MosaicML endpoint to be more flexible (#6060 ) There will likely be another change or two coming over the next couple weeks as we stabilize the API, but putting this one in now which just makes the integration a bit more flexible with the response output format. ``` (langchain) danielking@MML-1B940F4333E2 langchain % pytest tests/integration_tests/llms/test_mosaicml.py tests/integration_tests/embeddings/test_mosaicml.py =================================================================================== test session starts =================================================================================== platform darwin -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0 rootdir: /Users/danielking/github/langchain configfile: pyproject.toml plugins: asyncio-0.20.3, mock-3.10.0, dotenv-0.5.2, cov-4.0.0, anyio-3.6.2 asyncio: mode=strict collected 12 items tests/integration_tests/llms/test_mosaicml.py ...... [ 50%] tests/integration_tests/embeddings/test_mosaicml.py ...... [100%] =================================================================================== slowest 5 durations =================================================================================== 4.76s call tests/integration_tests/llms/test_mosaicml.py::test_retry_logic 4.74s call tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_llm_call 4.13s call tests/integration_tests/llms/test_mosaicml.py::test_instruct_prompt 0.91s call tests/integration_tests/llms/test_mosaicml.py::test_short_retry_does_not_loop 0.66s call tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_extra_kwargs =================================================================================== 12 passed in 19.70s =================================================================================== ``` #### Who can review? @hwchase17 @dev2049	2023-06-15 22:15:39 -07:00
JaysonAlbert	50d9c7d5a4	Fix: change the chatgpt plugin retriever metadata format (#5920 ) the current implement put the doc itself as the metadata, but the document chatgpt plugin retriever returned already has a `metadata` field, it's better to use that instead. the original code will throw the following exception when using `RetrievalQAWithSourcesChain`, becuse it can not find the field `metadata`: ```python Exception has occurred: ValueError (note: full exception trace is shown but execution is paused at: _run_module_as_main) Document prompt requires documents to have metadata variables: ['source']. Received document with missing metadata: ['source']. File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 27, in format_document raise ValueError( File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in <listcomp> doc_strings = [format_document(doc, self.document_prompt) for doc in docs] File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in _get_inputs doc_strings = [format_document(doc, self.document_prompt) for doc in docs] File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 85, in combine_docs inputs = self._get_inputs(docs, **kwargs) File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 84, in _call output, extra_return_dict = self.combine_docs( File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py", line 140, in __call__ raise e ``` Additionally, the `metadata` filed in the `chatgpt plugin retriever` have these fileds by default: ```json { "source": "file", //email, file or chat "source_id": "filename.docx", // the filename "url": "", ... } ``` so, we should set `source_id` to `source` in the langchain metadata. ```python metadata = d.pop("metadata", d) if(metadata.get("source_id")): metadata["source"] = metadata.pop("source_id") ``` #### Who can review? @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: wangjie <wangjie@htffund.com>	2023-06-15 22:04:45 -07:00
Harrison Chase	e67b26eee9	Harrison/openai functions (#6261 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 21:54:39 -07:00
Harrison Chase	6aafb46807	Harrison/openai functions (#6223 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 21:43:33 -07:00
Zander Chase	bc9b8c8239	Improve Error Message for failed callback (#6247 ) Include the handler class name in the warning	2023-06-15 19:18:37 -07:00
Alon Roth	0013256e81	Support chat history persistence in AutoGPT (#5716 ) Short Description Added a new argument to AutoGPT class which allows to persist the chat history to a file. Changes 1. Removed the `self.full_message_history: List[BaseMessage] = []` 2. Replaced it with `chat_history_memory` which can take any subclasses of `BaseChatMessageHistory` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-15 17:49:03 -07:00
Martin Antos	1913320cbe	Feature/add acreom loader (#5780 ) adding new loader for [acreom](https://acreom.com) vaults. It's based on the Obsidian loader with some additional text processing for acreom specific markdown elements. @eyurtsev please take a look! --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-15 11:53:00 -07:00
Zander Chase	ae76e473e1	Add Tags for LLMs (#6229 ) - [x] Add tracing tags to LLMs + Chat Models (both inheritable and local) - [x] Add tags for the run_on_dataset helper function(s)	2023-06-15 11:24:11 -07:00
Harrison Chase	8e1a7a8646	bump version to 201 (#6233 )	2023-06-15 08:28:47 -07:00
Harrison Chase	e82687ddf4	Harrison/use functions agent (#6185 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 08:18:50 -07:00
Ryo Kanazawa	7d2b946d0b	Fix typo `pandocs` to `pandoc` (#6203 ) Fixes https://github.com/hwchase17/langchain/issues/6204 ### Context An typo issue with `pandoc`. #### Who can review? @hwchase17	2023-06-15 08:18:27 -07:00
Kyle Roth	c7db9febb0	count tokens for new OpenAI model versions (#6195 ) Trying to call `ChatOpenAI.get_num_tokens_from_messages` returns the following error for the newly announced models `gpt-3.5-turbo-0613` and `gpt-4-0613`: ``` NotImplementedError: get_num_tokens_from_messages() is not presently implemented for model gpt-3.5-turbo-0613.See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens. ``` This adds support for counting tokens for those models, by counting tokens the same way they're counted for the previous versions of `gpt-3.5-turbo` and `gpt-4`. #### reviewers - @hwchase17 - @agola11	2023-06-15 06:16:03 -07:00
xu0o0	7ad13cdbdb	feat: add content_format param to ConfluenceLoader.load() (#5922 ) Confluence API supports difference format of page content. The storage format is the raw XML representation for storage. The view format is the HTML representation for viewing with macros rendered as though it is viewed by users. Add the `content_format` parameter to `ConfluenceLoader.load()` to specify the content format, this is set to `ContentFormat.STORAGE` by default. #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-14 16:56:28 -07:00
0xJordan	c5a46e7435	feat: Add support for the Solidity language (#6054 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> ## Add Solidity programming language support for code splitter. Twitter: @0xjord4n_ <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-14 14:25:02 -07:00
Nuno Campos	17c4ec4812	Add docs for tags (#6155 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-14 14:01:58 -07:00
thiswillbeyourgithub	4a649e3b14	typo: 'following following' to 'following' (#6163 ) Co-authored-by: thiswillbeyourgithub <github@32mail.33mail.com>	2023-06-14 10:58:47 -07:00
Maciej Bryński	8a44c879c6	Update readthedocs_documentation.ipynb (#6148 ) Minor fix in documentation. Change URL in wget call to proper one.	2023-06-14 07:21:48 -07:00
Zander Chase	e0e3ef1c57	Update Name (#6136 )	2023-06-13 22:25:36 -07:00
Zander Chase	4555ad5d1f	Add Run Collector Callback (#6133 ) Add a callback handler that can collect nested run objects. Useful for evaluation.	2023-06-13 22:17:37 -07:00
Harrison Chase	6ac120f299	bump ver to 200 (#6130 )	2023-06-13 19:33:51 -07:00
Harrison Chase	e41f0b341c	add functions agent (#6113 )	2023-06-13 18:51:01 -07:00
Zander Chase	b3b155d488	Return session name in runner response (#6112 ) Makes it easier to then run evals w/o thinking about specifying a session	2023-06-13 16:59:43 -07:00
Harrison Chase	e74733ab9e	support streaming for functions (#6115 )	2023-06-13 15:26:26 -07:00
Nuno Campos	11ab0be11a	Add support for tags (#5898 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-13 12:30:59 -07:00
Harrison Chase	1281fdf0f2	Harrison/notebook functions (#6103 )	2023-06-13 10:52:54 -07:00
Harrison Chase	34ebb29726	bump version to 199 (#6102 )	2023-06-13 10:50:33 -07:00
Wenchen Li	f9edf76e7c	Implement `max_marginal_relevance_search` in `VectorStore` of Pinecone (#6056 ) This adds implementation of MMR search in pinecone; and I have two semi-related observations about this vector store class: - Maybe we should also have a `similarity_search_by_vector_returning_embeddings` like in supabase, but it's not in the base `VectorStore` class so I didn't implement - Talking about the base class, there's `similarity_search_with_relevance_scores`, but in pinecone it is called `similarity_search_with_score`; maybe we should consider renaming it to align with other `VectorStore` base and sub classes (or add that as an alias for backward compatibility) #### Who can review? Tag maintainers/contributors who might be interested: - VectorStores / Retrievers / Memory - @dev2049	2023-06-13 10:46:45 -07:00
Harrison Chase	970b2f9d38	convert tools to openai (#6100 )	2023-06-13 10:40:49 -07:00
Harrison Chase	292accde2b	support functions (#6099 )	2023-06-13 10:32:58 -07:00
Lance Martin	ee3d0513ad	Add tests and update notebook for MarkdownHeaderTextSplitter (#6069 ) Add test and update notebook for `MarkdownHeaderTextSplitter`.	2023-06-13 09:07:52 -07:00
Keshav Kumar	8fdf88b8e3	Fix for ModuleNotFoundError while running langchain-server. Issue #5833 (#6077 ) This PR fixes the error `ModuleNotFoundError: No module named 'langchain.cli'` Fixes https://github.com/hwchase17/langchain/issues/5833 (issue)	2023-06-13 08:37:07 -07:00
Zander Chase	0c52275bdb	Use Run object from SDK (#6067 ) Update the Run object in the tracer to extend that in the SDK to include the parameters necessary for tracking/tracing	2023-06-13 07:14:11 -07:00
Harrison Chase	cde1e8739a	turn off repr (#6078 )	2023-06-12 22:45:24 -07:00
Nuno Campos	a9b3b2e327	Enable serialization for anthropic (#6049 )	2023-06-12 22:39:10 -07:00
Harrison Chase	6ac5d80286	propogate kwargs fully (#6076 )	2023-06-12 22:37:55 -07:00
Harrison Chase	ec1a2adf9c	improve tools (#6062 )	2023-06-12 22:19:03 -07:00
Julius Lipp	5b6bbf4ab2	Add embaas document extraction api endpoints (#6048 ) # Introduces embaas document extraction api endpoints In this PR, we add support for embaas document extraction endpoints to Text Embedding Models (with LLMs, in different PRs coming). We currently offer the MTEB leaderboard top performers, will continue to add top embedding models and soon add support for customers to deploy thier own models. Additional Documentation + Infomation can be found [here](https://embaas.io). While developing this integration, I closely followed the patterns established by other langchain integrations. Nonetheless, if there are any aspects that require adjustments or if there's a better way to present a new integration, let me know! :) Additionally, I fixed some docs in the embeddings integration. Related PR: #5976 #### Who can review? DataLoaders - @eyurtsev	2023-06-12 19:13:52 -07:00
Zander Chase	2f0088039d	Log tracer errors (#6066 ) Example (would log several times if not for the helper fn. Would emit no logs due to mulithreading previously) ![image](https://github.com/hwchase17/langchain/assets/130414180/070d25ae-1f06-4487-9617-0a6f66f3f01e)	2023-06-12 17:13:49 -07:00
Lance Martin	b023f0c0f2	Text splitter for Markdown files by header (#5860 ) This creates a new kind of text splitter for markdown files. The user can supply a set of headers that they want to split the file on. We define a new text splitter class, `MarkdownHeaderTextSplitter`, that does a few things: (1) For each line, it determines the associated set of user-specified headers (2) It groups lines with common headers into splits See notebook for example usage and test cases.	2023-06-12 15:46:42 -07:00
Jens Madsen	2c91f0d750	chore: spedd up integration test by using smaller model (#6044 ) Adds a new parameter `relative_chunk_overlap` for the `SentenceTransformersTokenTextSplitter` constructor. The parameter sets the chunk overlap using a relative factor, e.g. for a model where the token limit is 100, a `relative_chunk_overlap=0.5` implies that `chunk_overlap=50` Tag maintainers/contributors who might be interested: @hwchase17, @dev2049	2023-06-12 13:27:10 -07:00
Harrison Chase	5922742d56	comment out	2023-06-12 10:57:31 -07:00
Harrison Chase	681ba6d520	embaas title	2023-06-12 08:00:14 -07:00
Ben Flast	7a5e36f3f5	Mongo db doc fix (#6042 ) I missed a few errors in my initial fix @hwchase1. Thanks!	2023-06-12 07:29:27 -07:00
Harrison Chase	289e9aeb9d	bump ver to 198 (#6026 )	2023-06-11 21:32:45 -07:00
Harrison Chase	d1561b74eb	Harrison/cognitive search (#6011 ) Co-authored-by: Fabrizio Ruocco <ruoccofabrizio@gmail.com>	2023-06-11 21:15:42 -07:00
wenmeng zhou	bb7ac9edb5	add dashscope text embedding (#5929 ) #### What I do Adding embedding api for [DashScope](https://help.aliyun.com/product/610100.html), which is the DAMO Academy's multilingual text unified vector model based on the LLM base. It caters to multiple mainstream languages worldwide and offers high-quality vector services, helping developers quickly transform text data into high-quality vector data. Currently supported languages include Chinese, English, Spanish, French, Portuguese, Indonesian, and more. #### Who can review? Models - @hwchase17 - @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-11 21:14:20 -07:00
Ben Flast	010d0bfeea	Update MongoDB Atlas support docs (#6022 ) Updating MongoDB Atlas support docs @hwchase17 let me know if you have any questions	2023-06-11 20:57:15 -07:00
Harrison Chase	e05997c25e	Harrison/hologres (#6012 ) Co-authored-by: Changgeng Zhao <changgeng@nyu.edu> Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>	2023-06-11 20:56:51 -07:00
ljeagle	c5bce4a465	add from_documents interface in awadb vector store (#6023 ) added new interface from_documents in awadb vector store @dev2049 --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-11 19:35:03 -07:00
Zander Chase	2c9619bc1d	Remove from PR template (#6018 )	2023-06-11 19:34:26 -07:00
ju-bezdek	18f5c985d9	Langchain decorators (#6017 ) Added description of LangChain Decorators ✨ into the integration section <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-11 19:32:24 -07:00
Zander Chase	a197acfcd3	Update check (#6020 ) We were assigning the name as None in on_chat_model_start then not updating, resulting in a validation error.	2023-06-11 17:59:09 -07:00
Nuno Campos	18af149e91	nc/load (#5733 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-11 15:51:28 -07:00
Zander Chase	614cff89bc	I before E (#6015 )	2023-06-11 15:45:12 -07:00
Harrison Chase	a7227ee01b	Harrison/embaas (#6010 ) Co-authored-by: Julius Lipp <43986145+juliuslipp@users.noreply.github.com>	2023-06-11 13:35:14 -07:00
xu0o0	232faba796	fix: TypeError when loading confluence pages by cql (#5878 ) The Confluence loader uses the wrong API (`Confluence.cql()` provided by `atlassian-python-api`) to load pages by CQL. `Confluence.cql()` is a wrapper of the `/rest/api/search` API which searches for entities in Confluence. To search for pages in Confluence, the loader can use the `/rest/api/content/search` API. #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> #### References ##### Cloud API https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-content/#api-wiki-rest-api-content-search-get https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-search/#api-wiki-rest-api-search-get ##### Server API https://docs.atlassian.com/ConfluenceServer/rest/8.3.1/#api/content-search https://docs.atlassian.com/ConfluenceServer/rest/8.3.1/#api/search	2023-06-11 13:23:22 -07:00
Akhil Vempali	d7d629911b	feat: ✨ Added filtering option to FAISS vectorstore (#5966 ) Inspired by the filtering capability available in ChromaDB, added the same functionality to the FAISS vectorestore as well. Since FAISS does not have an inbuilt method of filtering used the approach suggested in this [thread](https://github.com/facebookresearch/faiss/issues/1079) Langchain Issue inspiration: https://github.com/hwchase17/langchain/issues/4572 - [x] Added filtering capability to semantic similarly and MMR - [x] Added test cases for filtering in `tests/integration_tests/vectorstores/test_faiss.py` #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 - @hwchase17	2023-06-11 13:20:03 -07:00
Jiaping(JP) Zhang	6e90406e0f	[APIChain] enhance the robustness or url (#6008 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> I used the APIChain sometimes it failed during the intermediate step when generating the api url and calling the `request` function. After some digging, I found the url sometimes includes the space at the beginning, like `%20https://...api.com` which causes the ` self.requests_wrapper.get` internal function to fail. Including a little string preprocessing `.strip` to remove the space seems to improve the robustness of the APIchain to make sure it can send the request and retrieve the API result more reliably. <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @vowelparrot Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-11 13:13:57 -07:00
Ikko Eltociear Ashimine	c868a3eef3	Update databricks.md (#6006 ) HuggingFace -> Hugging Face #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review?	2023-06-11 13:13:33 -07:00
Harrison Chase	20e9ce8a62	bump version to 197 (#6007 )	2023-06-11 10:14:57 -07:00
Harrison Chase	704d56e241	support kwargs (#5990 )	2023-06-11 10:09:22 -07:00
Mark Pors	b934677a81	Obey handler.raise_error in _ahandle_event_for_handler (#6001 ) Obey `handler.raise_error` in `_ahandle_event_for_handler` Exceptions for async callbacks were only logged as warnings, also when `raise_error = True` #### Who can review? @hwchase17 @agola11	2023-06-11 09:49:26 -07:00
Harrison Chase	2d038b57b2	Harrison/arxiv fix (#5993 ) Co-authored-by: Juanjo do Olmo <87780148+SimplyJuanjo@users.noreply.github.com>	2023-06-11 09:48:09 -07:00
Vincent	0b740c9baa	add ocr_languages param for ConfluenceLoader.load() (#5823 ) @eyurtsev 当Confluence文档内容中包含附件，且附件内容为非英文时，提取出来的文本是乱码的。 When the content of the document contains attachments, and the content of the attachments is not in English, the extracted text is garbled. 这主要是因为没有为pytesseract传递lang参数，默认情况下只支持英文。 This is mainly because lang parameter is not passed to pytesseract, and only English is supported by default. 所以我给ConfluenceLoader.load()添加了ocr_languages参数，以便支持多种语言。 So I added the ocr_languages parameter to ConfluenceLoader.load () to support multiple languages.	2023-06-10 16:51:04 -07:00
Thomas B	ac3e6e3944	Fix IndexError in RecursiveCharacterTextSplitter (#5902 ) Fixes (not reported) an error that may occur in some cases in the RecursiveCharacterTextSplitter. An empty `new_separators` array ([]) would end up in the else path of the condition below and used in a function where it is expected to be non empty. ```python if new_separators is None: ... else: # _split_text() expects this array to be non-empty! other_info = self._split_text(s, new_separators) ``` resulting in an `IndexError` ```python def _split_text(self, text: str, separators: List[str]) -> List[str]: """Split incoming text and return chunks.""" final_chunks = [] # Get appropriate separator to use > separator = separators[-1] E IndexError: list index out of range langchain/text_splitter.py:425: IndexError ``` #### Who can review? @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:48:53 -07:00
Satheesh Valluru	d2270a2261	Fix: Grammer fix in documentation (#5925 ) Fix for grammatical errors in the documentation of `vectorstore`. @vowelparrot	2023-06-10 16:43:36 -07:00
Jens Madsen	1250cd4630	fix: use model token limit not tokenizer ditto (#5939 ) This fixes a token limit bug in the SentenceTransformersTokenTextSplitter. Before the token limit was taken from tokenizer used by the model. However, for some models the token limit of the tokenizer (from `AutoTokenizer.from_pretrained`) does not equal the token limit of the model. This was a false assumption. Therefore, the token limit of the text splitter is now taken from the sentence transformers model token limit. Twitter: @plasmajens #### Before submitting #### Who can review? @hwchase17 and/or @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:36:03 -07:00
Ofer Mendelevitch	f8cf09a230	Update to Vectara integration (#5950 ) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to imrpove efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * updated variables names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:27:01 -07:00
qued	e4224a396b	feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955 ) # Unstructured XML Loader Adds an `UnstructuredXMLLoader` class for .xml files. Works with unstructured>=0.6.7. A plain text representation of the text with the XML tags will be available under the `page_content` attribute in the doc. ### Testing ```python from langchain.document_loaders import UnstructuredXMLLoader loader = UnstructuredXMLLoader( "example_data/factbook.xml", ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	2023-06-10 16:24:42 -07:00
Lance Martin	21bd16bb59	Create Airtable loader (#5958 ) Create document loader for Airtable	2023-06-10 15:43:18 -07:00
Harrison Chase	9218684759	Add a new vector store - AwaDB (#5971 ) (#5992 ) Added AwaDB vector store, which is a wrapper over the AwaDB, that can be used as a vector storage and has an efficient similarity search. Added integration tests for the vector store Added jupyter notebook with the example Delete a unneeded empty file and resolve the conflict(https://github.com/hwchase17/langchain/pull/5886) Please check, Thanks! @dev2049 @hwchase17 --------- <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: ljeagle <vincent_jieli@yeah.net> Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-10 15:42:32 -07:00
Tomaz Bratanic	d5819a7ca7	Add additional parameters to Graph Cypher Chain (#5979 ) Based on the inspiration from the SQL chain, the following three parameters are added to Graph Cypher Chain. - top_k: Limited the number of results from the database to be used as context - return_direct: Return database results without transforming them to natural language - return_intermediate_steps: Return intermediate steps	2023-06-10 14:39:55 -07:00
Daniel Grittner	0ca37e613c	Fix handling of missing action & input for async MRKL agent (#5985 ) Hi, This is a fix for https://github.com/hwchase17/langchain/pull/5014. This PR forgot to add the ability to self solve the ValueError(f"Could not parse LLM output: {llm_output}") error for `_atake_next_step`.	2023-06-10 14:38:20 -07:00
Harrison Chase	ca1afa7213	add test for structured tools (#5989 )	2023-06-10 14:37:26 -07:00
constDave	5f356b9993	Fixed typo missing "use" (#5991 ) <!-- Fixed a simple typo on https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/vectorstore.html where the word "use" was missing. #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-10 14:31:58 -07:00
Kaarthik Andavar	d6f5d0c6b1	Fix: SnowflakeLoader returning empty documents (#5967 ) Fix SnowflakeLoader's Behavior of Returning Empty Documents Description: This PR addresses the issue where the SnowflakeLoader was consistently returning empty documents. After investigation, it was found that the query method within the SnowflakeLoader was not properly fetching and processing the data. Changes: 1. Modified the query method in SnowflakeLoader to handle data fetch and processing more accurately. 2. Enhanced error handling within the SnowflakeLoader to catch and log potential issues that may arise during data loading. Impact: This fix will ensure the SnowflakeLoader reliably returns the expected documents instead of empty ones, improving the efficiency and reliability of data processing tasks in the LangChain project. Before Fix: `[ Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}) ]` After Fix: `[Document(page_content='CUSTOMER_ID: 1\nFIRST_NAME: John\nLAST_NAME: Doe\nEMAIL: john.doe@example.com\nPHONE: 555-123-4567\nADDRESS: 123 Elm St, San Francisco, CA 94102', metadata={}), Document(page_content='CUSTOMER_ID: 2\nFIRST_NAME: Jane\nLAST_NAME: Doe\nEMAIL: jane.doe@example.com\nPHONE: 555-987-6543\nADDRESS: 456 Oak St, San Francisco, CA 94103', metadata={}), Document(page_content='CUSTOMER_ID: 3\nFIRST_NAME: Michael\nLAST_NAME: Smith\nEMAIL: michael.smith@example.com\nPHONE: 555-234-5678\nADDRESS: 789 Pine St, San Francisco, CA 94104', metadata={}), Document(page_content='CUSTOMER_ID: 4\nFIRST_NAME: Emily\nLAST_NAME: Johnson\nEMAIL: emily.johnson@example.com\nPHONE: 555-345-6789\nADDRESS: 321 Maple St, San Francisco, CA 94105', metadata={}), Document(page_content='CUSTOMER_ID: 5\nFIRST_NAME: David\nLAST_NAME: Williams\nEMAIL: david.williams@example.com\nPHONE: 555-456-7890\nADDRESS: 654 Birch St, San Francisco, CA 94106', metadata={}), Document(page_content='CUSTOMER_ID: 6\nFIRST_NAME: Emma\nLAST_NAME: Jones\nEMAIL: emma.jones@example.com\nPHONE: 555-567-8901\nADDRESS: 987 Cedar St, San Francisco, CA 94107', metadata={}), Document(page_content='CUSTOMER_ID: 7\nFIRST_NAME: Oliver\nLAST_NAME: Brown\nEMAIL: oliver.brown@example.com\nPHONE: 555-678-9012\nADDRESS: 147 Cherry St, San Francisco, CA 94108', metadata={}), Document(page_content='CUSTOMER_ID: 8\nFIRST_NAME: Sophia\nLAST_NAME: Davis\nEMAIL: sophia.davis@example.com\nPHONE: 555-789-0123\nADDRESS: 369 Walnut St, San Francisco, CA 94109', metadata={}), Document(page_content='CUSTOMER_ID: 9\nFIRST_NAME: James\nLAST_NAME: Taylor\nEMAIL: james.taylor@example.com\nPHONE: 555-890-1234\nADDRESS: 258 Hawthorn St, San Francisco, CA 94110', metadata={}), Document(page_content='CUSTOMER_ID: 10\nFIRST_NAME: Isabella\nLAST_NAME: Wilson\nEMAIL: isabella.wilson@example.com\nPHONE: 555-901-2345\nADDRESS: 963 Aspen St, San Francisco, CA 94111', metadata={})] ` Tests: All unit and integration tests have been run and passed successfully. Additional tests were added to validate the new behavior of the SnowflakeLoader. Checklist: - [x] Code changes are covered by tests - [x] Code passes `make format` and `make lint` - [x] This PR does not introduce any breaking changes Please review and let me know if any changes are required.	2023-06-10 13:03:50 -07:00
Harrison Chase	62ec10a7f5	bump version to 196 (#5988 )	2023-06-10 09:06:35 -07:00
German Martin	736a1819aa	LOTR: Lord of the Retrievers. A retriever that merge several retrievers together applying document_formatters to them. (#5798 ) "One Retriever to merge them all, One Retriever to expose them, One Retriever to bring them all and in and process them with Document formatters." Hi @dev2049! Here bothering people again! I'm using this simple idea to deal with merging the output of several retrievers into one. I'm aware of DocumentCompressorPipeline and ContextualCompressionRetriever but I don't think they allow us to do something like this. Also I was getting in trouble to get the pipeline working too. Please correct me if i'm wrong. This allow to do some sort of "retrieval" preprocessing and then using the retrieval with the curated results anywhere you could use a retriever. My use case is to generate diff indexes with diff embeddings and sources for a more colorful results then filtering them with one or many document formatters. I saw some people looking for something like this, here: https://github.com/hwchase17/langchain/issues/3991 and something similar here: https://github.com/hwchase17/langchain/issues/5555 This is just a proposal I know I'm missing tests , etc. If you think this is a worth it idea I can work on tests and anything you want to change. Let me know! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 08:41:02 -07:00
Lance Martin	f3e7ac0a2c	Add load() to snowflake loader (#5956 ) Quick fix for recently added [snowflake data loader](https://github.com/hwchase17/langchain/pull/5825/files).	2023-06-09 11:27:29 -07:00
Harrison Chase	3678cba0be	bump ver to 195 (#5949 )	2023-06-09 09:17:08 -07:00
Harrison Chase	7af186fddf	fixes to docs (#5919 )	2023-06-09 09:15:53 -07:00
Kacper Łukawski	7cc200766e	Expose full params in Qdrant (#5947 ) # Expose full params in Qdrant There were many questions regarding supporting some additional parameters in Qdrant integration. Qdrant supports many vector search optimizations that were impossible to use directly in Qdrant before. That includes: 1. Possibility to manipulate collection params while using `Qdrant.from_texts`. The PR allows setting things such as quantization, HNWS config, optimizers config, etc. That makes it consistent with raw `QdrantClient`. 2. Extended options while searching. It includes HNSW options, exact search, score threshold filtering, and read consistency in distributed mode. After merging that PR, #4858 might also be closed. ## Who can review? VectorStores / Retrievers / Memory @dev2049 @hwchase17	2023-06-09 08:56:32 -07:00
Rubén Martínez	db7ef635c0	Add support for the endpoint URL in DynamoDBChatMesasgeHistory (#5836 ) This PR adds the possibility of specifying the endpoint URL to AWS in the DynamoDBChatMessageHistory, so that it is possible to target not only the AWS cloud services, but also a local installation. Specifying the endpoint URL, which is normally not done when addressing the cloud services, is very helpful when targeting a local instance (like [Localstack](https://localstack.cloud/)) when running local tests. Fixes #5835 #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-08 23:21:11 -07:00
Lior	0eb1bc1a02	Fix the issue where the parameters passed to VertexAI ignored #5889 (#5891 ) Fixes #5889 and fixes the name of the argument in init_vertexai @hwchase17 @agola11 Co-authored-by: Lior Durahly <lior.durahly@superwise.ai>	2023-06-08 23:15:22 -07:00
Fei Wang	63fcf41bea	Fix openai proxy error (#5914 ) Fixes proxy error. Since openai does not parse proxy parameters and uses openai.proxy directly, the proxy method needs to be modified. `7610c5adfa/openai/api_requestor.py (LL90)` #### Who can review? @hwchase17 - project lead Models - @hwchase17 - @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-08 23:15:06 -07:00
felpigeon	2791a753bf	Add start index to metadata in TextSplitter (#5912 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> #### Add start index to metadata in TextSplitter - Modified method `create_documents` to track start position of each chunk - The `start_index` is included in the metadata if the `add_start_index` parameter in the class constructor is set to `True` This enables referencing back to the original document, particularly useful when a specific chunk is retrieved. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-08 23:09:32 -07:00
Philip Kiely - Baseten	a09a0e3511	Baseten integration (#5862 ) This PR adds a Baseten integration. I've done my best to follow the contributor's guidelines and add docs, an example notebook, and an integration test modeled after similar integrations' test. Please let me know if there is anything I can do to improve the PR. When it is merged, please tag https://twitter.com/basetenco and https://twitter.com/philip_kiely as contributors (the note on the PR template said to include Twitter accounts)	2023-06-08 23:05:57 -07:00
Tamara Lazarevic	0ce8745928	Fix typo (#5894 )	2023-06-08 23:05:22 -07:00
Andrew Grangaard	d8ae925425	arxiv: Correct name of search client attribute to 'arxiv_search' from incorrect 'arxiv_client' (#5917 ) + this private attribute is referenced as `arxiv_search` in internal usage and is set when verifying the environment twitter: @spazm #### Who can review? Any of @hwchase17, @leo-gan, or @bongsang might be interested in reviewing. + Mismatch between `arxiv_client` attribute vs `arxiv_search` in validation and usage is present in the initial commit by @hwchase17. + @leo-gan has made most of the edits. + @bongsang implemented pdf download.	2023-06-08 22:49:11 -07:00
sergiolrinditex	fe8bbc2da7	Create snowflake Loader (#5825 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-08 22:03:00 -07:00
Zander Chase	77c286cf02	Use LCP Client in Tracer (#5908 ) Move the LCP calls to the client.	2023-06-08 21:15:14 -07:00
Frank Hübner	3ec6400d70	Feature/add AWS Kendra Index Retriever (#5856 ) adding a new retriever for AWS Kendra @dev2049 please take a look!	2023-06-08 15:44:09 -07:00
Piyush Jain	a6ebffb695	Fixes model arguments for amazon models (#5896 ) Fixes #5713 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @agola11 @aarora79 @rsgrewal-aws	2023-06-08 14:16:01 -07:00
小铭	767fa91eae	Fix the shortcut conflict for document page search (#5874 ) Fix the document page to open both search and Mendable when pressing Ctrl+K. I have changed the shortcut for Mendable to Ctrl+J. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @hwchase17 Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-08 14:15:19 -07:00
Zander Chase	5f74db4500	Update run eval imports in init (#5858 )	2023-06-08 10:44:36 -07:00
warjiang	511c12dd39	fix: update qa_chain doc for "chai_type" (#5877 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> `load_qa_with_sources_chain` method already support four type of chain, including `map_rerank`. update document to prevent any misunderstandings 😀. ![image](https://github.com/hwchase17/langchain/assets/6478745/325260b2-6121-4900-aef9-001febff811a) <!-- Remove if not applicable --> Fixes # (issue) No, just update document. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @hwchase17 Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-08 07:32:51 -07:00
Harrison Chase	893d20f735	bump version to 194 (#5866 )	2023-06-07 22:47:48 -07:00
Harrison Chase	35cfd25db3	Harrison/nebula graph (#5865 ) Co-authored-by: Wey Gu <weyl.gu@gmail.com> Co-authored-by: chenweisomebody <chenweisomebody@gmail.com>	2023-06-07 21:56:43 -07:00
Harrison Chase	658f8bdee7	Harrison/fauna loader (#5864 ) Co-authored-by: Shadid12 <Shadid12@users.noreply.github.com>	2023-06-07 21:32:23 -07:00
Liang Zhang	5518f24ec3	Implement saving and loading of RetrievalQA chain (#5818 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #3983 Mimicing what we do for saving and loading VectorDBQA chain, I added the logic for RetrievalQA chain. Also added a unit test. I did not find how we test other chains for their saving and loading functionality, so I just added a file with one test case. Let me know if there are recommended ways to test it. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 21:07:13 -07:00
Liang Zhang	b93638ef1e	Refactor and update databricks integration page (#5575 ) # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 20:45:47 -07:00
volodymyr-memsql	a1549901ce	Added SingleStoreDB Vector Store (#5619 ) - Added `SingleStoreDB` vector store, which is a wrapper over the SingleStore DB database, that can be used as a vector storage and has an efficient similarity search. - Added integration tests for the vector store - Added jupyter notebook with the example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:45:33 -07:00
jjzhuo	78aa59c68b	Fix serialization issue with W&B (#5693 ) The chain input_documents are not displaying properly in W&B, due to serialization issue: <img width="1164" alt="Screenshot 2023-06-04 at 11 58 26 AM" src="https://github.com/hwchase17/langchain/assets/134809928/f31f14f6-0935-4cca-9913-6760cd40eadf"> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:44:59 -07:00
Alec Flett	ec0dd6e34a	propagate callbacks to ConversationalRetrievalChain (#5572 ) # Allow callbacks to monitor ConversationalRetrievalChain <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> I ran into an issue where load_qa_chain was not passing the callbacks down to the child LLM chains, and so made sure that callbacks are propagated. There are probably more improvements to do here but this seemed like a good place to stop. Note that I saw a lot of references to callbacks_manager, which seems to be deprecated. I left that code alone for now. ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 20:25:21 -07:00
Jeff Vestal	3294774148	Add knn and query search field options to ElasticKnnSearch (#5641 ) in the `ElasticKnnSearch` class added 2 arguments that were not exposed properly `knn_search` added: - `vector_query_field: Optional[str] = 'vector'` -- vector_query_field: Field name to use in knn search if not default 'vector' `knn_hybrid_search` added: - `vector_query_field: Optional[str] = 'vector'` -- vector_query_field: Field name to use in knn search if not default 'vector' - `query_field: Optional[str] = 'text'` -- query_field: Field name to use in search if not default 'text' Fixes # https://github.com/hwchase17/langchain/issues/5633 cc: @dev2049 @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:19:14 -07:00
Mark Marryatt	cef79ca579	Fix exporting GCP Vertex Matching Engine from vectorstores (#5793 ) The Vertex Matching Engine docs include [the line](`b177a29d3f/docs/modules/indexes/vectorstores/examples/matchingengine.ipynb (L32)`) `from langchain.vectorstores import MatchingEngine` which doesn't work as it wasn't added to the vectorestores module exports. - @dev2049	2023-06-07 19:45:33 -07:00
Dave Ingram	106364a45c	Update to Getting Started docs page for Memory (#5855 ) Simply fixing a small typo in the memory page. Also removed an extra code block at the end of the file. Along the way, the current outputs seem to have changed in a few places so left that for posterity, and updated the number of runs which seems harmless, though I can clean that up if preferred.	2023-06-07 19:45:21 -07:00
bnassivet	9355e3f5f5	qdrant vector store - search with relevancy scores (#5781 ) Implementation of similarity_search_with_relevance_scores for quadrant vector store. As implemented the method is also compatible with other capacities such as filtering. Integration tests updated. #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-07 19:26:40 -07:00
Ning Ren	f15763518a	docs: add Shale Protocol integration guide (#5814 ) This PR adds documentation for Shale Protocol's integration with LangChain. [Shale Protocol](https://shaleprotocol.com) provides forever-free production-ready inference APIs to the open-source community. We have global data centers and plan to support all major open LLMs (estimated ~1,000 by 2025). The team consists of software and ML engineers, AI researchers, designers, and operators across North America and Asia. Combined together, the team has 50+ years experience in machine learning, cloud infrastructure, software engineering and product development. Team members have worked at places like Google and Microsoft. #### Who can review? Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11 --------- Co-authored-by: Karen Sheng <46656667+karensheng@users.noreply.github.com>	2023-06-07 19:25:59 -07:00
Duarte OC	137da7e4b6	Update microsoft loader example with docx2txt dependency (#5832 ) @eyurtsev	2023-06-07 19:21:48 -07:00
Aidan Holland	9f4b720a63	Add additional VertexAI Params (#5837 ) ## Changes - Added the `stop` param to the `_VertexAICommon` class so it can be set at llm initialization ## Example Usage ```python VertexAI( # ... temperature=0.15, max_output_tokens=128, top_p=1, top_k=40, stop=["\n```"], ) ``` ## Possible Reviewers - @hwchase17 - @agola11	2023-06-07 19:20:37 -07:00
Eduard van Valkenburg	76fcd96dae	Add logging in PBI tool (#5841 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Add some logging into the powerbi tool so that you can see the queries being sent to PBI and attempts to correct them. <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @vowelparrot <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 19:19:21 -07:00
Matt Robinson	11fec7d4d1	feat: Add `UnstructuredCSVLoader` for CSV files (#5844 ) ### Summary Adds an `UnstructuredCSVLoader` for loading CSVs. One advantage of using `UnstructuredCSVLoader` relative to the standard `CSVLoader` is that if you use `UnstructuredCSVLoader` in `"elements"` mode, an HTML representation of the table will be available in the metadata. #### Who can review? @hwchase17 @eyurtsev	2023-06-07 19:18:01 -07:00
Soos3D	0b4a51930c	Add how to use a custom scraping function with the sitemap loader. (#5847 ) Hi! I just added an example of how to use a custom scraping function with the sitemap loader. I recently used this feature and had to dig in the source code to find it. I thought it might be useful to other devs to have an example in the Jupyter Notebook directly. I only added the example to the documentation page. @eyurtsev I was not able to run the lint. Please let me know if I have to do anything else. I know this is a very small contribution, but I hope it will be valuable. My Twitter handle is @web3Dav3. <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 19:16:51 -07:00
Yessen Kanapin	c66755b661	Add DeepInfra embeddings integration with tests and examples, better exception handling for Deep Infra LLM (#5854 ) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead - @agola11 --------- Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-06-07 19:14:30 -07:00
ugfly1210	4d8cda1c3b	FIX: backslash escaped (#5815 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> LatexTextSplitter needs to use "\n\\\chapter" when separators are escaped, such as "\n\\\chapter", otherwise it will report an error: (re.error: bad escape \c at position 1 (line 2, column 1)) Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use re.error: bad escape \c at position 1 (line 2, column 1) See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @hwchase17 @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> Co-authored-by: Pang <ugfly@qq.com>	2023-06-07 16:01:07 -07:00
Zander Chase	3af36943e8	Rm extraneous args to the trace group helper (#5801 ) These are being ignored	2023-06-07 13:09:29 -07:00
whysage	8ef7274ee6	feat: issue-5712 add sleep tool (#5715 ) Fixes # 5712 added sleep tool	2023-06-07 09:39:02 -07:00
Zander Chase	d9fcc45d05	Add in the async methods and link the run id (#5810 )	2023-06-07 08:27:44 -07:00
Harrison Chase	ce7c11625f	bump version to 193 (#5838 )	2023-06-07 07:38:57 -07:00
warjiang	5a207cce8f	fix: fullfill openai params when embedding (#5821 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #5822 I upgrade my langchain lib by execute `pip install -U langchain`, and the verion is 0.0.192。But i found that openai.api_base not working. I use azure openai service as openai backend, the openai.api_base is very import for me. I hava compared tag/0.0.192 and tag/0.0.191, and figure out that: ![image](https://github.com/hwchase17/langchain/assets/6478745/e183fdb2-8224-45c9-b3b4-26d62823999a) openai params is moved inside `_invocation_params` function，and used in some openai invoke: ![image](https://github.com/hwchase17/langchain/assets/6478745/5a55a048-5fa9-4bf4-aaef-3902226bec5e) ![image](https://github.com/hwchase17/langchain/assets/6478745/85b8cebc-eeb8-4538-a525-814719c8f8df) but still some case not covered like: ![image](https://github.com/hwchase17/langchain/assets/6478745/e0297620-f2b2-4f4f-98bd-d0ed19022dac) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 07:32:57 -07:00
Harrison Chase	b3ae6bcd3f	bump ver to 192 (#5812 )	2023-06-06 22:23:11 -07:00
Harrison Chase	5468528748	rm docs mongo (#5811 )	2023-06-06 22:22:44 -07:00
Andrew Switlyk	69f4ffb851	Update adding_memory.ipynb (#5806 ) just change "to" to "too" so it matches the above prompt <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-06 22:10:53 -07:00
Sun bin	2be4fbb835	add doc about reusing MongoDBAtlasVectorSearch (#5805 ) DOC: add doc about reusing MongoDBAtlasVectorSearch #### Who can review? Anyone authorized.	2023-06-06 22:10:36 -07:00
bnassivet	062c3c00a2	fixed faiss integ tests (#5808 ) Fixes # 5807 Realigned tests with implementation. Also reinforced folder unicity for the test_faiss_local_save_load test using date-time suffix #### Before submitting - Integration test updated - formatting and linting ok (locally) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead VectorStores / Retrievers / Memory -@dev2049	2023-06-06 22:07:27 -07:00
SvMax	92b87c2fec	added support for different types in ResponseSchema class (#5789 ) I added support for specifing different types with ResponseSchema objects: ## before ` extracted_info = ResponseSchema(name="extracted_info", description="List of extracted information") ` generate the following doc: ```json\n{\n\t\"extracted_info\": string // List of extracted information}``` This brings GPT to create a JSON with only one string in the specified field even if you requested a List in the description. ## now `extracted_info = ResponseSchema(name="extracted_info", type="List[string]", description="List of extracted information") ` generate the following doc: ```json\n{\n\t\"extracted_info\": List[string] // List of extracted information}``` This way the model responds better to the prompt generating an array of strings. Tag maintainers/contributors who might be interested: Agents / Tools / Toolkits @vowelparrot Don't know who can be interested, I suppose this is a tool, so I tagged you vowelparrot, anyway, it's a minor change, and shouldn't impact any other part of the framework.	2023-06-06 22:00:48 -07:00
Harrison Chase	3954bcf396	WIP: openai settings (#5792 ) [] need to test more [] make sure they arent saved when serializing [] do for embeddings	2023-06-06 21:57:58 -07:00
Alex Lee	b7999a9bc1	Add UTF-8 json ouput support while langchain.debug is set to True. (#5802 ) Before: <img width="984" alt="image" src="https://github.com/hwchase17/langchain/assets/4317474/2b0807b4-a1d6-4df2-87cc-92b1c8e10534"> After: <img width="992" alt="image" src="https://github.com/hwchase17/langchain/assets/4317474/128c2c7d-2ed5-4c95-954d-b0964c83526a"> Thanks in advance. @agola11	2023-06-06 21:56:33 -07:00
kourosh hakhamaneshi	a0d847f636	[Docs][Hotfix] Fix broken links (#5800 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Some links were broken from the previous merge. This PR fixes them. Tested locally. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2023-06-06 17:17:16 -07:00
Zander Chase	217b5cc72d	Base RunEvaluator Chain (#5750 ) Clean up a bit and only implement the QA and reference free implementations from https://github.com/hwchase17/langchain/pull/5618	2023-06-06 16:42:15 -07:00
Lance Martin	4092fd21dc	YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772 ) This introduces the `YoutubeAudioLoader`, which will load blobs from a YouTube url and write them. Blobs are then parsed by `OpenAIWhisperParser()`, as show in this [PR](https://github.com/hwchase17/langchain/pull/5580), but we extend the parser to split audio such that each chuck meets the 25MB OpenAI size limit. As shown in the notebook, this enables a very simple UX: ``` # Transcribe the video to text loader = GenericLoader(YoutubeAudioLoader([url],save_dir),OpenAIWhisperParser()) docs = loader.load() ``` Tested on full set of Karpathy lecture videos: ``` # Karpathy lecture videos urls = ["https://youtu.be/VMj-3S1tku0" "https://youtu.be/PaCmpygFfXo", "https://youtu.be/TCH_1BHY58I", "https://youtu.be/P6sfmUTpUmc", "https://youtu.be/q8SA3rM6ckI", "https://youtu.be/t3YJ5hKiMQ0", "https://youtu.be/kCc8FmEb1nY"] # Directory to save audio files save_dir = "~/Downloads/YouTube" # Transcribe the videos to text loader = GenericLoader(YoutubeAudioLoader(urls,save_dir),OpenAIWhisperParser()) docs = loader.load() ```	2023-06-06 15:15:08 -07:00
Gengliang Wang	2a4b32dee2	Revise DATABRICKS_API_TOKEN as DATABRICKS_TOKEN (#5796 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> In the [Databricks integration](https://python.langchain.com/en/latest/integrations/databricks.html) and [Databricks LLM](https://python.langchain.com/en/latest/modules/models/llms/integrations/databricks.html), we suggestted users to set the ENV variable `DATABRICKS_API_TOKEN`. However, this is inconsistent with the other Databricks library. To make it consistent, this PR changes the variable from `DATABRICKS_API_TOKEN` to `DATABRICKS_TOKEN` After changes, there is no more `DATABRICKS_API_TOKEN` in the doc ``` $ git grep DATABRICKS_API_TOKEN\|wc -l 0 $ git grep DATABRICKS_TOKEN\|wc -l 8 ``` cc @hwchase17 @dev2049 @mengxr since you have reviewed the previous PRs.	2023-06-06 14:22:49 -07:00
Paul-Emile Brotons	daf3e99b96	fixing from_documents method of the MongoDB Atlas vector store (#5794 ) FIxed a bug in from_documents method --> Collection objects do not implement truth value testing or bool(). @dev2049	2023-06-06 14:22:23 -07:00
Ankush Gola	b177a29d3f	support returning run info for llms, chat models and chains (#5666 ) returning the run id is important for accessing the run later on	2023-06-06 10:07:46 -07:00
Yoann Poupart	65111eb2b3	Attribute support for html tags (#5782 ) # What does this PR do? Change the HTML tags so that a tag with attributes can be found. ## Before submitting - [x] Tests added - [x] CI/CD validated ### Who can review? Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.	2023-06-06 09:27:37 -07:00
Zander Chase	0cfaa76e45	Set Falsey (#5783 ) Seems natural to try to disable logging by setting `MY_VAR=false` rather than unsetting (especially once you've already set it in the background)	2023-06-06 09:26:38 -07:00
Harrison Chase	2ae2d6cd1d	fix ver 191 (#5784 )	2023-06-06 09:17:23 -07:00
Zander Chase	204a73c1d9	Use client from LCP-SDK (#5695 ) - Remove the client implementation (this breaks backwards compatibility for existing testers. I could keep the stub in that file if we want, but not many people are using it yet - Add SDK as dependency - Update the 'run_on_dataset' method to be a function that optionally accepts a client as an argument - Remove the langchain plus server implementation (you get it for free with the SDK now) We could make the SDK optional for now, but the plan is to use w/in the tracer so it would likely become a hard dependency at some point.	2023-06-06 06:51:05 -07:00
Harrison Chase	08e2352f7b	bump ver 191 (#5766 )	2023-06-05 20:54:08 -07:00
berkedilekoglu	f907b62526	Scores are explained in vectorestore docs (#5613 ) # Scores in Vectorestores' Docs Are Explained Following vectorestores can return scores with similar documents by using `similarity_search_with_score`: - chroma - docarray_hnsw - docarray_in_memory - faiss - myscale - qdrant - supabase - vectara - weaviate However, in documents, these scores were either not explained at all or explained in a way that could lead to misunderstandings (e.g., FAISS). For instance in FAISS document: if we consider the score returned by the function as a similarity score, we understand that a document returning a higher score is more similar to the source document. However, since the scores returned by the function are distance scores, we should understand that smaller scores correspond to more similar documents. For the libraries other than Vectara, I wrote the scores they use by investigating from the source libraries. Since I couldn't be certain about the score metric used by Vectara, I didn't make any changes in its documentation. The links mentioned in Vectara's documentation became broken due to updates, so I replaced them with working ones. VectorStores / Retrievers / Memory - @dev2049 my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:49 -07:00
Adil Ansari	233b52735e	feat: Support for `Tigris` Vector Database for vector search (#5703 ) ### Changes - New vector store integration - [Tigris](https://tigrisdata.com) - Adds [tigrisdb](https://pypi.org/project/tigrisdb/) optional dependency - Example notebook demonstrating usage Fixes #5535 Closes tigrisdata/tigris-client-python#40 #### Twitter handles We'd love a shoutout on our [@TigrisData](https://twitter.com/TigrisData) and [@adilansari](https://twitter.com/adilansari) twitter handles #### Who can review? @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:16 -07:00
Edrick Da Corte Henriquez	38dabdbb3a	Update tutorials.md (#5761 ) # Added an overview of LangChain modules Aimed at introducing newcomers to LangChain's main modules :) Twitter handle is @edrick_dch ## Who can review? @eyurtsev	2023-06-05 20:37:11 -07:00
Ankush Gola	84a46753ab	Tracing Group (#5326 ) Add context manager to group all runs under a virtual parent --------- Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>	2023-06-05 19:18:43 -07:00
Ilya	d5b1608216	fix markdown text splitter horizontal lines (#5625 ) Fixes #5614 #### Issue The `**` combination produces an exception when used as a seperator in `re.split`. Instead `\\\` should be used for regex exprations. #### Who can review? @eyurtsev	2023-06-05 16:40:26 -07:00
Harrison Chase	25487fa5ee	Harrison/youtube multi language (#5758 ) Co-authored-by: rafly lesmana <raflylesmana111@gmail.com>	2023-06-05 16:38:07 -07:00
Shelby Jenkins	2dcda8a8ac	Strips whitespace and \n from loc before filtering urls from sitemap (#5728 ) Fixes #5699 #### Who can review? Tag maintainers/contributors who might be interested: @woodworker @LeSphax @johannhartmann --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 16:33:55 -07:00
Harrison Chase	98dd6d068a	cohere retries (#5757 ) …719) A minor update to retry Cohore API call in case of errors using tenacity as it is done for OpenAI LLMs. #### Who can review? @hwchase17, @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Sagar Sapkota <22609549+sagar-spkt@users.noreply.github.com>	2023-06-05 16:28:58 -07:00
M Waleed Kadous	5124c1e0d9	Add aviary support (#5661 ) Aviary is an open source toolkit for evaluating and deploying open source LLMs. You can find out more about it on [http://github.com/ray-project/aviary). You can try it out at [http://aviary.anyscale.com](aviary.anyscale.com). This code adds support for Aviary in LangChain. To minimize dependencies, it connects directly to the HTTP endpoint. The current implementation is not accelerated and uses the default implementation of `predict` and `generate`. It includes a test and a simple example. @hwchase17 and @agola11 could you have a look at this? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 16:28:42 -07:00
felpigeon	a47c8618ec	Add class attribute "return_generated_question" to class "BaseConversationalRetrievalChain" (#5749 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Adding a class attribute "return_generated_question" to class "BaseConversationalRetrievalChain". If set to `True`, the chain's output has a key "generated_question" with the question generated by the sub-chain `question_generator` as the value. This way the generated question can be logged. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @dev2049 @vowelparrot	2023-06-05 16:10:12 -07:00
Leonid Ganeline	87ad4fc4b2	docs: updated `ecosystem/dependents` (#5753 ) updated `ecosystem/dependents` data (it was updated 2+ weeks ago) #### Who can review? @hwchase17 @eyurtsev @dev2049	2023-06-05 16:09:55 -07:00
Leonid Ganeline	92a5f00ffb	docs: `ecosystem/integrations` update 5 (#5752 ) - added missed integration to `docs/ecosystem/integrations/` - updated notebooks to consistent format: changed titles, file names; added descriptions #### Who can review? @hwchase17 @dev2049	2023-06-05 16:08:55 -07:00
Lance Martin	aea090045b	Create OpenAIWhisperParser for generating Documents from audio files (#5580 ) # OpenAIWhisperParser This PR creates a new parser, `OpenAIWhisperParser`, that uses the [OpenAI Whisper model](https://platform.openai.com/docs/guides/speech-to-text/quickstart) to perform transcription of audio files to text (`Documents`). Please see the notebook for usage.	2023-06-05 15:51:13 -07:00
Hao Chen	a4c9053d40	Integrate Clickhouse as Vector Store (#5650 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> #### Description This PR is mainly to integrate open source version of ClickHouse as Vector Store as it is easy for both local development and adoption of LangChain for enterprises who already have large scale clickhouse deployment. ClickHouse is a open source real-time OLAP database with full SQL support and a wide range of functions to assist users in writing analytical queries. Some of these functions and data structures perform distance operations between vectors, [enabling ClickHouse to be used as a vector database](https://clickhouse.com/blog/vector-search-clickhouse-p1). Recently added ClickHouse capabilities like [Approximate Nearest Neighbour (ANN) indices](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes) support faster approximate matching of vectors and provide a promising development aimed to further enhance the vector matching capabilities of ClickHouse. In LangChain, some ClickHouse based commercial variant vector stores like [Chroma](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/chroma.py) and [MyScale](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/myscale.py), etc are already integrated, but for some enterprises with large scale Clickhouse clusters deployment, it will be more straightforward to upgrade existing clickhouse infra instead of moving to another similar vector store solution, so we believe it's a valid requirement to integrate open source version of ClickHouse as vector store. As `clickhouse-connect` is already included by other integrations, this PR won't include any new dependencies. #### Before submitting <!-- If you're adding a new integration, please include: 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @hwchase17 @dev2049 Could you please help review? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 13:32:04 -07:00
Gustavo Brian	2f2d27fd82	Error in documentation: Chroma constructor (#5731 ) Chroma("langchain_store", embeddings.embed_query) must be Chroma("langchain_store", embeddings)	2023-06-05 13:30:58 -07:00
George Geddes	019eb13681	Fix a typo in the documentation for the Slack document loader (#5745 ) Fixes a typo I noticed while reading the docs.	2023-06-05 13:30:24 -07:00
Andrew Grangaard	450eb91fe2	Removes unnecessary backslash escaping for backticks in python (#5751 ) Fixed python deprecation warning: DeprecationWarning: invalid escape sequence '`' backticks (`) do not have special meaning in python strings and should not be escaped. -- @spazm on twitter ### Who can review: @nfcampos ported this change from javascript, @hwchase17 wrote the original STRUCTURED_FORMAT_INSTRUCTIONS,	2023-06-05 13:30:11 -07:00
Daniel Chalef	0551bc90a5	Zep Hybrid Search (#5742 ) Zep now supports persisting custom metadata with messages and hybrid search across both message embeddings and structured metadata. This PR implements custom metadata and enhancements to the `ZepChatMessageHistory` and `ZepRetriever` classes to implement this support. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-06-05 12:59:28 -07:00
Tomaz Bratanic	a0ea6f6b6b	Cypher search: Check if generated Cypher is provided in backticks (#5541 ) # Check if generated Cypher code is wrapped in backticks Some LLMs like the VertexAI like to explain how they generated the Cypher statement and wrap the actual code in three backticks: ![Screenshot from 2023-06-01 08-08-23](https://github.com/hwchase17/langchain/assets/19948365/1d8eecb3-d26c-4882-8f5b-6a9bc7e93690) I have observed a similar pattern with OpenAI chat models in a conversational settings, where multiple user and assistant message are provided to the LLM to generate Cypher statements, where then the LLM wants to maybe apologize for previous steps or explain its thoughts. Interestingly, both OpenAI and VertexAI wrap the code in three backticks if they are doing any explaining or apologizing. Checking if the generated cypher is wrapped in backticks seems like a low-hanging fruit to expand the cypher search to other LLMs and conversational settings.	2023-06-05 12:48:13 -07:00
Abhijeet Malamkar	1a9ac3b1f9	Adding support to save multiple memories at a time. Cuts save time by … (#5172 ) # Adding support to save multiple memories at a time. Cuts save time by more then half <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 - VectorStores / Retrievers / Memory - @dev2049 --> @dev2049 @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 12:47:48 -07:00
kourosh hakhamaneshi	625717daa8	docs: Added Deploying LLMs into production + a new ecosystem (#4047 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Co-authored-by: Kamil Kaczmarek <kaczmarek.poczta@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 12:47:27 -07:00
Ralph Schlosser	74f8e603d9	Addresses GPT4All wrapper model_type attribute issues #5720 . (#5743 ) Fixes #5720. A more in-depth discussion is in my comment here: https://github.com/hwchase17/langchain/issues/5720#issuecomment-1577047018 In a nutshell, there has been a subtle change in the latest version of GPT4Alls Python bindings. The change I submitted yesterday is compatible with this version, however, this version is as of yet unreleased and thus the code change breaks Langchain's wrapper under the currently released version of GPT4All. This pull request proposes a backwards-compatible solution.	2023-06-05 12:45:29 -07:00
Harrison Chase	d0d89d39ef	bump version to 190 (#5704 )	2023-06-04 20:04:50 -07:00
mheguy-stingray	b64c39dfe7	top_k and top_p transposed in vertexai (#5673 ) Fix transposed properties in vertexai model Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-04 16:59:53 -07:00
Tobias Herbold	3fb0e4872a	sqlalchemy MovedIn20Warning declarative_base DEPRICATION fix (#5676 ) fix for the sqlalchemy deprecated declarative_base import : ``` MovedIn20Warning: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9) Base = declarative_base() # type: Any ``` Import is wrapped in an try catch Block to fallback to the old import if needed. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-04 16:52:52 -07:00
Jens Madsen	8d9e9e013c	refactor: extract token text splitter function (#5179 ) # Token text splitter for sentence transformers The current TokenTextSplitter only works with OpenAi models via the `tiktoken` package. This is not clear from the name `TokenTextSplitter`. In this (first PR) a token based text splitter for sentence transformer models is added. In the future I think we should work towards injecting a tokenizer into the TokenTextSplitter to make ti more flexible. Could perhaps be reviewed by @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-04 14:41:44 -07:00
Nathan Azrak	26ec845921	Raise an exception in MKRL and Chat Output Parsers if parsing text which contains both an action and a final answer (#5609 ) Raises exception if OutputParsers receive a response with both a valid action and a final answer Currently, if an OutputParser receives a response which includes both an action and a final answer, they return a FinalAnswer object. This allows the parser to accept responses which propose an action and hallucinate an answer without the action being parsed or taken by the agent. This PR changes the logic to: 1. store a variable checking whether a response contains the `FINAL_ANSWER_ACTION` (this is the easier condition to check). 2. store a variable checking whether the response contains a valid action 3. if both are present, raise a new exception stating that both are present 4. if an action is present, return an AgentAction 5. if an answer is present, return an AgentAnswer 6. if neither is present, raise the relevant exception based around the action format (these have been kept consistent with the prior exception messages) Disclaimer: * Existing mock data included strings which did include an action and an answer. This might indicate that prioritising returning AgentAnswer was always correct, and I am patching out desired behaviour? @hwchase17 to advice. Curious if there are allowed cases where this is not hallucinating, and we do want the LLM to output an action which isn't taken. * I have not passed `send_to_llm` through this new exception Fixes #5601 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 - project lead @vowelparrot	2023-06-04 14:40:49 -07:00
Lucas Rodrigues	c112d7334d	Update MongoDBChatMessageHistory to create an index on SessionId (#5632 ) All the queries to the database are done based on the SessionId property, this will optimize how Mongo retrieves all messages from a session #### Who can review? Tag maintainers/contributors who might be interested: @dev2049	2023-06-04 14:39:56 -07:00
Jason Weill	6c11f94013	Retitles Bedrock doc to appear in correct alphabetical order in site nav (#5639 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #5638. Retitles "Amazon Bedrock" page to "Bedrock" so that the Integrations section of the left nav is properly sorted in alphabetical order. #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-04 14:39:25 -07:00
Will Smith	6e25e65085	SQL agent : Improved prompt engineering prevents agent guessing database column names. (#5671 ) @vowelparrot: Minor change to the SQL agent: Tells agent to introspect the schema of the most relevant tables, I found this to dramatically decrease the chance that the agent wastes times guessing column names.	2023-06-04 14:39:00 -07:00
Nuhman Pk	8f98592ac9	Added Dependencies Status, Open issues and releases badges in Readme.md (#5681 ) [![Dependency Status](https://img.shields.io/librariesio/github/hwchase17/langchain)](https://libraries.io/github/hwchase17/langchain) [![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues) [![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)	2023-06-04 14:30:52 -07:00
Harrison Chase	b9040669a0	Harrison/pipeline prompt (#5540 ) idea is to make prompts more composable	2023-06-04 14:29:37 -07:00
George Roberts	647210a4b9	Add args_schema to google_places tool (#5680 ) Tiny change to actually add the args_schema to the tool. @vowelparrot	2023-06-04 14:28:46 -07:00
Ralph Schlosser	8fea0529c1	This fixes issue #5651 - GPT4All wrapper loading issue (#5657 ) Fixes #5651 Small typo in wrapper code. Note the `model_type` parameter is currently unused by GPT4All. https://github.com/hwchase17/langchain/issues/5651 #### Who can review?	2023-06-04 07:21:16 -07:00
Jiayao Yu	6a3ceaa377	Support similarity_score_threshold retrieval with Chroma (#5655 ) Fixes https://github.com/hwchase17/langchain/issues/5067 Verified the following code now works correctly: ``` db = Chroma(persist_directory=index_directory(index_name), embedding_function=embeddings) retriever = db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.4}) docs = retriever.get_relevant_documents(query) ```	2023-06-03 16:57:00 -07:00
Hao Chen	3e45b83065	Improve Error Messaging for APOC Procedure Failure in Neo4jGraph (#5547 ) ## Improve Error Messaging for APOC Procedure Failure in Neo4jGraph This commit revises the error message provided when the 'apoc.meta.data()' procedure fails. Previously, the message simply instructed the user to install the APOC plugin in Neo4j. The new error message is more specific. Also removed an unnecessary newline in the Cypher statement variable: `node_properties_query`. Fixes #5545 ## Who can review? - @vowelparrot - @dev2049	2023-06-03 16:56:39 -07:00
Ricardo Reis	33ea606f45	Update youtube.py - Fix metadata validation error in YoutubeLoader (#5479 ) This commit addresses a ValueError occurring when the YoutubeLoader class tries to add datetime metadata from a YouTube video's publish date. The error was happening because the ChromaDB metadata validation only accepts str, int, or float data types. In the `_get_video_info` method of the `YoutubeLoader` class, the publish date retrieved from the YouTube video was of datetime type. This commit fixes the issue by converting the datetime object to a string before adding it to the metadata dictionary. Additionally, this commit introduces error handling in the `_get_video_info` method to ensure that all metadata fields have valid values. If a metadata field is found to be None, a default value is assigned. This prevents potential errors during metadata validation when metadata fields are None. The file modified in this commit is youtube.py. # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-03 16:56:17 -07:00
Shuqian	5af2c51e78	refactor: BaseStringMessagePromptTemplate from_template method (#5332 ) # refactor BaseStringMessagePromptTemplate from_template method Refactor the `from_template` method of the `BaseStringMessagePromptTemplate` class to allow passing keyword arguments to the `from_template` method of `PromptTemplate`. Enable the usage of arguments like `template_format`. In my scenario, I intend to utilize Jinja2 for formatting the human message prompt in the chat template. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Models - @hwchase17 - @agola11 - @jonasalexander --> --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 16:55:58 -07:00
mbchang	d3bdb8ea6d	FileCallbackHandler (#5589 ) # like [StdoutCallbackHandler](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/stdout.py), but writes to a file When running experiments I have found myself wanting to log the outputs of my chains in a more lightweight way than using WandB tracing. This PR contributes a callback handler that writes to file what `StdoutCallbackHandler` would print. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> ## Example Notebook <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> See the included `filecallbackhandler.ipynb` notebook for usage. Would it be better to include this notebook under `modules/callbacks` or under `integrations/`? ![image](https://github.com/hwchase17/langchain/assets/6439365/c624de0e-343f-4eab-a55b-8808a887489f) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-03 16:48:48 -07:00
rajib	1c51d3db0f	Created fix for 5475 (#5659 ) Created fix for 5475 Currently in PGvector, we do not have any function that returns the instance of an existing store. The from_documents always adds embeddings and then returns the store. This fix is to add a function that will return the instance of an existing store Also changed the jupyter example for PGVector to show the example of using the function <!-- Remove if not applicable --> Fixes # 5475 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @dev2049 @hwchase17 Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 16:47:52 -07:00
Michael Landis	475007d63a	fix: correct momento chat history notebook typo and title (#5646 ) This PR corrects a minor typo in the Momento chat message history notebook and also expands the title from "Momento" to "Momento Chat History", inline with other chat history storage providers. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? cc @dev2049 who reviewed the original integration	2023-06-03 16:39:27 -07:00
Paul-Emile Brotons	92f218207b	removing client+namespace in favor of collection (#5610 ) removing client+namespace in favor of collection for an easier instantiation and to be similar to the typescript library @dev2049	2023-06-03 16:27:31 -07:00
Harrison Chase	ad09367a92	Harrison/pubmed integration (#5664 ) Co-authored-by: younis basher <71520361+younis-ba@users.noreply.github.com> Co-authored-by: Younis Bashir <younis@omicmd.com>	2023-06-03 16:25:28 -07:00
Harrison Chase	9921f8cc3a	Harrison/update azure nb (#5665 ) Co-authored-by: NEWTON MALLICK <38786893+N-E-W-T-O-N@users.noreply.github.com>	2023-06-03 16:25:08 -07:00
C.J. Jameson	4e71a1702b	nit: pgvector python example notebook, fix variable reference (#5595 ) # Your PR Title (What it does) Fixes the pgvector python example notebook : one of the variables was not referencing anything ## Before submitting ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-03 15:29:34 -07:00
Leonid Ganeline	b201cfaa0f	docs `ecosystem/integrations` update 4 (#5590 ) # docs `ecosystem/integrations` update 4 Added missed integrations. Fixed inconsistencies. ## Who can review? @hwchase17 @dev2049	2023-06-03 15:29:03 -07:00
Davis Chase	ae3611730a	handle single arg to and/or (#5637 ) @ryderwishart @eyurtsev thoughts on handling this in the parser itself? related to #5570	2023-06-03 15:18:46 -07:00
khallbobo	934319fc28	Add parameters to send_message() call for vertexai chat models (PaLM2) (#5566 ) # Ensure parameters are used by vertexai chat models (PaLM2) The current version of the google aiplatform contains a bug where parameters for a chat model are not used as intended. See https://github.com/googleapis/python-aiplatform/issues/2263 Params can be passed both to start_chat() and send_message(); however, the parameters passed to start_chat() will not be used if send_message() is called without the overrides. This is due to the defaults in send_message() being global values rather than None (there is code in send_message() which would use the params from start_chat() if the param passed to send_message() evaluates to False, but that won't happen as the defaults are global values). Fixes # 5531 @hwchase17 @agola11	2023-06-03 15:17:38 -07:00
UmerHA	44ad9628c9	QuickFix for FinalStreamingStdOutCallbackHandler: Ignore new lines & white spaces (#5497 ) # Make FinalStreamingStdOutCallbackHandler more robust by ignoring new lines & white spaces `FinalStreamingStdOutCallbackHandler` doesn't work out of the box with `ChatOpenAI`, as it tokenized slightly differently than `OpenAI`. The response of `OpenAI` contains the tokens `["\nFinal", " Answer", ":"]` while `ChatOpenAI` contains `["Final", " Answer", ":"]`. This PR make `FinalStreamingStdOutCallbackHandler` more robust by ignoring new lines & white spaces when determining if the answer prefix has been reached. Fixes #5433 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Tracing / Callbacks - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589	2023-06-03 15:05:58 -07:00
Nathan Azrak	1f4abb265a	Adds the option to pass the original prompt into the AgentExecutor for PlanAndExecute agents (#5401 ) # Adds the option to pass the original prompt into the AgentExecutor for PlanAndExecute agents This PR allows the user to optionally specify that they wish for the original prompt/objective to be passed into the Executor agent used by the PlanAndExecute agent. This solves a potential problem where the plan is formed referring to some context contained in the original prompt, but which is not included in the current prompt. Currently, the prompt format given to the Executor is: ``` System: Respond to the human as helpfully and accurately as possible. You have access to the following tools: <Tool and Action Description> <Output Format Description> Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:. Thought: Human: <Previous steps> <Current step> ``` This PR changes the final part after `Human:` to optionally insert the objective: ``` Human: <objective> <Previous steps> <Current step> ``` I have given a specific example in #5400 where the context of a database path is lost, since the plan refers to the "given path". The PR has been linted and formatted. So that existing behaviour is not changed, I have defaulted the argument to `False` and added it as the last argument in the signature, so it does not cause issues for any users passing args positionally as opposed to using keywords. Happy to take any feedback or make required changes! Fixes #5400 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot --------- Co-authored-by: Nathan Azrak <nathan.azrak@gmail.com>	2023-06-03 14:59:09 -07:00
Felipe Ferreira	ae2cf1f598	Implements support for Personal Access Token Authentication in the ConfluenceLoader (#5385 ) # Implements support for Personal Access Token Authentication in the ConfluenceLoader Fixes #5191 Implements a new optional parameter for the ConfluenceLoader: `token`. This allows the use of personal access authentication when using the on-prem server version of Confluence. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @Jflick58 Twitter Handle: felipe_yyc --------- Co-authored-by: Felipe <feferreira@ea.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 14:57:49 -07:00
Gardner Bickford	b81f98b8a6	Update confluence.py to return spaces between elements (#5383 ) # Update confluence.py to return spaces between elements like headers and links. Please see https://stackoverflow.com/questions/48913975/how-to-return-nicely-formatted-text-in-beautifulsoup4-when-html-text-is-across-m Given: ```html <address> 183 Main St<br>East Copper<br>Massachusetts<br>U S A<br> MA 01516-113 </address> ``` The document loader currently returns: ``` '183 Main StEast CopperMassachusettsU S A MA 01516-113' ``` After this change, the document loader will return: ``` 183 Main St East Copper Massachusetts U S A MA 01516-113 ``` @eyurtsev would you prefer this to be an option that can be passed in?	2023-06-03 14:57:25 -07:00
Zeeland	b72401b47b	pref: reduce DB query error rate (#5339 ) # Reduce DB query error rate If you use sql agent of `SQLDatabaseToolkit` to query data, it is prone to errors in query fields and often uses fields that do not exist in database tables for queries. However, the existing prompt does not effectively make the agent aware that there are problems with the fields they query. At this time, we urgently need to improve the prompt so that the agent realizes that they have queried non-existent fields and allows them to use the `schema_sql_db`, that is,` ListSQLDatabaseTool` first queries the corresponding fields in the table in the database, and then uses `QuerySQLDatabaseTool` for querying. There is a demo of my project to show this problem. Original Agent ```python def create_mysql_kit(): db = SQLDatabase.from_uri("mysql+pymysql://xxxxxxx") llm = OpenAI(temperature=0) toolkit = SQLDatabaseToolkit(db=db, llm=llm) agent_executor = create_sql_agent( llm=OpenAI(temperature=0), toolkit=toolkit, verbose=True ) agent_executor.run("Who are the users of sysuser in this system? Tell me the username of all users") if __name__ == '__main__': create_mysql_kit() ``` original output ``` > Entering new AgentExecutor chain... Action: list_tables_sql_db Action Input: "" Observation: app_sysrole_menus, app_bimfacemodel, app_project_users, app_measuringpointdata, auth_user, auth_user_groups, django_apscheduler_djangojobexecution, app_project, app_elementpoint, django_apscheduler_djangojob, django_content_type, app_sysrole, django_admin_log, app_bimfaceaccount, app_measuringpoint_warning_thresholds, app_measuringpoint, app_company, auth_group_permissions, app_sysuser, app_sysuser_companies, app_sysmenu, app_datawarningthreshold, auth_group, auth_permission, app_datawarningrecord, auth_user_user_permissions, app_bimfaceaccount_bimface_models, django_migrations, app_measuringitem, django_session Thought: I should query the app_sysuser table to get the users in the system. Action: query_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: SELECT username FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT name FROM app_sysuser LIMIT 10; Observation: SELECT name FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT name FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'name' in 'field list'") [SQL: SELECT name FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT id, username FROM app_sysuser LIMIT 10; Observation: SELECT id, username FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT id, username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'id' in 'field list'") [SQL: SELECT id, username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT user_id, username FROM app_sysuser LIMIT 10; Observation: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT user_id, user_name FROM app_sysuser LIMIT 10; Observation: SELECT user_id, user_name FROM app_sysuser WHERE user_id NOT IN (NULL) AND user_name BETWEEN 'A' AND 'Z' UNION ALL SELECT user_id, user_name FROM app_sysuser WHERE user_id IS NOT NULL AND user_name LIKE '%a%' LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT user_id, user_name FROM app_sysuser WHERE user_id NOT IN (NULL) AND user_name BETWEEN 'A' AND 'Z' UNION ALL SELECT user_id, user_name FROM app_sysuser WHERE user_id IS NOT NULL AND user_name LIKE '%a%' LIMIT 10; Observation: [] Thought: I now know the final answer. Final Answer: There are no users in the sysuser table. > Finished chain. Process finished with exit code 0 ``` As you can see, agent always use incorrect sql to query the data. Actually, the username field name of my `app_sysuser` table is `user_name`, not the `username` believed by the agent. Optimized agent output ``` > Entering new AgentExecutor chain... Action: list_tables_sql_db Action Input: "" Observation: app_project_users, django_content_type, app_measuringpoint_warning_thresholds, app_bimfaceaccount_bimface_models, django_session, app_sysuser, django_migrations, app_bimfacemodel, app_elementpoint, app_measuringpoint, django_apscheduler_djangojobexecution, auth_permission, app_sysuser_companies, app_measuringpointdata, app_measuringitem, app_sysrole_menus, django_admin_log, auth_group, django_apscheduler_djangojob, app_sysmenu, app_project, app_bimfaceaccount, app_datawarningthreshold, app_datawarningrecord, auth_user_groups, auth_user_user_permissions, auth_group_permissions, auth_user, app_company, app_sysrole Thought: I should query the app_sysuser table to get the usernames of all the users. Action: query_sql_db Action Input: SELECT username FROM app_sysuser Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should use the schema_sql_db tool to get the correct table fields. Action: schema_sql_db Action Input: app_sysuser Observation: CREATE TABLE app_sysuser ( user_id INTEGER NOT NULL AUTO_INCREMENT, user_account VARCHAR(50) COLLATE utf8mb4_general_ci NOT NULL, user_password VARCHAR(255) COLLATE utf8mb4_general_ci NOT NULL, user_name VARCHAR(50) COLLATE utf8mb4_general_ci NOT NULL, user_avatar VARCHAR(255) COLLATE utf8mb4_general_ci, user_email VARCHAR(254) COLLATE utf8mb4_general_ci, user_phone VARCHAR(20) COLLATE utf8mb4_general_ci, user_create_time DATETIME(6) NOT NULL, user_update_time DATETIME(6) NOT NULL, user_status SMALLINT, role_id_id INTEGER, PRIMARY KEY (user_id), CONSTRAINT app_sysuser_role_id_id_51b50f39_fk_app_sysrole_role_id FOREIGN KEY(role_id_id) REFERENCES app_sysrole (role_id) )COLLATE utf8mb4_general_ci DEFAULT CHARSET=utf8mb4 ENGINE=InnoDB /* 3 rows from app_sysuser table: user_id user_account user_password user_name user_avatar user_email user_phone user_create_time user_update_time user_status role_id_id xxxxxxxxxxxxxx */ Thought: I should query the app_sysuser table to get the usernames of all the users. Action: query_sql_db Action Input: SELECT user_account FROM app_sysuser LIMIT 10 Observation: [('baiyun',), ('eatrice',), ('lisi',), ('pingxiang',), ('wangwu',), ('zeeland',), ('zsj',), ('zzw',)] Thought: I now know the final answer Final Answer: The usernames of the users in the sysuser table are baiyun, eatrice, lisi, pingxiang, wangwu, zeeland, zsj, and zzw. > Finished chain. Process finished with exit code 0 ``` I have tested about 10 related prompts and they all work properly, with a much lower error rate compared to before ## Who can review? @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 14:55:56 -07:00
mbchang	ce6dbe41a9	minor refactor GenerativeAgentMemory (#5315 ) # minor refactor of GenerativeAgentMemory <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> - refactor `format_memories_detail` to be more reusable - modified prompts for getting topics for reflection and for generating insights - update `characters.ipynb` to reflect changes ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @vowelparrot @hwchase17 @dev2049	2023-06-03 14:53:14 -07:00
Leonid Ganeline	95c6ed0568	docs: `modules` pages simplified (#5116 ) # docs: modules pages simplified Fixied #5627 issue Merged several repetitive sections in the `modules` pages. Some texts, that were hard to understand, were also simplified. ## Who can review? @hwchase17 @dev2049	2023-06-03 14:44:32 -07:00
Chandan Routray	bc875a9df1	Fixed multi input prompt for MapReduceChain (#4979 ) # Fixed multi input prompt for MapReduceChain Added `kwargs` support for inner chains of `MapReduceChain` via `from_params` method Currently the `from_method` method of intialising `MapReduceChain` chain doesn't work if prompt has multiple inputs. It happens because it uses `StuffDocumentsChain` and `MapReduceDocumentsChain` underneath, both of them require specifying `document_variable_name` if `prompt` of their `llm_chain` has more than one `input`. With this PR, I have added support for passing their respective `kwargs` via the `from_params` method. ## Fixes https://github.com/hwchase17/langchain/issues/4752 ## Who can review? @dev2049 @hwchase17 @agola11 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-06-03 14:41:03 -07:00
Matt Robinson	a97e4252e3	feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617 ) # Unstructured Excel Loader Adds an `UnstructuredExcelLoader` class for `.xlsx` and `.xls` files. Works with `unstructured>=0.6.7`. A plain text representation of the Excel file will be available under the `page_content` attribute in the doc. If you use the loader in `"elements"` mode, an HTML representation of the Excel file will be available under the `text_as_html` metadata key. Each sheet in the Excel document is its own document. ### Testing ```python from langchain.document_loaders import UnstructuredExcelLoader loader = UnstructuredExcelLoader( "example_data/stanley-cups.xlsx", mode="elements" ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	2023-06-03 12:44:12 -07:00
Leonid Ganeline	9a7488a5ce	fix import issue (#5636 ) # fix for the import issue Added document loader classes from [`figma`, `iugu`, `onedrive_file`] to `document_loaders/__inti__.py` imports Also sorted `__all__` Fixed #5623 issue	2023-06-02 14:58:41 -07:00
Zander Chase	20ec1173f4	Update Tracer Auth / Reduce Num Calls (#5517 ) Update the session creation and calls --------- Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-06-02 12:13:56 -07:00
Sean Morgan	949729ff5c	Fix bedrock llm boto3 client instantiation (#5629 ) Same issue as https://github.com/hwchase17/langchain/pull/5574	2023-06-02 12:04:49 -07:00
Caleb Ellington	c5a7a85a4e	fix chroma update_document to embed entire documents, fixes a characer-wise embedding bug (#5584 ) # Chroma update_document full document embeddings bugfix Chroma update_document takes a single document, but treats the page_content sting of that document as a list when getting the new document embedding. This is a two-fold problem, where the resulting embedding for the updated document is incorrect (it's only an embedding of the first character in the new page_content) and it calls the embedding function for every character in the new page_content string, using many tokens in the process. Fixes #5582 Co-authored-by: Caleb Ellington <calebellington@Calebs-MBP.hsd1.ca.comcast.net>	2023-06-02 11:12:48 -07:00
Davis Chase	3c6fa9126a	bump 189 (#5620 )	2023-06-02 09:09:22 -07:00
Davis Chase	d784401215	Dev2049/add argilla callback (#5621 ) Co-authored-by: Alvaro Bartolome <alvarobartt@gmail.com> Co-authored-by: Daniel Vila Suero <daniel@argilla.io> Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com> Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>	2023-06-02 09:05:06 -07:00
Kacper Łukawski	71a7c16ee0	Fix: Qdrant ids (#5515 ) # Fix Qdrant ids creation There has been a bug in how the ids were created in the Qdrant vector store. They were previously calculated based on the texts. However, there are some scenarios in which two documents may have the same piece of text but different metadata, and that's a valid case. Deduplication should be done outside of insertion. It has been fixed and covered with the integration tests. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-02 08:57:34 -07:00
Jeff Vestal	d1f65d8dc1	Es knn index search 5346 (#5569 ) # Create elastic_vector_search.ElasticKnnSearch class This extends `langchain/vectorstores/elastic_vector_search.py` by adding a new class `ElasticKnnSearch` Features: - Allow creating an index with the `dense_vector` mapping compataible with kNN search - Store embeddings in index for use with kNN search (correct mapping creates HNSW data structure) - Perform approximate kNN search - Perform hybrid BM25 (`query{}`) + kNN (`knn{}`) search - perform knn search by either providing a `query_vector` or passing a hosted `model_id` to use query_vector_builder to automatically generate a query_vector at search time Connection options - Using `cloud_id` from Elastic Cloud - Passing elasticsearch client object search options - query - k - query_vector - model_id - size - source - knn_boost (hybrid search) - query_boost (hybrid search) - fields This also adds examples to `docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb` Fixes # [5346](https://github.com/hwchase17/langchain/issues/5346) cc: @dev2049 --> --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-02 08:40:35 -07:00
Davis Chase	8b3df18bcc	human approval callback (#5581 ) ![Screenshot 2023-06-01 at 2 39 40 PM](https://github.com/hwchase17/langchain/assets/130488702/769f1480-7e51-46d9-bcde-698d0b091803)	2023-06-02 06:59:33 -07:00
Zander Chase	6655f43282	Rm Template Title (#5616 ) Remove the redundant title from the PR template #### Before submitting	2023-06-02 06:54:55 -07:00
Bharat Ramanathan	28d6277396	docs(integration): update colab and external links in WandbTracing docs (#5602 ) # Update Wandb Tracking documentation This PR updates the Wandb Tracking documentation for formatting, updated broken links and colab notebook links --------- Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>	2023-06-02 02:58:42 -07:00
Waldecir Santos	db45970a66	Fix SQLAlchemy truncating text when it is too big (#5206 ) # Fixes SQLAlchemy truncating the result if you have a big/text column with many chars. SQLAlchemy truncates columns if you try to convert a Row or Sequence to a string directly For comparison: - Before: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ... (2 characters truncated) ... hat is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` - After: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: I'm not sure who to tag for chains, maybe @vowelparrot ?	2023-06-01 21:33:31 -04:00
Davis Chase	4c572ffe95	nit (#5578 )	2023-06-01 14:21:15 -07:00
sseide	001b147450	Documentation fixes (linting and broken links) (#5563 ) # Lint sphinx documentation and fix broken links This PR lints multiple warnings shown in generation of the project documentation (using "make docs_linkcheck" and "make docs_build"). Additionally documentation internal links to (now?) non-existent files are modified to point to existing documents as it seemed the new correct target. The documentation is not updated content wise. There are no source code changes. Fixes # (issue) - broken documentation links to other files within the project - sphinx formatting (linting) ## Before submitting No source code changes, so no new tests added. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 13:06:17 -07:00
Sean Morgan	8441cff1d7	Fix bedrock auth validation (#5574 ) https://github.com/hwchase17/langchain/pull/5523 has a small bug if client was not passed in constructor	2023-06-01 12:35:06 -07:00
Andrew Lei	6258f72a00	Add missing comma in conv chat agent prompt json (#5573 ) # Add missing comma in conversational chat agent prompt json Inspired by: https://github.com/hwchase17/langchainjs/pull/1498	2023-06-01 12:12:44 -07:00
Ikko Eltociear Ashimine	14a611775c	Fix typo in docugami.ipynb (#5571 ) # Fix typo in docugami.ipynb Fixed typo. infromation -> information	2023-06-01 11:45:56 -07:00
Blithe	80b3fdf2f7	make the elasticsearch api support version which below 8.x (#5495 ) the api which create index or search in the elasticsearch below 8.x is different with 8.x. When use the es which below 8.x , it will throw error. I fix the problem Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>	2023-06-01 10:58:20 -07:00
Davis Chase	6632188606	bump 188 (#5568 )	2023-06-01 08:50:54 -07:00
Davis Chase	6afb463e9b	Qdrant self query (#5567 ) Add self query abilities to qdrant vectorstore	2023-06-01 08:40:31 -07:00
Patrick Keane	47c2ec2d0b	Corrects inconsistently misspelled variable name. (#5559 ) Corrects a spelling error (of the word separator) in several variable names. Three cut/paste instances of this were corrected, amidst instances of it also being named properly, which would likely would lead to issues for someone in the future. Here is one such example: ``` seperators = self.get_separators_for_language(Language.PYTHON) super().__init__(separators=seperators, kwargs) ``` becomes ``` separators = self.get_separators_for_language(Language.PYTHON) super().__init__(separators=separators, kwargs) ``` Make test results below: ``` ============================== 708 passed, 52 skipped, 27 warnings in 11.70s ============================== ```	2023-06-01 10:27:58 -04:00
Harrison Chase	342b671d05	add brave search util (#5538 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 01:11:51 -07:00
Davis Chase	983a213bdc	add maxcompute (#5533 ) cc @pengwork (fresh branch, no creds)	2023-06-01 00:54:42 -07:00
Bharat Ramanathan	22603d19e0	feat(integrations): Add WandbTracer (#4521 ) # WandbTracer This PR adds the `WandbTracer` and deprecates the existing `WandbCallbackHandler`. Added an example notebook under the docs section alongside the `LangchainTracer` Here's an example [colab](https://colab.research.google.com/drive/1pY13ym8ENEZ8Fh7nA99ILk2GcdUQu0jR?usp=sharing) with the same notebook and the [trace](https://wandb.ai/parambharat/langchain-tracing/runs/8i45cst6) generated from the colab run Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 00:01:19 -07:00
Leonid Ganeline	373ad49157	docs `ecosystem/integrations` update 3 (#5470 ) # docs: `ecosystem_integrations` update 3 Next cycle of updating the `ecosystem/integrations` * Added an integration `template` file * Added missed integration files * Fixed several document_loaders/notebooks ## Who can review? Is it possible to assign somebody to review PRs on docs? Thanks.	2023-05-31 17:54:05 -07:00
Aditi Viswanathan	bc66b3fb8d	make BaseEntityStore inherit from BaseModel (#5478 ) # Make BaseEntityStore inherit from BaseModel This enables initializing InMemoryEntityStore by optionally passing in a value for the store field. ## Who can review? It's a small change so I think any of the reviewers can review, but tagging @dev2049 who seems most relevant since the change relates to Memory.	2023-05-31 17:32:19 -07:00
Sheng Han Lim	3bae595182	Add texts with embeddings to PGVector wrapper (#5500 ) Similar to #1813 for faiss, this PR is to extend functionality to pass text and its vector pair to initialize and add embeddings to the PGVector wrapper. Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @dev2049	2023-05-31 17:31:52 -07:00
Tobias van der Werff	8d07ba0d51	Fix wrong class instantiation in docs MMR example (#5501 ) # Fix wrong class instantiation in docs MMR example <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> When looking at the Maximal Marginal Relevance ExampleSelector example at https://python.langchain.com/en/latest/modules/prompts/example_selectors/examples/mmr.html, I noticed that there seems to be an error. Initially, the `MaxMarginalRelevanceExampleSelector` class is used as an `example_selector` argument to the `FewShotPromptTemplate` class. Then, according to the text, a comparison is made to regular similarity search. However, the `FewShotPromptTemplate` still uses the `MaxMarginalRelevanceExampleSelector` class, so the output is the same. To fix it, I added an instantiation of the `SemanticSimilarityExampleSelector` class, because this seems to be what is intended. ## Who can review? @hwchase17	2023-05-31 17:30:59 -07:00
Taras Tsugrii	b61f50665e	[retrievers][knn] Replace loop appends with list comprehension. (#5529 ) # Replace loop appends with list comprehension. It's much faster, more idiomatic and slightly more readable.	2023-05-31 16:57:24 -07:00
Taras Tsugrii	0ad76c3380	Replace loop appends with list comprehension. (#5528 ) # Replace loop appends with list comprehension. It's significantly faster because it avoids repeated method lookup. It's also more idiomatic and readable.	2023-05-31 16:56:13 -07:00
Timothy Ji	bd9e0f3934	Add param requests_kwargs for WebBaseLoader (#5485 ) # Add param `requests_kwargs` for WebBaseLoader Fixes # (issue) #5483 ## Who can review? @eyurtsev	2023-05-31 15:27:38 -07:00
Taras Tsugrii	359fb8fa3a	Replace list comprehension with generator. (#5526 ) # Replace list comprehension with generator. Since these strings can be fairly long, it's best to not construct unnecessary temporary list just to pass it to `join`. Generators produce items one-by-one and even though they are slightly more expensive than lists in terms of CPU they are much more memory-friendly and slightly more readable.	2023-05-31 15:10:43 -07:00
Matt Robinson	4c8aad0d1b	docs: unstructured no longer requires installing detectron2 from source (#5524 ) # Update Unstructured docs to remove the `detectron2` install instructions Removes `detectron2` installation instructions from the Unstructured docs because installing `detectron2` is no longer required for `unstructured>=0.7.0`. The `detectron2` model now runs using the ONNX runtime. ## Who can review? @hwchase17 @eyurtsev	2023-05-31 15:03:21 -07:00
Rithwik Ediga Lakhamsani	d765d77e9b	Add minor fixes for PySpark Document Loader Docs (#5525 ) # Add minor fixes for PySpark Document Loader Docs Renamed "PySpack" to "PySpark" and executed the notebook to show outputs.	2023-05-31 15:02:57 -07:00
Taras Tsugrii	af41cdfc8b	Replace enumerate with zip. (#5527 ) # Replace enumerate with zip. It's more idiomatic and slightly more readable.	2023-05-31 15:02:23 -07:00
James O'Dwyer	226a7521ed	Add Managed Motorhead (#5507 ) # Add Managed Motorhead This change enabled MotorheadMemory to utilize Metal's managed version of Motorhead. We can easily enable this by passing in a `api_key` and `client_id` in order to hit the managed url and access the memory api on Metal. Twitter: [@softboyjimbo](https://twitter.com/softboyjimbo) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 14:55:41 -07:00
Piyush Jain	5ffa924488	Skips creating boto client for Bedrock if passed in constructor (#5523 ) # Skips creating boto client if passed in constructor Current LLM and Embeddings class always creates a new boto client, even if one is passed in a constructor. This blocks certain users from passing in externally created boto clients, for example in SSO authentication. ## Who can review? @hwchase17 @jasondotparse @rsgrewal-aws <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-31 14:54:12 -07:00
Leonid Ganeline	6b47aaab82	added DeepLearing.AI course link (#5518 ) # added DeepLearing.AI course link ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: not @hwchase17 - hehe	2023-05-31 14:53:14 -07:00
Víctor Navarro Aránguiz	f39340ff6b	Add allow_download as attribute for GPT4All (#5512 ) # Added support for download GPT4All model if does not exist I've include the class attribute `allow_download` to the GPT4All class. By default, `allow_download` is set to False. ## Changes Made - Added a new attribute `allow_download` to the GPT4All class. - Updated the `validate_environment` method to pass the `allow_download` parameter to the GPT4All model constructor. ## Context This change provides more control over model downloading in the GPT4All class. Previously, if the model file was not found in the cache directory `~/.cache/gpt4all/`, the package returned error "Failed to retrieve model (type=value_error)". Now, if `allow_download` is set as True then it will use GPT4All package to download it . With the addition of the `allow_download` attribute, users can now choose whether the wrapper is allowed to download the model or not. ## Dependencies There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package. ## Testing Since this is a minor change to the existing behavior, the existing test suite for the GPT4All package should cover this scenario Co-authored-by: Vokturz <victornavarrrokp47@gmail.com>	2023-05-31 13:32:31 -07:00
Zander Chase	ea09c0846f	Add Feedback Methods + Evaluation examples (#5166 ) Add CRUD methods to interact with feedback endpoints + added eval examples to the notebook	2023-05-31 11:14:27 -07:00
Davis Chase	46b7181f13	bump 187 (#5504 )	2023-05-31 07:35:09 -07:00
Harrison Chase	f0ea77b230	add more vars to text splitter (#5503 )	2023-05-31 07:21:20 -07:00
Piyush Jain	562fdfc8f9	Bedrock llm and embeddings (#5464 ) # Bedrock LLM and Embeddings This PR adds a new LLM and an Embeddings class for the [Bedrock](https://aws.amazon.com/bedrock) service. The PR also includes example notebooks for using the LLM class in a conversation chain and embeddings usage in creating an embedding for a query and document. Note: AWS is doing a private release of the Bedrock service on 05/31/2023; users need to request access and added to an allowlist in order to start using the Bedrock models and embeddings. Please use the [Bedrock Home Page](https://aws.amazon.com/bedrock) to request access and to learn more about the models available in Bedrock. <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-31 07:17:01 -07:00
Harrison Chase	5ce74b5958	code splitter docs (#5480 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 07:11:53 -07:00
Harrison Chase	470b2822a3	Add matching engine vectorstore (#3350 ) Co-authored-by: Tom Piaggio <tomaspiaggio@google.com> Co-authored-by: scafati98 <jupyter@matchingengine.us-central1-a.c.scafati-joonix.internal> Co-authored-by: scafati98 <scafatieugenio@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:28:02 -07:00
Kacper Łukawski	8bcaca435a	Feature: Qdrant filters supports (#5446 ) # Support Qdrant filters Qdrant has an [extensive filtering system](https://qdrant.tech/documentation/concepts/filtering/) with rich type support. This PR makes it possible to use the filters in Langchain by passing an additional param to both the `similarity_search_with_score` and `similarity_search` methods. ## Who can review? @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:26:16 -07:00
Harrison Chase	f72bb966f8	Harrison/html splitter (#5468 ) Co-authored-by: David Revillas <26328973+r3v1@users.noreply.github.com>	2023-05-30 21:06:07 -07:00
Ankush Gola	1671c2afb2	py tracer fixes (#5377 )	2023-05-30 18:47:06 -07:00
Jose Ignacio Hervás Díaz	ce8b7a2a69	SQLite-backed Entity Memory (#5129 ) # SQLite-backed Entity Memory Following the initiative of https://github.com/hwchase17/langchain/pull/2397 I think it would be helpful to be able to persist Entity Memory on disk by default Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 18:39:47 -07:00
Jeff Vestal	46e181aa8b	Allow ElasticsearchEmbeddings to create a connection with ES Client object (#5321 ) This PR adds a new method `from_es_connection` to the `ElasticsearchEmbeddings` class allowing users to use Elasticsearch clusters outside of Elastic Cloud. Users can create an Elasticsearch Client object and pass that to the new function. The returned object is identical to the one returned by calling `from_credentials` ``` # Create Elasticsearch connection es_connection = Elasticsearch( hosts=['https://es_cluster_url:port'], basic_auth=('user', 'password') ) # Instantiate ElasticsearchEmbeddings using es_connection embeddings = ElasticsearchEmbeddings.from_es_connection( model_id, es_connection, ) ``` I also added examples to the elasticsearch jupyter notebook Fixes # https://github.com/hwchase17/langchain/issues/5239 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 17:26:30 -07:00
Mark Pors	0a44bfdca3	Allow for async use of SelfAskWithSearchChain (#5394 ) # Allow for async use of SelfAskWithSearchChain Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 17:02:39 -07:00
Víctor Navarro Aránguiz	8121e04200	added n_threads functionality for gpt4all (#5427 ) # Added support for modifying the number of threads in the GPT4All model I have added the capability to modify the number of threads used by the GPT4All model. This allows users to adjust the model's parallel processing capabilities based on their specific requirements. ## Changes Made - Updated the `validate_environment` method to set the number of threads for the GPT4All model using the `values["n_threads"]` parameter from the `GPT4All` class constructor. ## Context Useful in scenarios where users want to optimize the model's performance by leveraging multi-threading capabilities. Please note that the `n_threads` parameter was included in the `GPT4All` class constructor but was previously unused. This change ensures that the specified number of threads is utilized by the model . ## Dependencies There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package. ## Testing Since this is a minor change testing is not required. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:31:30 -07:00
Blithe	e31705b5ab	convert the parameter 'text' to uppercase in the function 'parse' of the class BooleanOutputParser (#5397 ) when the LLMs output 'yes\|no'，BooleanOutputParser can parse it to 'True\|False', fix the ValueError in parse(). <!-- when use the BooleanOutputParser in the chain_filter.py, the LLMs output 'yes\|no'，the function 'parse' will throw ValueError。 --> Fixes # (issue) #5396 https://github.com/hwchase17/langchain/issues/5396 --------- Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>	2023-05-30 16:26:17 -07:00
Natalie	199cc700a3	Ability to specify credentials wihen using Google BigQuery as a data loader (#5466 ) # Adds ability to specify credentials when using Google BigQuery as a data loader Fixes #5465 . Adds ability to set credentials which must be of the `google.auth.credentials.Credentials` type. This argument is optional and will default to `None. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:25:22 -07:00
Harrison Chase	eab4b4ccd7	add simple test for imports (#5461 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:24:27 -07:00
Janos Tolgyesi	1111f18eb4	Add maximal relevance search to SKLearnVectorStore (#5430 ) # Add maximal relevance search to SKLearnVectorStore This PR implements the maximum relevance search in SKLearnVectorStore. Twitter handle: jtolgyesi (I submitted also the original implementation of SKLearnVectorStore) ## Before submitting Unit tests are included. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:13:33 -07:00
Ayan Bandyopadhyay	8181f9e362	Update psychicapi version (#5471 ) Update [psychicapi](https://pypi.org/project/psychicapi/) python package dependency to the latest version 0.5. The newest python package version addresses breaking changes in the Psychic http api.	2023-05-30 15:55:22 -07:00
Kacper Łukawski	f93d256190	Feat: Add batching to Qdrant (#5443 ) # Add batching to Qdrant Several people requested a batching mechanism while uploading data to Qdrant. It is important, as there are some limits for the maximum size of the request payload, and without batching implemented in Langchain, users need to implement it on their own. This PR exposes a new optional `batch_size` parameter, so all the documents/texts are loaded in batches of the expected size (64, by default). The integration tests of Qdrant are extended to cover two cases: 1. Documents are sent in separate batches. 2. All the documents are sent in a single request.	2023-05-30 15:33:54 -07:00
Camille Van Hoffelen	80e133f16d	Added async _acall to FakeListLLM (#5439 ) # Added Async _acall to FakeListLLM FakeListLLM is handy when unit testing apps built with langchain. This allows the use of FakeListLLM inside concurrent code with [asyncio](https://docs.python.org/3/library/asyncio.html). I also changed the pydocstring which was out of date. ## Who can review? @hwchase17 - project lead @agola11 - async	2023-05-30 14:34:36 -07:00
Leonid Ganeline	1f11f80641	docs: cleaning (#5413 ) # docs cleaning Changed docs to consistent format (probably, we need an official doc integration template): - ClearML - added product descriptions; changed title/headers - Rebuff - added product descriptions; changed title/headers - WhyLabs - added product descriptions; changed title/headers - Docugami - changed title/headers/structure - Airbyte - fixed title - Wolfram Alpha - added descriptions, fixed title - OpenWeatherMap - - added product descriptions; changed title/headers - Unstructured - changed description ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @dev2049	2023-05-30 13:58:16 -07:00
Matt Wells	1d861dc37a	MRKL output parser no longer breaks well formed queries (#5432 ) # Handles the edge scenario in which the action input is a well formed SQL query which ends with a quoted column There may be a cleaner option here (or indeed other edge scenarios) but this seems to robustly determine if the action input is likely to be a well formed SQL query in which we don't want to arbitrarily trim off `"` characters Fixes #5423 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Agents / Tools / Toolkits - @vowelparrot	2023-05-30 15:58:47 -04:00
Yoann Poupart	c1807d8408	`encoding_kwargs` for InstructEmbeddings (#5450 ) # What does this PR do? Bring support of `encode_kwargs` for ` HuggingFaceInstructEmbeddings`, change the docstring example and add a test to illustrate with `normalize_embeddings`. Fixes #3605 (Similar to #3914) Use case: ```python from langchain.embeddings import HuggingFaceInstructEmbeddings model_name = "hkunlp/instructor-large" model_kwargs = {'device': 'cpu'} encode_kwargs = {'normalize_embeddings': True} hf = HuggingFaceInstructEmbeddings( model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs ) ```	2023-05-30 11:57:04 -07:00
Patrick Keane	e09afb4b44	Removes duplicated call from langchain/client/langchain.py (#5449 ) This removes duplicate code presumably introduced by a cut-and-paste error, spotted while reviewing the code in ```langchain/client/langchain.py```. The original code had back to back occurrences of the following code block: ``` response = self._get( path, params=params, ) raise_for_status_with_text(response) ```	2023-05-30 11:52:46 -07:00
Jan Brinkmann	0d3a9d481f	Fixed docstring in faiss.py for load_local (#5440 ) # Fix for docstring in faiss.py vectorstore (load_local) The doctring should reflect that load_local loads something FROM the disk.	2023-05-30 11:41:00 -07:00
Davis Chase	4379bd4cbb	bump 186 (#5459 )	2023-05-30 10:47:59 -07:00
Davis Chase	2649b638dd	fix (#5457 )	2023-05-30 10:42:20 -07:00
Davis Chase	64b4165c8d	bump 185 (#5442 )	2023-05-30 08:08:11 -07:00
ByronHsu	9d658aaa5a	Add more code splitters (go, rst, js, java, cpp, scala, ruby, php, swift, rust) (#5171 ) As the title says, I added more code splitters. The implementation is trivial, so i don't add separate tests for each splitter. Let me know if any concerns. Fixes # (issue) https://github.com/hwchase17/langchain/issues/5170 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @hwchase17 --------- Signed-off-by: byhsu <byhsu@linkedin.com> Co-authored-by: byhsu <byhsu@linkedin.com>	2023-05-30 11:04:05 -04:00
Paul-Emile Brotons	a61b7f7e7c	adding MongoDBAtlasVectorSearch (#5338 ) # Add MongoDBAtlasVectorSearch for the python library Fixes #5337 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 07:59:01 -07:00
Harrison Chase	c4b502a470	Harrison/condense q llm (#5438 )	2023-05-30 07:15:37 -07:00
Lei Xu	ee57054d05	Rename and fix typo in lancedb (#5425 ) # Fix typo in LanceDB notebook filename	2023-05-30 00:24:17 -07:00
Zander Chase	26ff18575c	Set old LCTracer to default to port 8000 (#5381 ) Issue from: https://discord.com/channels/1038097195422978059/1069478035918688346/1112445980466483222	2023-05-29 22:42:53 -07:00
Harrison Chase	760632b292	Harrison/spark reader (#5405 ) Co-authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:23:17 -07:00
UmerHA	8259f9b7fa	DocumentLoader for GitHub (#5408 ) # Creates GitHubLoader (#5257) GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub. Fixes #5257 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:11:21 -07:00
German Martin	0b3e0dd1d2	New Trello document loader (#4767 ) # Added New Trello loader class and documentation Simple Loader on top of py-trello wrapper. With a board name you can pull cards and to do some field parameter tweaks on load operation. I included documentation and examples. Included unit test cases using patch and a fixture for py-trello client class. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 19:47:56 -07:00
Harrison Chase	72f99ff953	Harrison/text splitter (#5417 ) adds support for keeping separators around when using recursive text splitter	2023-05-29 16:56:31 -07:00
小铭	cf5803e44c	Add ToolException that a tool can throw. (#5050 ) # Add ToolException that a tool can throw This is an optional exception that tool throws when execution error occurs. When this exception is thrown, the agent will not stop working,but will handle the exception according to the handle_tool_error variable of the tool,and the processing result will be returned to the agent as observation,and printed in pink on the console.It can be used like this: ```python from langchain.schema import ToolException from langchain import LLMMathChain, SerpAPIWrapper, OpenAI from langchain.agents import AgentType, initialize_agent from langchain.chat_models import ChatOpenAI from langchain.tools import BaseTool, StructuredTool, Tool, tool from langchain.chat_models import ChatOpenAI llm = ChatOpenAI(temperature=0) llm_math_chain = LLMMathChain(llm=llm, verbose=True) class Error_tool: def run(self, s: str): raise ToolException('The current search tool is not available.') def handle_tool_error(error) -> str: return "The following errors occurred during tool execution:"+str(error) search_tool1 = Error_tool() search_tool2 = SerpAPIWrapper() tools = [ Tool.from_function( func=search_tool1.run, name="Search_tool1", description="useful for when you need to answer questions about current events.You should give priority to using it.", handle_tool_error=handle_tool_error, ), Tool.from_function( func=search_tool2.run, name="Search_tool2", description="useful for when you need to answer questions about current events", return_direct=True, ) ] agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_tool_errors=handle_tool_error) agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?") ``` ![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada) ## Who can review? - @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:05:58 +00:00
Harrison Chase	cce731c3c2	bump version 184 (#5407 )	2023-05-29 07:53:32 -07:00
Harrison Chase	2da8c48be1	Harrison/datetime parser (#4693 ) Co-authored-by: Jacob Valdez <jacobfv@msn.com> Co-authored-by: Jacob Valdez <jacob.valdez@limboid.ai> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-05-29 07:52:30 -07:00
Leonid Ganeline	1837caa70d	docs: `ecosystem/integrations` update 1 (#5219 ) # docs: ecosystem/integrations update It is the first in a series of `ecosystem/integrations` updates. The ecosystem/integrations list is missing many integrations. I'm adding the missing integrations in a consistent format: 1. description of the integrated system 2. `Installation and Setup` section with 'pip install ...`, Key setup, and other necessary settings 3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with links to correspondent examples and imports of the used classes. This PR keeps new docs, that are presented in the `docs/modules/models/text_embedding/examples` but missed in the `ecosystem/integrations`. The next PRs will cover the next example sections. Also updated `integrations.rst`: added the `Dependencies` section with a link to the packages used in LangChain. ## Who can review? @hwchase17 @eyurtsev @dev2049	2023-05-29 07:25:17 -07:00
Leonid Ganeline	a3598193a0	docs: `ecosystem/integrations` update 2 (#5282 ) # docs: ecosystem/integrations update 2 #5219 - part 1 The second part of this update (parts are independent of each other! no overlap): - added diffbot.md - updated confluence.ipynb; added confluence.md - updated college_confidential.md - updated openai.md - added blackboard.md - added bilibili.md - added azure_blob_storage.md - added azlyrics.md - added aws_s3.md ## Who can review? @hwchase17@agola11 @agola11 @vowelparrot @dev2049	2023-05-29 07:19:43 -07:00
Eduard van Valkenburg	ccb6238de1	Implemented appending arbitrary messages (#5293 ) # Implemented appending arbitrary messages to the base chat message history, the in-memory and cosmos ones. <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> As discussed this is the alternative way instead of #4480, with a add_message method added that takes a BaseMessage as input, so that the user can control what is in the base message like kwargs. <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-29 07:18:59 -07:00
Harrison Chase	d6fb25c439	Harrison/prediction guard update (#5404 ) Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>	2023-05-29 07:14:59 -07:00
Harrison Chase	416c8b1da3	Harrison/deep infra (#5403 ) Co-authored-by: Yessen Kanapin <yessenzhar@gmail.com> Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-05-29 07:10:50 -07:00
Timothy Ji	100d6655df	Reformat openai proxy setting as code (#5330 ) # Reformat the openai proxy setting as code Only affect the doc for openai Model - @hwchase17 - @agola11	2023-05-29 07:02:47 -07:00
Justin Flick	c09f8e4ddc	Add pagination for Vertex AI embeddings (#5325 ) Fixes #5316 --------- Co-authored-by: Justin Flick <jflick@homesite.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-29 06:57:41 -07:00
Harrison Chase	3e16468423	Harrison/llamacpp (#5402 ) Co-authored-by: Gavin S <gavinswanson@gmail.com>	2023-05-29 06:44:58 -07:00
Chandan Routray	642ae83d86	Removed deprecated llm attribute for load_chain (#5343 ) # Removed deprecated llm attribute for load_chain Currently `load_chain` for some chain types expect `llm` attribute to be present but `llm` is deprecated attribute for those chains and might not be persisted during their `chain.save`. Fixes #5224 [(issue)](https://github.com/hwchase17/langchain/issues/5224) ## Who can review? @hwchase17 @dev2049 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-05-29 06:44:47 -07:00
Oleh Kuznetsov	f6615cac41	Update llamacpp demonstration notebook (#5344 ) # Update llamacpp demonstration notebook Add instructions to install with BLAS backend, and update the example of model usage. Fixes #5071. However, it is more like a prevention of similar issues in the future, not a fix, since there was no problem in the framework functionality ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11	2023-05-29 06:43:26 -07:00
Martin Holecek	44b48d9518	Fix update_document function, add test and documentation. (#5359 ) # Fix for `update_document` Function in Chroma ## Summary This pull request addresses an issue with the `update_document` function in the Chroma class, as described in [#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947). The issue was identified as an `AttributeError` raised when calling `update_document` due to a missing corresponding method in the `Collection` object. This fix refactors the `update_document` method in `Chroma` to correctly interact with the `Collection` object. ## Changes 1. Fixed the `update_document` method in the `Chroma` class to correctly call methods on the `Collection` object. 2. Added the corresponding test `test_chroma_update_document` in `tests/integration_tests/vectorstores/test_chroma.py` to reflect the updated method call. 3. Added an example and explanation of how to use the `update_document` function in the Jupyter notebook tutorial for Chroma. ## Test Plan All existing tests pass after this change. In addition, the `test_chroma_update_document` test case now correctly checks the functionality of `update_document`, ensuring that the function works as expected and updates the content of documents correctly. ## Reviewers @dev2049 This fix will ensure that users are able to use the `update_document` function as expected, without encountering the previous `AttributeError`. This will enhance the usability and reliability of the Chroma class for all users. Thank you for considering this pull request. I look forward to your feedback and suggestions.	2023-05-29 06:39:25 -07:00
Louis Amaudruz	e455ba4ed5	Add async support to routing chains (#5373 ) # Add async support for (LLM) routing chains <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Add asynchronous LLM calls support for the routing chains. More specifically: - Add async `aroute` function (i.e. async version of `route`) to the `RouterChain` which calls the routing LLM asynchronously - Implement the async `_acall` for the `LLMRouterChain` - Implement the async `_acall` function for `MultiRouteChain` which first calls asynchronously the routing chain with its new `aroute` function, and then calls asynchronously the relevant destination chain. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? - @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Async - @agola11 -->	2023-05-29 06:37:26 -07:00
Gael Grosch	8b7721ebbb	fix: Blob.from_data mimetype is lost (#5395 ) # Fix lost mimetype when using Blob.from_data method The mimetype is lost due to a typo in the class attribue name Fixes # - (no issue opened but I can open one if needed) ## Changes * Fixed typo in name * Added unit-tests to validate the output Blob ## Review @eyurtsev	2023-05-29 06:36:50 -07:00
Jacob Lee	f77f27163d	Update PR template with Twitter handle request (#5382 ) # Updates PR template to request Twitter handle for shoutouts! Makes it easier for maintainers to show their appreciation 😄	2023-05-29 06:23:17 -07:00
Zander Chase	14099f1b93	Use Default Factory (#5380 ) We shouldn't be calling a constructor for a default value - should use default_factory instead. This is especially ad in this case since it requires an optional dependency and an API key to be set. Resolves #5361	2023-05-29 06:22:35 -07:00
Harrison Chase	6df90ad9fd	handle json parsing errors (#5371 ) adds tests cases, consolidates a lot of PRs	2023-05-29 06:18:19 -07:00
玄猫	99a1e3f3a3	Fix: Handle empty documents in ContextualCompressionRetriever (Issue #5304 ) (#5306 ) # Fix: Handle empty documents in ContextualCompressionRetriever (Issue #5304) Fixes #5304 Prevent cohere.error.CohereAPIError caused by an empty list of documents by adding a condition to check if the input documents list is empty in the compress_documents method. If the list is empty, return an empty list immediately, avoiding the error and unnecessary processing. @dev2049 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-28 13:19:34 -07:00
os1ma	1366d070fc	Add path validation to DirectoryLoader (#5327 ) # Add path validation to DirectoryLoader This PR introduces a minor adjustment to the DirectoryLoader by adding validation for the path argument. Previously, if the provided path didn't exist or wasn't a directory, DirectoryLoader would return an empty document list due to the behavior of the `glob` method. This could potentially cause confusion for users, as they might expect a file-loading error instead. So, I've added two validations to the load method of the DirectoryLoader: - Raise a FileNotFoundError if the provided path does not exist - Raise a ValueError if the provided path is not a directory Due to the relatively small scope of these changes, a new issue was not created. ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev	2023-05-28 15:31:23 -04:00
Harrison Chase	ad7f4c0317	bump to 183 (#5372 )	2023-05-28 11:42:58 -07:00
Harrison Chase	b6927970f1	revert bad json (#5370 )	2023-05-28 10:22:02 -07:00
Matt Wells	9a5c9df809	Fixes iter error in FAISS add_embeddings call (#5367 ) # Remove re-use of iter within add_embeddings causing error As reported in https://github.com/hwchase17/langchain/issues/5336 there is an issue currently involving the atempted re-use of an iterator within the FAISS vectorstore adapter Fixes # https://github.com/hwchase17/langchain/issues/5336 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-05-28 09:59:30 -07:00
Davis Chase	b705f260f4	bump 182 (#5364 )	2023-05-28 09:16:18 -07:00
Janos Tolgyesi	5f4552391f	Add SKLearnVectorStore (#5305 ) # Add SKLearnVectorStore This PR adds SKLearnVectorStore, a simply vector store based on NearestNeighbors implementations in the scikit-learn package. This provides a simple drop-in vector store implementation with minimal dependencies (scikit-learn is typically installed in a data scientist / ml engineer environment). The vector store can be persisted and loaded from json, bson and parquet format. SKLearnVectorStore has soft (dynamic) dependency on the scikit-learn, numpy and pandas packages. Persisting to bson requires the bson package, persisting to parquet requires the pyarrow package. ## Before submitting Integration tests are provided under `tests/integration_tests/vectorstores/test_sklearn.py` Sample usage notebook is provided under `docs/modules/indexes/vectorstores/examples/sklear.ipynb` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-28 08:17:42 -07:00
Aymen Furter	e2742953a6	feat: support for shopping search in SerpApi (#5259 ) # Support for shopping search in SerpApi ## Who can review? @vowelparrot	2023-05-27 21:20:24 -07:00
Eduard van Valkenburg	1daa7068b2	added cosmos kwargs option (#5292 ) # Added the ability to pass kwargs to cosmos client constructor The cosmos client has a ton of options that can be set, so allowing those to be passed to the constructor from the chat memory constructor with this PR.	2023-05-27 21:19:40 -07:00
Kenton	881dfe8179	Sample Notebook for DynamoDB Chat Message History (#5351 ) # Sample Notebook for DynamoDB Chat Message History @dev2049 Adding a sample notebook for the DynamoDB Chat Message History class. <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-27 21:16:24 -07:00
mbchang	f079cdf479	fix: remove empty lines that cause InvalidRequestError (#5320 ) # remove empty lines in GenerativeAgentMemory that cause InvalidRequestError in OpenAIEmbeddings <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Let's say the text given to `GenerativeAgent._parse_list` is ``` text = """ Insight 1: <insight 1> Insight 2: <insight 2> """ ``` This creates an `openai.error.InvalidRequestError: [''] is not valid under any of the given schemas - 'input'` because `GenerativeAgent.add_memory()` tries to add an empty string to the vectorstore. This PR fixes the issue by removing the empty line between `Insight 1` and `Insight 2` ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @hwchase17 @vowelparrot @dev2049	2023-05-27 21:15:03 -07:00
Deepak S V	c6e5d90eff	Fixing blank thoughts in verbose for "_Exception" Action (#5331 ) Fixed the issue of blank Thoughts being printed in verbose when `handle_parsing_errors=True`, as below: Before Fix: ``` Observation: There are 38175 accounts available in the dataframe. Thought: Observation: Invalid or incomplete response Thought: Observation: Invalid or incomplete response Thought: ``` After Fix: ``` Observation: There are 38175 accounts available in the dataframe. Thought:AI: { "action": "Final Answer", "action_input": "There are 38175 accounts available in the dataframe." } Observation: Invalid Action or Action Input format Thought:AI: { "action": "Final Answer", "action_input": "The number of available accounts is 38175." } Observation: Invalid Action or Action Input format ``` @vowelparrot currently I have set the colour of thought to green (same as the colour when `handle_parsing_errors=False`). If you want to change the colour of this "_Exception" case to red or something else (when `handle_parsing_errors=True`), feel free to change it in line 789.	2023-05-27 21:14:16 -07:00
DanConstantini	c49c6ac97a	Add Chainlit to deployment options (#5314 ) # Add Chainlit to deployment options Add [Chainlit](https://github.com/Chainlit/chainlit) as deployment options Used links to Github examples and Chainlit doc on the LangChain integration Co-authored-by: Dan Constantini <danconstantini@Dan-Constantini-MacBook.local>	2023-05-27 21:12:53 -07:00
Harrison Chase	5292e855c0	add enum output parser (#5165 )	2023-05-27 20:59:24 -07:00
Harrison Chase	179ddbe88b	add enum output parser (#5165 )	2023-05-27 20:58:23 -07:00
Leonid Ganeline	465a970724	docs: added link to LangChain Handbook (#5311 ) # added a link to LangChain Handbook ## Who can review? Community members can review the PR once tests pass.	2023-05-27 20:57:40 -07:00
Russ	6e974b5f04	Fix typos (#5323 ) # Documentation typo fixes Fixes # (issue) Simple typos in the blockchain .ipynb documentation	2023-05-26 18:55:21 -07:00
Michael Landis	f75f0dbad6	docs: improve flow of llm caching notebook (#5309 ) # docs: improve flow of llm caching notebook The notebook `llm_caching` demos various caching providers. In the previous version, there was setup common to all examples but under the `In Memory Caching` heading. If a user comes and only wants to try a particular example, they will run the common setup, then the cells for the specific provider they are interested in. Then they will get import and variable reference errors. This commit moves the common setup to the top to avoid this. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-26 13:34:11 -04:00
Eugene Yurtsev	0a8d6bc402	Add instructions to pyproject.toml (#5138 ) # Add instructions to pyproject.toml * Add instructions to pyproject.toml about how to handle optional dependencies. ## Before submitting ## Who can review? --------- Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>	2023-05-26 13:29:07 -04:00
Shukri	58e95cd11e	Better docs for weaviate hybrid search (#5290 ) # Better docs for weaviate hybrid search <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes: NA ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @dev2049	2023-05-26 09:30:41 -07:00
Davis Chase	641303a361	bump 181 (#5302 )	2023-05-26 08:44:19 -07:00
Leonid Kuligin	aa3c7b3271	Fixed passing creds to VertexAI LLM (#5297 ) # Fixed passing creds to VertexAI LLM Fixes #5279 It looks like we should drop a type annotation for Credentials. Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-05-26 08:31:02 -07:00
Eugene Yurtsev	a669abf16b	Update CONTRIBUTION guidelines and PR Template (#5140 ) # Update contribution guidelines and PR template This PR updates the contribution guidelines to include more information on how to handle optional dependencies. The PR template is updated to include a link to the contribution guidelines document.	2023-05-26 10:18:11 -04:00
Peng Qu	d481d887bc	Add an example to make the prompt more robust (#5291 ) # Add example to LLMMath to help with power operator Add example to LLMMath that helps the model to interpret `^` as the power operator rather than the python xor operator.	2023-05-26 09:32:35 -04:00
Xiangrui Meng	aec642febb	LLM wrapper for Databricks (#5142 ) This PR adds LLM wrapper for Databricks. It supports two endpoint types: * serving endpoint * cluster driver proxy app An integration notebook is included to show how it works. Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Gengliang Wang <gengliang@apache.org> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:19:37 -07:00
Ted Martinez	1cb6498fdb	Tedma4/twilio tool (#5136 ) # Add twilio sms tool --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:19:22 -07:00
Moonsik Kang	a0281f5acb	Fixed typo: 'ouput' to 'output' in all documentation (#5272 ) # Fixed typo: 'ouput' to 'output' in all documentation In this instance, the typo 'ouput' was amended to 'output' in all occurrences within the documentation. There are no dependencies required for this change.	2023-05-25 19:18:31 -07:00
Michael Landis	7047a2c1af	feat: add Momento as a standard cache and chat message history provider (#5221 ) # Add Momento as a standard cache and chat message history provider This PR adds Momento as a standard caching provider. Implements the interface, adds integration tests, and documentation. We also add Momento as a chat history message provider along with integration tests, and documentation. [Momento](https://www.gomomento.com/) is a fully serverless cache. Similar to S3 or DynamoDB, it requires zero configuration, infrastructure management, and is instantly available. Users sign up for free and get 50GB of data in/out for free every month. ## Before submitting ✅ We have added documentation, notebooks, and integration tests demonstrating usage. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:13:21 -07:00
Hassan Ouda	56ad56c812	Support bigquery dialect - SQL (#5261 ) # Your PR Title (What it does) Adding an if statement to deal with bigquery sql dialect. When I use bigquery dialect before, it failed while using SET search_path TO. So added a condition to set dataset as the schema parameter which is equivalent to SET search_path TO . I have tested and it works. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-25 18:19:17 -07:00
Abdelsalam ElTamawy	2ef5579eae	Added pipline args to `HuggingFacePipeline.from_model_id` (#5268 ) The current `HuggingFacePipeline.from_model_id` does not allow passing of pipeline arguments to the transformer pipeline. This PR enables adding important pipeline parameters like setting `max_new_tokens` for example. Previous to this PR it would be necessary to manually create the pipeline through huggingface transformers then handing it to langchain. For example instead of this ```py model_id = "gpt2" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained(model_id) pipe = pipeline( "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10 ) hf = HuggingFacePipeline(pipeline=pipe) ``` You can write this ```py hf = HuggingFacePipeline.from_model_id( model_id="gpt2", task="text-generation", pipeline_kwargs={"max_new_tokens": 10} ) ``` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 17:54:52 -07:00
Davis Chase	f01dfe858d	OpenAI lint (#5273 ) Causing lint issues if you have openai installed, annoying for local dev	2023-05-25 16:20:06 -07:00
Nicholas Liu	7652d2abb0	Add Multi-CSV/DF support in CSV and DataFrame Toolkits (#5009 ) Add Multi-CSV/DF support in CSV and DataFrame Toolkits * CSV and DataFrame toolkits now accept list of CSVs/DFs * Add default prompts for many dataframes in `pandas_dataframe` toolkit Fixes #1958 Potentially fixes #4423 ## Testing * Add single and multi-dataframe integration tests for `pandas_dataframe` toolkit with permutations of `include_df_in_prompt` * Add single and multi-CSV integration tests for csv toolkit --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-25 14:23:11 -07:00
Alex Rothberg	3223a97dc6	Add visible_only and strict_mode options to ClickTool (#4088 ) Partially addresses: https://github.com/hwchase17/langchain/issues/4066	2023-05-25 14:10:39 -07:00
Ravindra Marella	b3988621c5	Add C Transformers for GGML Models (#5218 ) # Add C Transformers for GGML Models I created Python bindings for the GGML models: https://github.com/marella/ctransformers Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See [Supported Models](https://github.com/marella/ctransformers#supported-models). It provides a unified interface for all models: ```python from langchain.llms import CTransformers llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2') print(llm('AI is going to')) ``` It can be used with models hosted on the Hugging Face Hub: ```py llm = CTransformers(model='marella/gpt-2-ggml') ``` It supports streaming: ```py from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()]) ``` Please see [README](https://github.com/marella/ctransformers#readme) for more details. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 13:42:44 -07:00
Davis Chase	ca88b25da6	Zep sdk version (#5267 ) zep-python's sync methods no longer need an asyncio wrapper. This was causing issues with FastAPI deployment. Zep also now supports putting and getting of arbitrary message metadata. Bump zep-python version to v0.30 Remove nest-asyncio from Zep example notebooks. Modify tests to include metadata. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-05-25 13:42:10 -07:00
Janil Wörst	5525602df0	Docs link custom agent page in getting started (#5250 ) # Docs: link custom agent page in getting started	2023-05-25 13:11:30 -07:00
Alon Diament	d3cd21ccf8	Fixed regression in JoplinLoader's get note url (#5265 ) Fixes a regression in JoplinLoader that was introduced during the code review (bad `page` wildcard in _get_note_url). ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @leo-gan	2023-05-25 13:10:10 -07:00
Davis Chase	3be9ba14f3	OpenSearch top k parameter fix (#5216 ) For most queries it's the `size` parameter that determines final number of documents to return. Since our abstractions refer to this as `k`, set this to be `k` everywhere instead of expecting a separate param. Would be great to have someone more familiar with OpenSearch validate that this is reasonable (e.g. that having `size` and what OpenSearch calls `k` be the same won't lead to any strange behavior). cc @naveentatikonda Closes #5212	2023-05-25 09:51:23 -07:00
Yves Maurer	88ed8e1cd6	Added the option of specifying a proxy for the OpenAI API (#5246 ) # Added the option of specifying a proxy for the OpenAI API Fixes #5243 Co-authored-by: Yves Maurer <>	2023-05-25 09:50:25 -07:00
mwinterde	9c0cb90997	Resolve error in StructuredOutputParser docs (#5240 ) # Resolve error in StructuredOutputParser docs Documentation for `StructuredOutputParser` currently not reproducible, that is, `output_parser.parse(output)` raises an error because the LLM returns a response with an invalid format ```python _input = prompt.format_prompt(question="what's the capital of france") output = model(_input.to_string()) output # ? # # ```json # { # "answer": "Paris", # "source": "https://www.worldatlas.com/articles/what-is-the-capital-of-france.html" # } # ``` ``` Was fixed by adding a question mark to the prompt	2023-05-25 07:47:25 -07:00
Peng Qu	c7e2151a4b	remove extra "\n" to ensure that the format of the description, examp… (#5232 ) remove extra "\n" to ensure that the format of the description, example, and prompt&generation are completely consistent.	2023-05-25 07:46:39 -07:00
Davis Chase	15b17f9334	bump 180 (#5248 )	2023-05-25 07:09:50 -07:00
mwinterde	9e57be4b5c	Fix typo in docstring of RetryWithErrorOutputParser (#5244 )	2023-05-25 09:59:31 -04:00
Shukri	09e246f306	Weaviate: Add QnA with sources example (#5247 ) # Add QnA with sources example <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes: see https://stackoverflow.com/questions/76207160/langchain-doesnt-work-with-weaviate-vector-database-getting-valueerror/76210017#76210017 ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @dev2049	2023-05-25 09:58:33 -04:00
Archon	5cdd9ab7e1	Add MiniMax embeddings (#5174 ) - Add support for MiniMax embeddings Doc: [MiniMax embeddings](https://api.minimax.chat/document/guides/embeddings?id=6464722084cdc277dfaa966a) --------- Co-authored-by: Archon <archongum@outlook.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 06:57:49 -07:00
Eugene Yurtsev	5cfa72a130	Bibtex integration for document loader and retriever (#5137 ) # Bibtex integration Wrap bibtexparser to retrieve a list of docs from a bibtex file. * Get the metadata from the bibtex entries * `page_content` get from the local pdf referenced in the `file` field of the bibtex entry using `pymupdf` * If no valid pdf file, `page_content` set to the `abstract` field of the bibtex entry * Support Zotero flavour using regex to get the file path * Added usage example in `docs/modules/indexes/document_loaders/examples/bibtex.ipynb` --------- Co-authored-by: Sébastien M. Popoff <sebastien.popoff@espci.fr> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 00:21:31 -07:00
Ati Sharma	40b086d6e8	Allow to specify ID when adding to the FAISS vectorstore. (#5190 ) # Allow to specify ID when adding to the FAISS vectorstore This change allows unique IDs to be specified when adding documents / embeddings to a faiss vectorstore. - This reflects the current approach with the chroma vectorstore. - It allows rejection of inserts on duplicate IDs - will allow deletion / update by searching on deterministic ID (such as a hash). - If not specified, a random UUID is generated (as per previous behaviour, so non-breaking). This commit fixes #5065 and #3896 and should fix #2699 indirectly. I've tested adding and merging. Kindly tagging @Xmaster6y @dev2049 for review. --------- Co-authored-by: Ati Sharma <ati@agalmic.ltd> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-24 22:26:46 -07:00
Nicholas Liu	f0ea093de8	Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue #5104 ) (#5220 ) # Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue #5104) Fixes #5104 If the previous behavior of loading files that used to live in the folder, but are now trashed, you can use the `load_trashed_files` parameter: ``` loader = GoogleDriveLoader( folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5", recursive=False, load_trashed_files=True ) ``` As not loading trashed files should be expected behavior, should we 1. even provide the `load_trashed_files` parameter? 2. add documentation? Feels most users will stick with default behavior ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: DataLoaders - @eyurtsev Twitter: [@nicholasliu77](https://twitter.com/nicholasliu77)	2023-05-24 22:26:17 -07:00
Keno	eff31a3361	Remove API key from docs (#5223 ) I found an API key for `serpapi_api_key` while reading the docs. It seems to have been modified very recently. Removed it in this PR @hwchase17 - project lead	2023-05-24 22:25:39 -07:00
maspotts	95c9aa1ccb	Create async copy of from_text() inside GraphIndexCreator. (#5214 ) Copies `GraphIndexCreator.from_text()` to make an async version called `GraphIndexCreator.afrom_text()`. This is (should be) a trivial change: it just adds a copy of `GraphIndexCreator.from_text()` which is async and awaits a call to `chain.apredict()` instead of `chain.predict()`. There is no unit test for GraphIndexCreator, and I did not create one, but this code works for me locally. @agola11 @hwchase17	2023-05-24 21:54:12 -07:00
Leonid Ganeline	2ad29f410d	fix a mistake in concepts.md (#5222 ) # fix a mistake in concepts.md ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:	2023-05-24 21:47:22 -07:00
Harrison Chase	a775aa6389	Harrison/vertex (#5049 ) Co-authored-by: Leonid Kuligin <kuligin@google.com> Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru> Co-authored-by: sasha-gitg <44654632+sasha-gitg@users.noreply.github.com> Co-authored-by: Justin Flick <Justinjayflick@gmail.com> Co-authored-by: Justin Flick <jflick@homesite.com>	2023-05-24 15:51:12 -07:00
Zander Chase	e6c4571191	Add 'status' command to get server status (#5197 ) Example: ``` $ langchain plus start --expose ... $ langchain plus status The LangChainPlus server is currently running. Service Status Published Ports langchain-backend Up 40 seconds 1984 langchain-db Up 41 seconds 5433 langchain-frontend Up 40 seconds 80 ngrok Up 41 seconds 4040 To connect, set the following environment variables in your LangChain application: LANGCHAIN_TRACING_V2=true LANGCHAIN_ENDPOINT=https://5cef-70-23-89-158.ngrok.io $ langchain plus stop $ langchain plus status The LangChainPlus server is not running. $ langchain plus start The LangChainPlus server is currently running. Service Status Published Ports langchain-backend Up 5 seconds 1984 langchain-db Up 6 seconds 5433 langchain-frontend Up 5 seconds 80 To connect, set the following environment variables in your LangChain application: LANGCHAIN_TRACING_V2=true LANGCHAIN_ENDPOINT=http://localhost:1984 ```	2023-05-24 21:43:16 +00:00
Zander Chase	e76e68b211	Add Delete Session Method (#5193 )	2023-05-24 21:06:03 +00:00
Zander Chase	66113c2a62	Log warning (#5192 ) Changes debug log to warning log when LC Tracer fails to instantiate	2023-05-24 21:05:13 +00:00
Ankush Gola	b7fcb35a39	add option to pass openai key to langchain plus command (#5213 )	2023-05-24 21:05:03 +00:00
Davis Chase	dcee8936c1	nit (#5208 )	2023-05-24 12:52:20 -07:00
Alon Diament	44abe925df	Add Joplin document loader (#5153 ) # Add Joplin document loader [Joplin](https://joplinapp.org/) is an open source note-taking app. Joplin has a [REST API](https://joplinapp.org/api/references/rest_api/) for accessing its local database. The proposed `JoplinLoader` uses the API to retrieve all notes in the database and their metadata. Joplin needs to be installed and running locally, and an access token is required. - The PR includes an integration test. - The PR includes an example notebook. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 12:31:55 -07:00
Rodrigo Siqueira	f10be072ff	Add Iugu document loader (#5162 ) Create IUGU loader --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 11:47:01 -07:00
ByronHsu	f0730c6489	Allow readthedoc loader to pass custom html tag (#5175 ) ## Description The html structure of readthedocs can differ. Currently, the html tag is hardcoded in the reader, and unable to fit into some cases. This pr includes the following changes: 1. Replace `find_all` with `find` because we just want one tag. 2. Provide `custom_html_tag` to the loader. 3. Add tests for readthedoc loader 4. Refactor code ## Issues See more in https://github.com/hwchase17/langchain/pull/2609. The problem was not completely fixed in that pr. --------- Signed-off-by: byhsu <byhsu@linkedin.com> Co-authored-by: byhsu <byhsu@linkedin.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:40:27 -07:00
Alexander Dibrov	d8eed6018f	Output parsing variation allowance (#5178 ) # Output parsing variation allowance for self-ask with search This change makes self-ask with search easier for Llama models to follow, as they tend toward returning 'Followup:' instead of 'Follow up:' despite an otherwise valid remaining output. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:39:09 -07:00
Matt Wells	c173bf1c62	Fixes scope of query Session in PGVector (#5194 ) `vectorstore.PGVector`: The transactional boundary should be increased to cover the query itself Currently, within the `similarity_search_with_score_by_vector` the transactional boundary (created via the `Session` call) does not include the select query being made. This can result in un-intended consequences when interacting with the PGVector instance methods directly --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:37:45 -07:00
Tommaso De Lorenzo	52714cedd4	fixing total cost finetuned model giving zero (#5144 ) # OpanAI finetuned model giving zero tokens cost Very simple fix to the previously committed solution to allowing finetuned Openai models. Improves #5127 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:04:08 -07:00
Harrison Chase	94cf391ef1	standardize json parsing (#5168 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:03:53 -07:00
Davis Chase	2b2176a3c1	tfidf retriever (#5114 ) Co-authored-by: vempaliakhil96 <vempaliakhil96@gmail.com>	2023-05-24 10:02:09 -07:00
Shukri	b00c77dc62	Improve weaviate vectorstore docs (#5201 ) # Improve weaviate vectorstore docs	2023-05-24 09:31:48 -07:00
Tomaz Bratanic	fd866d1801	Update Cypher QA prompt (#5173 ) # Improve Cypher QA prompt The current QA prompt is optimized for networkX answer generation, which returns all the possible triples. However, Cypher search is a bit more focused and doesn't necessary return all the context information. Due to that reason, the model sometimes refuses to generate an answer even though the information is provided: ![Screenshot from 2023-05-24 08-36-23](https://github.com/hwchase17/langchain/assets/19948365/351cf9c1-2567-447c-91fd-284ae3fa1ccf) To fix this issue, I have updated the prompt. Interestingly, I tried many variations with less instructions and they didn't work properly. However, the current fix works nicely. ![Screenshot from 2023-05-24 08-37-25](https://github.com/hwchase17/langchain/assets/19948365/fc830603-e6ec-4a23-8a86-eaf572996014)	2023-05-24 08:31:30 -07:00
Zach Schillaci	aa14e223ee	Reuse `length_func` in `MapReduceDocumentsChain` (#5181 ) # Reuse `length_func` in `MapReduceDocumentsChain` Pretty straightforward refactor in `MapReduceDocumentsChain`. Reusing the local variable `length_func`, instead of the longer alternative `self.combine_document_chain.prompt_length`. @hwchase17	2023-05-24 08:28:37 -07:00
Harrison Chase	11c26ebb55	Harrison/modelscope (#5156 ) Co-authored-by: thomas-yanxin <yx20001210@163.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 08:06:45 -07:00
Davis Chase	2d5588c5f0	bump 179 (#5200 )	2023-05-24 07:55:27 -07:00
Saba Sturua	47e4ee4370	adjust docarray docstrings (#5185 ) Follow up of https://github.com/hwchase17/langchain/pull/5015 Thanks for catching this! Just a small PR to adjust couple of strings to these changes Signed-off-by: jupyterjazz <saba.sturua@jina.ai>	2023-05-24 07:50:35 -07:00
Jeff Vestal	cf19a2a59f	example usage (#5182 ) Adding example usage for elasticsearch knn embeddings [per](https://github.com/hwchase17/langchain/pull/3401#issuecomment-1548518389) https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/elasticsearch.py	2023-05-24 07:47:15 -07:00
Ikko Eltociear Ashimine	fff21a0b35	Update rellm_experimental.ipynb (#5189 ) # Your PR Title (What it does) HuggingFace -> Hugging Face	2023-05-24 11:41:00 +00:00
Nolan Tremelling	faa26650c9	Beam (#4996 ) # Beam Calls the Beam API wrapper to deploy and make subsequent calls to an instance of the gpt2 LLM in a cloud deployment. Requires installation of the Beam library and registration of Beam Client ID and Client Secret. Additional calls can then be made through the instance of the large language model in your code or by calling the Beam API. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 01:25:18 -07:00
Ofer Mendelevitch	c81fb88035	Vectara (#5069 ) # Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 01:24:58 -07:00
Jason Bosco	9c4b43b494	Add Typesense vector store (#1674 ) Closes #931. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 23:20:45 -07:00
Leonid Ganeline	33929489b9	docs: added missed `document_loaders` examples (#5150 ) # DOCS added missed document_loader examples Added missed examples: `JSON`, `Open Document Format (ODT)`, `Wikipedia`, `tomarkdown`. Updated them to a consistent format. ## Who can review? @hwchase17 @dev2049	2023-05-23 21:56:41 -07:00
Daniel Quinteros	c111134a55	Clarification of the reference to the "get_text_legth" function in ge… (#5154 ) # Clarification of the reference to the "get_text_legth" function in getting_started.md Reference to the function "get_text_legth" in the documentation did not make sense. Comment added for clarification. @hwchase17	2023-05-23 20:43:38 -07:00
Daniel Quinteros	de4ef24f75	Docs: updated getting_started.md (#5151 ) # Docs: updated getting_started.md Just accommodating some unnecessary spaces in the example of "pass few shot examples to a prompt template". @vowelparrot	2023-05-23 20:43:26 -07:00
mbchang	b1b7f3541c	fix: fix current_time=Now bug for aadd_documents in TimeWeightedRetriever (#5155 ) # Same as PR #5045, but for async <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes #4825 I had forgotten to update the asynchronous counterpart `aadd_documents` with the bug fix from PR #5045, so this PR also fixes `aadd_documents` too. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-23 20:31:45 -07:00
Jeremiah Lowin	925dd3e59e	Add async versions of predict() and predict_messages() (#4867 ) # Add async versions of predict() and predict_messages() #4615 introduced a unifying interface for "base" and "chat" LLM models via the new `predict()` and `predict_messages()` methods that allow both types of models to operate on string and message-based inputs, respectively. This PR adds async versions of the same (`apredict()` and `apredict_messages()`) that are identical except for their use of `agenerate()` in place of `generate()`, which means they repurpose all existing work on the async backend. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 (follows his work on #4615) @agola11 (async) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-23 17:22:49 -07:00
Junlin Zhou	9242998db1	Empty check before pop (#4929 ) # Check whether 'other' is empty before popping This PR could fix a potential 'popping empty set' error. Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>	2023-05-23 16:46:50 -07:00
Daniel King	de6e6c764e	Add MosaicML inference endpoints (#4607 ) # Add MosaicML inference endpoints This PR adds support in langchain for MosaicML inference endpoints. We both serve a select few open source models, and allow customers to deploy their own models using our inference service. Docs are here (https://docs.mosaicml.com/en/latest/inference.html), and sign up form is here (https://forms.mosaicml.com/demo?utm_source=langchain). I'm not intimately familiar with the details of langchain, or the contribution process, so please let me know if there is anything that needs fixing or this is the wrong way to submit a new integration, thanks! I'm also not sure what the procedure is for integration tests. I have tested locally with my api key. ## Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-23 15:59:08 -07:00
Adheeban Manoharan	68f0d45485	Adding Weather Loader (#5056 ) Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 15:57:33 -07:00
Jeff Vestal	0b542a9706	Add ElasticsearchEmbeddings class for generating embeddings using Elasticsearch models (#3401 ) This PR introduces a new module, `elasticsearch_embeddings.py`, which provides a wrapper around Elasticsearch embedding models. The new ElasticsearchEmbeddings class allows users to generate embeddings for documents and query texts using a [model deployed in an Elasticsearch cluster](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding). ### Main features: 1. The ElasticsearchEmbeddings class initializes with an Elasticsearch connection object and a model_id, providing an interface to interact with the Elasticsearch ML client through [infer_trained_model](https://elasticsearch-py.readthedocs.io/en/v8.7.0/api.html?highlight=trained%20model%20infer#elasticsearch.client.MlClient.infer_trained_model) . 2. The `embed_documents()` method generates embeddings for a list of documents, and the `embed_query()` method generates an embedding for a single query text. 3. The class supports custom input text field names in case the deployed model expects a different field name than the default `text_field`. 4. The implementation is compatible with any model deployed in Elasticsearch that generates embeddings as output. ### Benefits: 1. Simplifies the process of generating embeddings using Elasticsearch models. 2. Provides a clean and intuitive interface to interact with the Elasticsearch ML client. 3. Allows users to easily integrate Elasticsearch-generated embeddings. Related issue https://github.com/hwchase17/langchain/issues/3400 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 14:50:33 -07:00
Theodore Rolle	754b5133e9	Improve PlanningOutputParser whitespace handling (#5143 ) Some LLM's will produce numbered lists with leading whitespace, i.e. in response to "What is the sum of 2 and 3?": ``` Plan: 1. Add 2 and 3. 2. Given the above steps taken, please respond to the users original question. ``` This commit updates the PlanningOutputParser regex to ignore leading whitespace before the step number, enabling it to correctly parse this format.	2023-05-23 12:47:26 -07:00
Tommaso De Lorenzo	5002f3ae35	solving #2887 (#5127 ) # Allowing openAI fine-tuned models Very simple fix that checks whether a openAI `model_name` is a fine-tuned model when loading `context_size` and when computing call's cost in the `openai_callback`. Fixes #2887 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 11:18:03 -07:00
Myeongseop Kim	7a75bb2121	docs: fix minor typo + add wikipedia package installation part in human_input_llm.ipynb (#5118 ) # Fix typo + add wikipedia package installation part in human_input_llm.ipynb This PR 1. Fixes typo ("the the human input LLM"), 2. Addes wikipedia package installation part (in accordance with `WikipediaQueryRun` [documentation](https://python.langchain.com/en/latest/modules/agents/tools/examples/wikipedia.html)) in `human_input_llm.ipynb` (`docs/modules/models/llms/examples/human_input_llm.ipynb`)	2023-05-23 10:59:30 -07:00
Davis Chase	753f4cfc26	bump 178 (#5130 )	2023-05-23 07:43:56 -07:00
Ayan Bandyopadhyay	5c87dbf5a8	Add link to Psychic from document loaders documentation page (#5115 ) # Add link to Psychic from document loaders documentation page In my previous PR I forgot to update `document_loaders.rst` to link to `psychic.ipynb` to make it discoverable from the main documentation.	2023-05-23 06:47:23 -07:00
Tian Wei	d7f807b71f	Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API (#5012 ) # Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API: achieve some multimodal capabilities This PR adds a toolkit named AzureCognitiveServicesToolkit which bundles the following tools: - AzureCogsImageAnalysisTool: calls Azure Cognitive Services image analysis API to extract caption, objects, tags, and text from images. - AzureCogsFormRecognizerTool: calls Azure Cognitive Services form recognizer API to extract text, tables, and key-value pairs from documents. - AzureCogsSpeech2TextTool: calls Azure Cognitive Services speech to text API to transcribe speech to text. - AzureCogsText2SpeechTool: calls Azure Cognitive Services text to speech API to synthesize text to speech. This toolkit can be used to process image, document, and audio inputs. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 06:45:48 -07:00
Jamie Broomall	d4fd589638	WhyLabs callback (#4906 ) # Add a WhyLabs callback handler * Adds a simple WhyLabsCallbackHandler * Add required dependencies as optional * protect against missing modules with imports * Add docs/ecosystem basic example based on initial prototype from @andrewelizondo > this integration gathers privacy preserving telemetry on text with whylogs and sends stastical profiles to WhyLabs platform to monitoring these metrics over time. For more information on what WhyLabs is see: https://whylabs.ai After you run the notebook (if you have env variables set for the API Keys, org_id and dataset_id) you get something like this in WhyLabs: ![Screenshot (443)](https://github.com/hwchase17/langchain/assets/88007022/6bdb3e1c-4243-4ae8-b974-23a8bb12edac) Co-authored-by: Andre Elizondo <andre@whylabs.ai> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 20:29:47 -07:00
Eugene Yurtsev	d56313acba	Improve effeciency of TextSplitter.split_documents, iterate once (#5111 ) # Improve TextSplitter.split_documents, collect page_content and metadata in one iteration ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev In the case where documents is a generator that can only be iterated once making this change is a huge help. Otherwise a silent issue happens where metadata is empty for all documents when documents is a generator. So we expand the argument from `List[Document]` to `Union[Iterable[Document], Sequence[Document]]` --------- Co-authored-by: Steven Tartakovsky <tartakovsky.developer@gmail.com>	2023-05-22 23:00:24 -04:00
Jettro Coenradie	b950022894	Fixes issue #5072 - adds additional support to Weaviate (#5085 ) Implementation is similar to search_distance and where_filter # adds 'additional' support to Weaviate queries Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 18:57:10 -07:00
Zander Chase	87bba2e8d3	Pass Dataset Name by Name not Position (#5108 ) Pass dataset name by name	2023-05-23 01:21:39 +00:00
Matt Rickard	de6a401a22	Add OpenLM LLM multi-provider (#4993 ) OpenLM is a zero-dependency OpenAI-compatible LLM provider that can call different inference endpoints directly via HTTP. It implements the OpenAI Completion class so that it can be used as a drop-in replacement for the OpenAI API. This changeset utilizes BaseOpenAI for minimal added code. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 18:09:53 -07:00
Gergely Imreh	69de33e024	Add Mastodon toots loader (#5036 ) # Add Mastodon toots loader. Loader works either with public toots, or Mastodon app credentials. Toot text and user info is loaded. I've also added integration test for this new loader as it works with public data, and a notebook with example output run now. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 16:43:07 -07:00
mbchang	e173e032bc	fix: assign current_time to datetime.now() if current_time is None (#5045 ) # Assign `current_time` to `datetime.now()` if it `current_time is None` in `time_weighted_retriever` Fixes #4825 As implemented, `add_documents` in `TimeWeightedVectorStoreRetriever` assigns `doc.metadata["last_accessed_at"]` and `doc.metadata["created_at"]` to `datetime.datetime.now()` if `current_time` is not in `kwargs`. ```python def add_documents(self, documents: List[Document], kwargs: Any) -> List[str]: """Add documents to vectorstore.""" current_time = kwargs.get("current_time", datetime.datetime.now()) # Avoid mutating input documents dup_docs = [deepcopy(d) for d in documents] for i, doc in enumerate(dup_docs): if "last_accessed_at" not in doc.metadata: doc.metadata["last_accessed_at"] = current_time if "created_at" not in doc.metadata: doc.metadata["created_at"] = current_time doc.metadata["buffer_idx"] = len(self.memory_stream) + i self.memory_stream.extend(dup_docs) return self.vectorstore.add_documents(dup_docs, kwargs) ``` However, from the way `add_documents` is being called from `GenerativeAgentMemory`, `current_time` is set as a `kwarg`, but it is given a value of `None`: ```python def add_memory( self, memory_content: str, now: Optional[datetime] = None ) -> List[str]: """Add an observation or memory to the agent's memory.""" importance_score = self._score_memory_importance(memory_content) self.aggregate_importance += importance_score document = Document( page_content=memory_content, metadata={"importance": importance_score} ) result = self.memory_retriever.add_documents([document], current_time=now) ``` The default of `now` was set in #4658 to be None. The proposed fix is the following: ```python def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]: """Add documents to vectorstore.""" current_time = kwargs.get("current_time", datetime.datetime.now()) # `current_time` may exist in kwargs, but may still have the value of None. if current_time is None: current_time = datetime.datetime.now() ``` Alternatively, we could just set the default of `now` to be `datetime.datetime.now()` everywhere instead. Thoughts @hwchase17? If we still want to keep the default to be `None`, then this PR should fix the above issue. If we want to set the default to be `datetime.datetime.now()` instead, I can update this PR with that alternative fix. EDIT: seems like from #5018 it looks like we would prefer to keep the default to be `None`, in which case this PR should fix the error.	2023-05-22 15:47:03 -07:00
Leonid Ganeline	c28cc0f1ac	changed ValueError to ImportError (#5103 ) # changed ValueError to ImportError Code cleaning. Fixed inconsistencies in ImportError handling. Sometimes it raises ImportError and sometime ValueError. I've changed all cases to the `raise ImportError` Also: - added installation instruction in the error message, where it missed; - fixed several installation instructions in the error message; - fixed several error handling in regards to the ImportError	2023-05-22 15:24:45 -07:00
venetisgr	5e47c648ed	Update serpapi.py (#4947 ) Added link option in _process_response <!-- In _process_respons "snippet" provided non working links for the case that "links" had the correct answer. Thus added an elif statement before snippet --> <!-- Remove if not applicable --> Fixes # (issue) In _process_response link provided correct answers while the snippet reply provided non working links @vowelparrot ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 13:34:36 -07:00
Ankit Arya	5b2b436fab	Fixed import error for AutoGPT e.g. from langchain.experimental.auton… (#5101 ) `from langchain.experimental.autonomous_agents.autogpt.agent import AutoGPT` results in an import error as AutoGPT is not defined in the __init__.py file https://python.langchain.com/en/latest/use_cases/autonomous_agents/marathon_times.html An Alternate, way would be to be directly update the import statement to be `from langchain.experimental import AutoGPT` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 13:26:25 -07:00
Ankush Gola	467ca6f025	update langchainplus client and docker file to reflect port changes (#5005 ) # Currently, only the dev images are updated	2023-05-22 12:53:05 -07:00
Shawn91	9e649462ce	fix: add_texts method of Weaviate vector store creats wrong embeddings (#4933 ) # fix a bug in the add_texts method of Weaviate vector store that creats wrong embeddings The following is the original code in the `add_texts` method of the Weaviate vector store, from line 131 to 153, which contains a bug. The code here includes some extra explanations in the form of comments and some omissions. ```python for i, doc in enumerate(texts): # some code omitted if self._embedding is not None: # variable texts is a list of string and doc here is just a string. # list(doc) actually breaks up the string into characters. # so, embeddings[0] is just the embedding of the first character embeddings = self._embedding.embed_documents(list(doc)) batch.add_data_object( data_object=data_properties, class_name=self._index_name, uuid=_id, vector=embeddings[0], ) ``` To fix this bug, I pulled the embedding operation out of the for loop and embed all texts at once. Co-authored-by: Shawn91 <zyx199199@qq.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 12:35:52 -07:00
Eduard van Valkenburg	1cb04f2b26	PowerBI major refinement in working of tool and tweaks in the rest (#5090 ) # PowerBI major refinement in working of tool and tweaks in the rest I've gained some experience with more complex sets and the earlier implementation had too many tries by the agent to create DAX, so refactored the code to run the LLM to create dax based on a question and then immediately run the same against the dataset, with retries and a prompt that includes the error for the retry. This works much better! Also did some other refactoring of the inner workings, making things clearer, more concise and faster.	2023-05-22 11:58:28 -07:00
hwaking	e57ebf3922	add get_top_k_cosine_similarity method to get max top k score and index (#5059 ) # Row-wise cosine similarity between two equal-width matrices and return the max top_k score and index, the score all greater than threshold_score. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 11:55:48 -07:00
Donger	039f8f1abb	Add the usage of SSL certificates for Elasticsearch and user password authentication (#5058 ) Enhance the code to support SSL authentication for Elasticsearch when using the VectorStore module, as previous versions did not provide this capability. @dev2049 --------- Co-authored-by: caidong <zhucaidong1992@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 11:51:32 -07:00
Andreas Liebschner	44dc959584	Improve pinecone hybrid search retriever adding metadata support (#5098 ) # Improve pinecone hybrid search retriever adding metadata support I simply remove the hardwiring of metadata to the existing implementation allowing one to pass `metadatas` attribute to the constructors and in `get_relevant_documents`. I also add one missing pip install to the accompanying notebook (I am not adding dependencies, they were pre-existing). First contribution, just hoping to help, feel free to critique :) my twitter username is `@andreliebschner` While looking at hybrid search I noticed #3043 and #1743. I think the former can be closed as following the example right now (even prior to my improvements) works just fine, the latter I think can be also closed safely, maybe pointing out the relevant classes and example. Should I reply those issues mentioning someone? @dev2049, @hwchase17 --------- Co-authored-by: Andreas Liebschner <a.liebschner@shopfully.com>	2023-05-22 11:42:54 -07:00
Deepak S V	5cd12102be	Improving Resilience of MRKL Agent (#5014 ) This is a highly optimized update to the pull request https://github.com/hwchase17/langchain/pull/3269 Summary: 1) Added ability to MRKL agent to self solve the ValueError(f"Could not parse LLM output: `{llm_output}`") error, whenever llm (especially gpt-3.5-turbo) does not follow the format of MRKL Agent, while returning "Action:" & "Action Input:". 2) The way I am solving this error is by responding back to the llm with the messages "Invalid Format: Missing 'Action:' after 'Thought:'" & "Invalid Format: Missing 'Action Input:' after 'Action:'" whenever Action: and Action Input: are not present in the llm output respectively. For a detailed explanation, look at the previous pull request. New Updates: 1) Since @hwchase17 , requested in the previous PR to communicate the self correction (error) message, using the OutputParserException, I have added new ability to the OutputParserException class to store the observation & previous llm_output in order to communicate it to the next Agent's prompt. This is done, without breaking/modifying any of the functionality OutputParserException previously performs (i.e. OutputParserException can be used in the same way as before, without passing any observation & previous llm_output too). --------- Co-authored-by: Deepak S V <svdeepak99@users.noreply.github.com>	2023-05-22 11:08:08 -07:00
Michael Landis	6eacd88ae7	fix: revert docarray explicit transitive dependencies and use extras instead (#5015 ) tldr: The docarray [integration PR](https://github.com/hwchase17/langchain/pull/4483) introduced a pinned dependency to protobuf. This is a docarray dependency, not a langchain dependency. Since this is handled by the docarray dependencies, it is unnecessary here. Further, as a pinned dependency, this quickly leads to incompatibilities with application code that consumes the library. Much less with a heavily used library like protobuf. Detail: as we see in the [docarray integration](https://github.com/hwchase17/langchain/pull/4483/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711R81-R83), the transitive dependencies of docarray were also listed as langchain dependencies. This is unnecessary as the docarray project has an appropriate [extras](`a01a05542d/pyproject.toml (L70)`). The docarray project also does not require this _pinned_ version of protobuf, rather [a minimum version](`a01a05542d/pyproject.toml (L41)`). So this pinned version was likely in error. To fix this, this PR reverts the explicit hnswlib and protobuf dependencies and adds the hnswlib extras install for docarray (which installs hnswlib and protobuf, as originally intended). Because version `0.32.0` of the docarray hnswlib extras added protobuf, we bump the docarray dependency from `^0.31.0` to `^0.32.0`. # revert docarray explicit transitive dependencies and use extras instead ## Who can review? @dev2049 -- reviewed the original PR @eyurtsev -- bumped the pinned protobuf dependency a few days ago --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 12:48:09 -04:00
Davis Chase	fcd88bccb3	Bump 177 (#5095 )	2023-05-22 08:19:06 -07:00
Harrison Chase	10ba201d05	Harrison/neo4j (#5078 ) Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 07:31:48 -07:00
Deepak S V	49ca02711e	Improved query, print & exception handling in REPL Tool (#4997 ) Update to pull request https://github.com/hwchase17/langchain/pull/3215 Summary: 1) Improved the sanitization of query (using regex), by removing python command (since gpt-3.5-turbo sometimes assumes python console as a terminal, and runs python command first which causes error). Also sometimes 1 line python codes contain single backticks. 2) Added 7 new test cases. For more details, view the previous pull request. --------- Co-authored-by: Deepak S V <svdeepak99@users.noreply.github.com>	2023-05-22 13:43:44 +00:00
Zander Chase	785502edb3	Add 'get_token_ids' method (#4784 ) Let user inspect the token ids in addition to getting th enumber of tokens --------- Co-authored-by: Zach Schillaci <40636930+zachschillaci27@users.noreply.github.com>	2023-05-22 13:17:26 +00:00
Zander Chase	ef7d015be5	Separate Runner Functions from Client (#5079 ) Extract the methods specific to running an LLM or Chain on a dataset to separate utility functions. This simplifies the client a bit and lets us separate concerns of LCP details from running examples (e.g., for evals)	2023-05-22 05:28:47 +00:00
Leonid Ganeline	443ebe22f4	docs: `Deployments` page moved into `Ecosystem/` (#4949 ) # docs: `deployments` page moved into `ecosystem/` The `Deployments` page moved into the `Ecosystem/` group Small fixes: - `index` page: fixed order of items in the `Modules` list, in the `Use Cases` list - item `References/Installation` was lost in the `index` page (not on the Navbar!). Restored it. - added `\|` marker in several places. NOTE: I also thought about moving the `Additional Resources/Gallery` page into the `Ecosystem` group but decided to leave it unchanged. Please, advise on this. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-21 21:18:22 -07:00
Hans van Dam	a395ff7c90	preserve language in conversation retrieval (#4969 ) Without the addition of 'in its original language', the condensing response, more often than not, outputs the rephrased question in English, even when the conversation is in another language. This question in English then transfers to the question in the retrieval prompt and the chatbot is stuck in English. I'm sometimes surprised that this does not happen more often, but apparently the GPT models are smart enough to understand that when the template contains Question: .... Answer: then the answer should be in in the language of the question.	2023-05-21 21:16:03 -07:00
Matt Robinson	bf3f554357	feat: batch multiple files in a single Unstructured API request (#4525 ) ### Submit Multiple Files to the Unstructured API Enables batching multiple files into a single Unstructured API requests. Support for requests with multiple files was added to both `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. Note that if you submit multiple files in "single" mode, the result will be concatenated into a single document. We recommend using this feature in "elements" mode. ### Testing The following should load both documents, using two of the example docs from the integration tests folder. ```python from langchain.document_loaders import UnstructuredAPIFileLoader file_paths = ["examples/layout-parser-paper.pdf", "examples/whatsapp_chat.txt"] loader = UnstructuredAPIFileLoader( file_paths=file_paths, api_key="FAKE_API_KEY", strategy="fast", mode="elements", ) docs = loader.load() ```	2023-05-21 20:48:20 -07:00
Harrison Chase	0c3de0a0b3	Merge branch 'master' of github.com:hwchase17/langchain	2023-05-21 09:22:43 -07:00
Harrison Chase	224f73e978	move docs	2023-05-21 09:22:35 -07:00
Harrison Chase	6c25f860fd	bump to 176 (#5064 )	2023-05-21 09:19:25 -07:00
Harrison Chase	b0431c672b	Harrison/psychic (#5063 ) Co-authored-by: Ayan Bandyopadhyay <ayanb9440@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-21 09:13:20 -07:00
Harrison Chase	8c661baefb	change to type checking (#5062 )	2023-05-21 09:09:49 -07:00
Jeffrey Zheng	424a573266	DOC: Misspelling in agents.rst documentation (#5038 ) # Corrected Misspelling in agents.rst Documentation <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get --> In the [documentation](https://python.langchain.com/en/latest/modules/agents.html) it says "in fact, it is often best to have an Action Agent be in change of the execution for the Plan and Execute agent." Suggested Change: I propose correcting change to charge. Fix for issue: #5039	2023-05-20 22:24:08 -07:00
Gengliang Wang	f9f08c4b69	Add documentation for Databricks integration (#5013 ) # Add documentation for Databricks integration This is a follow-up of https://github.com/hwchase17/langchain/pull/4702 It documents the details of how to integrate Databricks using langchain. It also provides examples in a notebook. ## Who can review? @dev2049 @hwchase17 since you are aware of the context. We will promote the integration after this doc is ready. Thanks in advance!	2023-05-20 22:06:24 -07:00
tornikeo	a6ef20d7fe	Fix annoying typo in docs (#5029 ) # Fixes an annoying typo in docs <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes Annoying typo in docs - "Therefor" -> "Therefore". It's so annoying to read that I just had to make this PR.	2023-05-20 22:02:21 -07:00
Davis Chase	9d1280d451	bump v175 (#5041 )	2023-05-20 09:24:17 -07:00
UmerHA	7388248b3e	Streaming only final output of agent (#2483 ) (#4630 ) # Streaming only final output of agent (#2483) As requested in issue #2483, this Callback allows to stream only the final output of an agent (ie not the intermediate steps). Fixes #2483 Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-20 09:20:17 -07:00
Davis Chase	3bc0bf0079	fix prompt saving (#4987 ) will add unit tests	2023-05-20 08:21:52 -07:00
Zander Chase	27e63b977a	Add logs command (#5007 ) to the plus server	2023-05-20 00:06:17 +00:00
Marcus Winter	2aa3754024	Check for single prompt in __call__ method of the BaseLLM class (#4892 ) # Ensuring that users pass a single prompt when calling a LLM - This PR adds a check to the `__call__` method of the `BaseLLM` class to ensure that it is called with a single prompt - Raises a `ValueError` if users try to call a LLM with a list of prompt and instructs them to use the `generate` method instead ## Why this could be useful I stumbled across this by accident. I accidentally called the OpenAI LLM with a list of prompts instead of a single string and still got a result: ``` >>> from langchain.llms import OpenAI >>> llm = OpenAI() >>> llm(["Tell a joke"]2) "\n\nQ: Why don't scientists trust atoms?\nA: Because they make up everything!" ``` It might be better to catch such a scenario preventing unnecessary costs and irritation for the user. ## Proposed behaviour ``` >>> from langchain.llms import OpenAI >>> llm = OpenAI() >>> llm(["Tell a joke"]2) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/marcus/Projects/langchain/langchain/llms/base.py", line 291, in __call__ raise ValueError( ValueError: Argument `prompt` is expected to be a single string, not a list. If you want to run the LLM on multiple prompts, use `generate` instead. ```	2023-05-19 16:54:26 -07:00
domchan	6c60251f52	Add self query translator for weaviate vectorstore (#4804 ) # Add self query translator for weaviate vectorstore Adds support for the EQ comparator and the AND/OR operators. Co-authored-by: Dominic Chan <dchan@cppib.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-19 16:41:12 -07:00
Davis Chase	9928fb2193	Revert "API update: Engines -> Models (#4915 )" (#5008 ) This reverts commit `8c28ad6dac`. Seems to be causing #5001	2023-05-19 16:38:08 -07:00
SimFG	f07b9fde74	Update the GPTCache example (#4985 ) # Update the GPTCache example Fixes #4757	2023-05-19 16:35:36 -07:00
Leonid Ganeline	ddc2d4c21e	added instruction about pip install google-gerativeai (#5004 ) # added instruction about pip install google-gerativeai added instruction about pip install google-gerativeai	2023-05-19 15:32:24 -07:00
Nicolas	02632d52b3	docs: Big Mendable Improvements (#4964 ) - Higher accuracy on the responses - New redesigned UI - Pretty Sources: display the sources by title / sub-section instead of long URL. - Fixed Reset Button bugs and some other UI issues - Other tweaks	2023-05-19 15:31:48 -07:00
Leonid Ganeline	2ab0e1d526	changed ValueError to ImportError (#5006 ) # changed ValueError to ImportError in except Several places with this bug. ValueError does not catch ImportError.	2023-05-19 15:28:08 -07:00
Davis Chase	080eb1b3fc	Fix graphql tool (#4984 ) Fix construction and add unit test.	2023-05-19 15:27:50 -07:00
Mike McGarry	ddd595fe81	feature/4493 Improve Evernote Document Loader (#4577 ) # Improve Evernote Document Loader When exporting from Evernote you may export more than one note. Currently the Evernote loader concatenates the content of all notes in the export into a single document and only attaches the name of the export file as metadata on the document. This change ensures that each note is loaded as an independent document and all available metadata on the note e.g. author, title, created, updated are added as metadata on each document. It also uses an existing optional dependency of `html2text` instead of `pypandoc` to remove the need to download the pandoc application via `download_pandoc()` to be able to use the `pypandoc` python bindings. Fixes #4493 Co-authored-by: Mike McGarry <mike.mcgarry@finbourne.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-19 14:28:17 -07:00
Juanma Tristancho	729e935ea4	PGVector logger message level (#4920 ) # Change the logger message level The library is logging at `error` level a situation that is not an error. We noticed this error in our logs, but from our point of view it's an expected behavior and the log level should be `warning`.	2023-05-19 14:01:26 -07:00
Peng Wang	62d0a01a0f	Update python.py (#4971 ) # Delete a useless "print"	2023-05-19 13:57:16 -07:00
Eugene Yurtsev	0ff59569dc	Adds 'IN' metadata filter for pgvector for checking set presence (#4982 ) # Adds "IN" metadata filter for pgvector to all checking for set presence PGVector currently supports metadata filters of the form: ``` {"filter": {"key": "value"}} ``` which will return documents where the "key" metadata field is equal to "value". This PR adds support for metadata filters of the form: ``` {"filter": {"key": { "IN" : ["list", "of", "values"]}}} ``` Other vector stores support this via an "$in" syntax. I chose to use "IN" to match postgres' syntax, though happy to switch. Tested locally with PGVector and ChatVectorDBChain. @dev2049 --------- Co-authored-by: jade@spanninglabs.com <jade@spanninglabs.com>	2023-05-19 13:53:23 -07:00
Davis Chase	56cb77a828	Make test gha workflow manually runnable (#4998 ) if https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch is to be believed this should make it possible to manually kick of test workflow, but i don't know much about these things	2023-05-19 13:46:33 -07:00
Jiaping(JP) Zhang	22d844dc07	Add async search with relevance score (#4558 ) Add the async version for the search with relevance score Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-19 13:05:24 -07:00
Adheeban Manoharan	616e9a93e0	Bug fixes and error handling in Redis - Vectorstore (#4932 ) # Bug fixes in Redis - Vectorstore (Added the version of redis to the error message and removed the cls argument from a classmethod) Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>	2023-05-19 13:02:03 -07:00
Gengliang Wang	a87a2524c7	Remove autoreload in examples (#4994 ) # Remove autoreload in examples Remove the `autoreload` in examples since it is not necessary for most users: ``` %load_ext autoreload, %autoreload 2 ```	2023-05-19 17:35:58 +00:00
Davis Chase	2abf6b9f17	bump v0.0.174 (#4988 )	2023-05-19 09:34:28 -07:00
Eugene Yurtsev	06e524416c	power bi api wrapper integration tests & bug fix (#4983 ) # Powerbi API wrapper bug fix + integration tests - Bug fix by removing `TYPE_CHECKING` in in utilities/powerbi.py - Added integration test for power bi api in utilities/test_powerbi_api.py - Added integration test for power bi agent in agent/test_powerbi_agent.py - Edited .env.examples to help set up power bi related environment variables - Updated demo notebook with working code in docs../examples/powerbi.ipynb - AzureOpenAI -> ChatOpenAI Notes: Chat models (gpt3.5, gpt4) are much more capable than davinci at writing DAX queries, so that is important to getting the agent to work properly. Interestingly, gpt3.5-turbo needed the examples=DEFAULT_FEWSHOT_EXAMPLES to write consistent DAX queries, so gpt4 seems necessary as the smart llm. Fixes #4325 ## Before submitting Azure-core and Azure-identity are necessary dependencies check integration tests with the following: `pytest tests/integration_tests/utilities/test_powerbi_api.py` `pytest tests/integration_tests/agent/test_powerbi_agent.py` You will need a power bi account with a dataset id + table name in order to test. See .env.examples for details. ## Who can review? @hwchase17 @vowelparrot --------- Co-authored-by: aditya-pethe <adityapethe1@gmail.com>	2023-05-19 11:25:52 -04:00
Viswanadh Rayavarapu	e68dfa7062	Update planner_prompt.py (#4967 ) Typos in the OpenAPI agent Prompt.	2023-05-19 11:17:10 -04:00
Edrick Da Corte Henriquez	e80585bab0	Update tutorials.md (#4960 ) # Added a YouTube Tutorial Added a LangChain tutorial playlist aimed at onboarding newcomers to LangChain and its use cases. I've shared the video in the #tutorials channel and it seemed to be well received. I think this could be useful to the greater community. ## Who can review? @dev2049	2023-05-19 10:40:14 -04:00
Rahul Rao	13c376345e	Fixed assumptions misspelling (#4961 ) Fixed assumptions misspelling in the link mentioned below:- https://python.langchain.com/en/latest/modules/chains/examples/llm_summarization_checker.html ![image](https://github.com/hwchase17/langchain/assets/16189966/94cf2be0-b3d0-495b-98ad-e1f44331727e) Fix for Issue:- #4959 @hwchase17	2023-05-19 10:40:04 -04:00
Gengliang Wang	bf5a3c6dec	Support Databricks in SQLDatabase (#4702 ) This PR adds support for Databricks runtime and Databricks SQL by using [Databricks SQL Connector for Python](https://docs.databricks.com/dev-tools/python-sql-connector.html). As a cloud data platform, accessing Databricks requires a URL as follows `databricks://token:{api_token}@{hostname}?http_path={http_path}&catalog={catalog}&schema={schema}`. The URL is complicated and it may take users a while to figure it out. Since the fields `api_token`/`hostname`/`http_path` fields are known in the Databricks notebook, I am proposing a new method `from_databricks` to simplify the connection to Databricks. ## In Databricks Notebook After changes, Databricks users only need to specify the `catalog` and `schema` field when using langchain. <img width="881" alt="image" src="https://github.com/hwchase17/langchain/assets/1097932/984b4c57-4c2d-489d-b060-5f4918ef2f37"> ## In Jupyter Notebook The method can be used on the local setup as well: <img width="678" alt="image" src="https://github.com/hwchase17/langchain/assets/1097932/142e8805-a6ef-4919-b28e-9796ca31ef19">	2023-05-19 00:42:06 -07:00
Harrison Chase	88a3a56c1a	Add Spark SQL support (#4602 ) (#4956 ) # Add Spark SQL support * Add Spark SQL support. It can connect to Spark via building a local/remote SparkSession. * Include a notebook example I tried some complicated queries (window function, table joins), and the tool works well. Compared to the [Spark Dataframe agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/spark.html), this tool is able to generate queries across multiple tables. --------- # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Gengliang Wang <gengliang@apache.org> Co-authored-by: Mike W <62768671+skcoirz@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: UmerHA <40663591+UmerHA@users.noreply.github.com> Co-authored-by: 张城铭 <z@hyperf.io> Co-authored-by: assert <zhangchengming@kkguan.com> Co-authored-by: blob42 <spike@w530> Co-authored-by: Yuekai Zhang <zhangyuekai@foxmail.com> Co-authored-by: Richard He <he.yucheng@outlook.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com> Co-authored-by: Alexey Nominas <60900649+Chae4ek@users.noreply.github.com> Co-authored-by: elBarkey <elbarkey@gmail.com> Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Jeffrey D <1289344+verygoodsoftwarenotvirus@users.noreply.github.com> Co-authored-by: so2liu <yangliu35@outlook.com> Co-authored-by: Viswanadh Rayavarapu <44315599+vishwa-rn@users.noreply.github.com> Co-authored-by: Chakib Ben Ziane <contact@blob42.xyz> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com> Co-authored-by: Daniel Chalef <daniel.chalef@private.org> Co-authored-by: Jari Bakken <jari.bakken@gmail.com> Co-authored-by: escafati <scafatieugenio@gmail.com>	2023-05-18 20:53:08 -07:00
Harrison Chase	5feb60f426	Harrison/spell executor (#4914 ) Co-authored-by: Jan Minar <rdancer@rdancer.org>	2023-05-18 20:43:33 -07:00
Aidan Boland	c06973261a	Fix for syntax when setting search_path for Snowflake database (#4747 ) # Fixes syntax for setting Snowflake database search_path An error occurs when using a Snowflake database and providing a schema argument. I have updated the syntax to run a Snowflake specific query when the database dialect is 'snowflake'.	2023-05-18 20:30:38 -07:00
Mike Wang	db6f7ed0ba	[nit] Simplify Spark Creation Validation Check A Little Bit (#4761 ) - simplify the validation check a little bit. - re-tested in jupyter notebook. Reviewer: @hwchase17	2023-05-18 18:57:54 -07:00
escafati	e027a38f33	NIT: Instead of hardcoding k in each definition, define it as a param above. (#2675 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>	2023-05-18 17:35:31 -07:00
Jari Bakken	3df2d831f9	Fix get_num_tokens for Anthropic models (#4911 ) The Anthropic classes used `BaseLanguageModel.get_num_tokens` because of an issue with multiple inheritance. Fixed by moving the method from `_AnthropicCommon` to both its subclasses. This change will significantly speed up token counting for Anthropic users.	2023-05-18 16:32:27 -07:00
Daniel Chalef	c8c2276ccb	Zep Retriever - Vector Search Over Chat History (#4533 ) # Zep Retriever - Vector Search Over Chat History with the Zep Long-term Memory Service More on Zep: https://github.com/getzep/zep Note: This PR is related to and relies on https://github.com/hwchase17/langchain/pull/4834. I did not want to modify the `pyproject.toml` file to add the `zep-python` dependency a second time. Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-05-18 16:27:18 -07:00
Chakib Ben Ziane	5525b704cc	Chatconv agent: output parser exception (#4923 ) the output parser form chat conversational agent now raises `OutputParserException` like the rest. The `raise OutputParserExeption(...) from e` form also carries through the original error details on what went wrong. I added the `ValueError` as a base class to `OutputParserException` to avoid breaking code that was relying on `ValueError` as a way to catch exceptions from the agent. So catching ValuError still works. Not sure if this is a good idea though ?	2023-05-18 16:20:35 -07:00
Leonid Ganeline	a9bb3147d7	docs: vectorstores, different updates and fixes (#4939 ) # docs: vectorstores, different updates and fixes Multiple updates: - added/improved descriptions - fixed header levels - added headers - fixed headers	2023-05-18 15:35:47 -07:00
Leonid Ganeline	8f8593aac5	docs: added `ecosystem/dependents` page (#4941 ) # docs: added `ecosystem/dependents` page Added `ecosystem/dependents` page. Can we propose a better page name?	2023-05-18 13:11:08 -07:00
Viswanadh Rayavarapu	c9f963e295	Update custom_multi_action_agent.ipynb (#4931 ) Updated the docs from "An agent consists of three parts:" to "An agent consists of two parts:" since there are only two parts in the documentation	2023-05-18 11:53:12 -07:00
so2liu	3002c1d508	fix: error in gptcache example nb (#4930 )	2023-05-18 11:49:45 -07:00
Jeffrey D	7e8e21c914	Correct typo in APIChain example notebook (Farenheit -> Fahrenheit) (#4938 ) Correct typo in APIChain example notebook (Farenheit -> Fahrenheit)	2023-05-18 11:48:02 -07:00
Leonid Ganeline	c75c0775e1	docs supabase update (#4935 ) # docs: updated `Supabase` notebook - the title of the notebook was inconsistent (included redundant "Vectorstore"). Removed this "Vectorstore" - added `Postgress` to the title. It is important. The `Postgres` name is much more popular than `Supabase`. - added description for the `Postrgress` - added more info to the `Supabase` description	2023-05-18 10:42:08 -07:00
Davis Chase	55baa0d153	Update redis integration tests (#4937 )	2023-05-18 10:22:17 -07:00
Davis Chase	440b8761f4	Redis kwargs fix (#4936 ) cc @tylerhutcherson	2023-05-18 10:02:46 -07:00
elBarkey	a8ded21b69	FIX: GPTCache cache_obj creation loop (#4827 ) _get_gptcache method keep creating new gptcache instance, here's the fix # Fix GPTCache cache_obj creation loop Fixes #4830 Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-18 09:42:35 -07:00
Alexey Nominas	c9e2a01875	Update GPT4ALL integration (#4567 ) # Update GPT4ALL integration GPT4ALL have completely changed their bindings. They use a bit odd implementation that doesn't fit well into base.py and it will probably be changed again, so it's a temporary solution. Fixes #3839, #4628	2023-05-18 09:38:54 -07:00
Leonid Ganeline	e2d7677526	docs: compound ecosystem and integrations (#4870 ) # Docs: compound ecosystem and integrations Problem statement: We have a big overlap between the References/Integrations and Ecosystem/LongChain Ecosystem pages. It confuses users. It creates a situation when new integration is added only on one of these pages, which creates even more confusion. - removed References/Integrations page (but move all its information into the individual integration pages - in the next PR). - renamed Ecosystem/LongChain Ecosystem into Integrations/Integrations. I like the Ecosystem term. It is more generic and semantically richer than the Integration term. But it mentally overloads users. The `integration` term is more concrete. UPDATE: after discussion, the Ecosystem is the term. Ecosystem/Integrations is the page (in place of Ecosystem/LongChain Ecosystem). As a result, a user gets a single place to start with the individual integration.	2023-05-18 09:29:57 -07:00
Harrison Chase	d5a0704544	dont error on sql import (#4647 ) this makes it so we dont throw errors when importing langchain when sqlalchemy==1.3.1 we dont really want to support 1.3.1 (seems like unneccessary maintance cost) BUT we would like it to not terribly error should someone decide to run on it	2023-05-18 09:27:09 -07:00
Harrison Chase	c9a362e482	add alias for model (#4553 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-18 09:12:23 -07:00
Richard He	7642f2159c	Add human message as input variable to chat agent prompt creation (#4542 ) # Add human message as input variable to chat agent prompt creation This PR adds human message and system message input to `CHAT_ZERO_SHOT_REACT_DESCRIPTION` agent, similar to [conversational chat agent](`7bcf238a1a/langchain/agents/conversational_chat/base.py (L64-L71)`). I met this issue trying to use `create_prompt` function when using the [BabyAGI agent with tools notebook](https://python.langchain.com/en/latest/use_cases/autonomous_agents/baby_agi_with_agent.html), since BabyAGI uses “task” instead of “input” input variable. For normal zero shot react agent this is fine because I can manually change the suffix to “{input}/n/n{agent_scratchpad}” just like the notebook, but I cannot do this with conversational chat agent, therefore blocking me to use BabyAGI with chat zero shot agent. I tested this in my own project [Chrome-GPT](https://github.com/richardyc/Chrome-GPT) and this fix worked. ## Request for review Agents / Tools / Toolkits - @vowelparrot	2023-05-18 09:09:31 -07:00
Yuekai Zhang	1ed4228822	Fix bilibili (#4860 ) # Fix bilibili api import error bilibili-api package is depracated and there is no sync module. <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes #2673 #2724 ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot @liaokongVFX <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-18 09:56:51 -04:00
Eugene Yurtsev	e46202829f	feat #4479 : TextLoader auto detect encoding and improved exceptions (#4927 ) # TextLoader auto detect encoding and enhanced exception handling - Add an option to enable encoding detection on `TextLoader`. - The detection is done using `chardet` - The loading is done by trying all detected encodings by order of confidence or raise an exception otherwise. ### New Dependencies: - `chardet` Fixes #4479 ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @eyurtsev --------- Co-authored-by: blob42 <spike@w530>	2023-05-18 09:55:14 -04:00
张城铭	8c28ad6dac	API update: Engines -> Models (#4915 ) # API update: Engines -> Models see: https://community.openai.com/t/api-update-engines-models/18597 Co-authored-by: assert <zhangchengming@kkguan.com>	2023-05-18 09:54:42 -04:00
Eugene Yurtsev	c06a47a691	Load specific file types from Google Drive (issue #4878 ) (#4926 ) # Load specific file types from Google Drive (issue #4878) Add the possibility to define what file types you want to load from Google Drive. ``` loader = GoogleDriveLoader( folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5", file_types=["document", "pdf"] recursive=False ) ``` Fixes ##4878 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: DataLoaders - @eyurtsev Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589 --------- Co-authored-by: UmerHA <40663591+UmerHA@users.noreply.github.com>	2023-05-18 09:27:53 -04:00
Harrison Chase	dfbf45f028	bump version to 173 (#4910 )	2023-05-17 23:36:45 -07:00
Harrison Chase	b8d48939a2	Harrison/unified objectives (#4905 ) Co-authored-by: Matthias Samwald <samwald@gmx.at>	2023-05-17 23:03:57 -07:00
Harrison Chase	9165267f8a	Harrison/improved retry tool (#4842 )	2023-05-17 21:41:01 -07:00
Harrison Chase	ba023d53ca	Harrison/faiss norm (#4903 ) Co-authored-by: Jiaxin Shan <seedjeffwan@gmail.com>	2023-05-17 21:40:49 -07:00
Harrison Chase	9e2227ba11	Harrison/serper api bug (#4902 ) Co-authored-by: Jerry Luan <xmaswillyou@gmail.com>	2023-05-17 21:40:39 -07:00
Leonid Ganeline	c998569c8f	docs: text splitters improvements (#4490 ) #docs: text splitters improvements Changes are only in the Jupyter notebooks. - added links to the source packages and a short description of these packages - removed " Text Splitters" suffixes from the TOC elements (they made the list of the text splitters messy) - moved text splitters, based on the length function into a separate list. They can be mixed with any classes from the "Text Splitters", so it is a different classification. ## Who can review? @hwchase17 - project lead @eyurtsev @vowelparrot NOTE: please, check out the results of the `Python code` text splitter example (text_splitters/examples/python.ipynb). It looks suboptimal.	2023-05-17 21:33:34 -07:00
Steve Kim	613bf9b514	Update getting_started.md (#4482 ) # Added another helpful way for developers who want to set OpenAI API Key dynamically Previous methods like exporting environment variables are good for project-wide settings. But many use cases need to assign API keys dynamically, recently. ```python from langchain.llms import OpenAI llm = OpenAI(openai_api_key="OPENAI_API_KEY") ``` ## Before submitting ```bash export OPENAI_API_KEY="..." ``` Or, ```python import os os.environ["OPENAI_API_KEY"] = "..." ``` <hr> Thank you. Cheers, Bongsang	2023-05-17 21:32:25 -07:00
Ismael G Serrano	41e2394c9c	Fix AzureOpenAI embeddings documentation example. model -> deployment (#4389 ) # Documentation for Azure OpenAI embeddings model - OPENAI_API_VERSION environment variable is needed for the endpoint - The constructor does not work with model, it works with deployment. I fixed it in the notebook. (This is my first contribution) ## Who can review? @hwchase17 @agola Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-17 21:05:53 -07:00
Davis Chase	a4ac006658	Update gallery (#4873 )	2023-05-17 20:59:41 -07:00
Davis Chase	8966f61ca5	Zep memory (#4898 ) Co-authored-by: Daniel Chalef <daniel.chalef@private.org> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-05-17 20:01:01 -07:00
Davis Chase	e28bdf4453	Cadlabs/python tool sanitization (#4754 ) Co-authored-by: BenSchZA <BenSchZA@users.noreply.github.com>	2023-05-17 19:46:12 -07:00
Eugene Yurtsev	0dc304ca80	Add html parsers (#4874 ) # Add bs4 html parser * Some minor refactors * Extract the bs4 html parsing code from the bs html loader * Move some tests from integration tests to unit tests	2023-05-17 22:39:11 -04:00
Eugene Yurtsev	8e41143bf5	Add a generic document loader (#4875 ) # Add generic document loader * This PR adds a generic document loader which can assemble a loader from a blob loader and a parser * Adds a registry for parsers * Populate registry with a default mimetype based parser ## Expected changes - Parsing involves loading content via IO so can be sped up via: * Threading in sync * Async - The actual parsing logic may be computatinoally involved: may need to figure out to add multi-processing support - May want to add suffix based parser since suffixes are easier to specify in comparison to mime types ## Before submitting No notebooks yet, we first need to get a few of the basic parsers up (prior to advertising the interface)	2023-05-17 22:38:55 -04:00
Davis Chase	df0c33a005	Faiss no avx2 (#4895 ) Co-authored-by: Ali Mirlou <alimirlou@gmail.com>	2023-05-17 19:18:57 -07:00
Emil Ahlbäck	5c9205d5f4	ConversationalChatAgent: Allow customizing `TEMPLATE_TOOL_RESPONSE` (#2361 ) It's currently not possible to change the `TEMPLATE_TOOL_RESPONSE` prompt for ConversationalChatAgent, this PR changes that. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-17 17:23:08 -07:00
Zander Chase	1ff7c958b0	Bold Crumbs (#4876 )	2023-05-17 22:50:35 +00:00
Alexander Miasoiedov (Myasoedov)	4c3ab55e94	feat(Add FastAPI + Vercel deployment option): (#4520 ) # Update deployments doc with langcorn API server API server example ```python from fastapi import FastAPI from langcorn import create_service app: FastAPI = create_service( "examples.ex1:chain", "examples.ex2:chain", "examples.ex3:chain", "examples.ex4:sequential_chain", "examples.ex5:conversation", "examples.ex6:conversation_with_summary", ) ``` More examples: https://github.com/msoedov/langcorn/tree/main/examples Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-17 15:50:25 -07:00
Taqi Jaffri	ef8b5f64bc	Tiny code review and docs fix for Docugami DataLoader (#4877 ) # Docs and code review fixes for Docugami DataLoader 1. I noticed a couple of hyperlinks that are not loading in the langchain docs (I guess need explicit anchor tags). Added those. 2. In code review @eyurtsev had a [suggestion](https://github.com/hwchase17/langchain/pull/4727#discussion_r1194069347) to allow string paths. Turns out just updating the type works (I tested locally with string paths). # Pre-submission checks I ran `make lint` and `make tests` successfully. --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-05-17 15:31:43 -07:00
C.J. Jameson	d6e0b9a43d	fix homepage typo (#4883 ) # Fix Homepage Typo ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested... not sure	2023-05-17 15:30:23 -07:00
Leonid Ganeline	b96ab4b763	docs `retriever` improvements (#4430 ) # Docs: improvements in the `retrievers/examples/` notebooks Its primary purpose is to make the Jupyter notebook examples consistent and more suitable for first-time viewers. - add links to the integration source (if applicable) with a short description of this source; - removed `_retriever` suffix from the file names (where it existed) for consistency; - removed ` retriever` from the notebook title (where it existed) for consistency; - added code to install necessary Python package(s); - added code to set up the necessary API Key. - very small fixes in notebooks from other folders (for consistency): - docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb - docs/modules/indexes/vectorstores/examples/pinecone.ipynb - docs/modules/models/llms/integrations/cohere.ipynb - fixed misspelling in langchain/retrievers/time_weighted_retriever.py comment (sorry, about this change in a .py file ) ## Who can review @dev2049	2023-05-17 15:29:22 -07:00
Justin Levi Winter	0147f845f1	Update getting_started.ipynb (#4850 ) minor grammer issue	2023-05-17 13:19:14 -07:00
Yong Fu	3e12f0957a	Remove unused variables in Milvus vectorstore (#4868 ) # Remove unused variables in Milvus vectorstore This PR simply removes a variable unused in Milvus. The variable looks like a copy-paste from other functions in Milvus but it is really unnecessary.	2023-05-17 12:00:37 -07:00
Eugene Yurtsev	c5ab9782c6	Add beautiful soup 4 to extended testing extra (#4869 ) # Add bs4 to extended testing extra Updating extended testing extra in preparation for more refactors.	2023-05-17 14:11:26 -04:00
Ryan Culligan	6a9cdc43f5	Fix TypeError in Vectorstore Redis class methods (#4857 ) # Fix TypeError in Vectorstore Redis class methods This change resolves a TypeError that was raised when invoking the `from_texts_return_keys` method from the `from_texts` method in the `Redis` class. The error was due to the `cls` argument being passed explicitly, which led to it being provided twice since it's also implicitly passed in class methods. No relevant tests were added as the issue appeared to be better suited for linters to catch proactively. Changes: - Removed `cls=cls` from the call to `from_texts_return_keys` in the `from_texts` method. Related to: https://github.com/hwchase17/langchain/pull/4653	2023-05-17 10:48:09 -07:00
Eugene Yurtsev	2d20a1196e	Hugging Face Loader: Add lazy load (#4799 ) # Add lazy load to HF datasets loader Unfortunately, there are no tests as far as i can tell. Verified code manually.	2023-05-17 12:04:23 -04:00
Davis Chase	a63ab7ded1	bump 172 (#4864 )	2023-05-17 08:54:39 -07:00
yujiosaka	2f8eb95a91	Remove unnecessary comment (#4845 ) # Remove unnecessary comment Remove unnecessary comment accidentally included in #4800 ## Before submitting - no test - no document ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:	2023-05-17 11:53:03 -04:00
UmerHA	e257380deb	Typos (#4851 ) # Fixed typos (issues #4818 & #4668 & more typos) - At some places, it said `model = ChatOpenAI(model='gpt-3.5-turbo')` but should be `model = ChatOpenAI(model_name='gpt-3.5-turbo')` - Fixes some other typos Fixes #4818, #4668 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot	2023-05-17 11:52:22 -04:00
Zander Chase	8dcad0f272	Add Support for Flexible Input Format for LLM and Chat Model Runs (#4805 ) Previously, the client expected a strict 'prompt' or 'messages' format and wouldn't permit running a chat model or llm on prompts or messages (respectively). Since many datasets may want to specify custom key: string , relax this requirement. Also, add support for running a chat model on raw prompts and LLM on chat messages through their respective fallbacks.	2023-05-17 14:24:17 +00:00
Zander Chase	a47c62fcba	Add dev option (#4828 ) enable running ``` langchain plus start --dev ``` To use the RC iamges instead	2023-05-17 14:09:25 +00:00
Harrison Chase	720ac49f42	2markdown loader (#4796 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-05-16 23:42:53 -07:00
Ankush Gola	aa73a888fa	Some notebook and client fixes (add retries, clean up docs, etc) (#4820 ) # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-16 20:23:00 -07:00
Davis Chase	0a591da6db	Add weaviate by_text (#4824 ) Thanks @ZouhairElhadi! Made small change Closes #4742 --------- Co-authored-by: Zouhair Elhadi <zouhair11elhadi@gmail.com> Co-authored-by: ZouhairElhadi <87149442+ZouhairElhadi@users.noreply.github.com>	2023-05-16 19:43:15 -07:00
Zander Chase	d1b6839d97	Retry session and tenant (#4822 )	2023-05-17 01:54:40 +00:00
Nguyen Trung Duc (john)	49e4aaf673	Fix subclassing OpenAIEmbeddings (#4500 ) # Fix subclassing OpenAIEmbeddings Fixes #4498 ## Before submitting - Problem: Due to annotated type `Tuple[()]`. - Fix: Change the annotated type to "Iterable[str]". Even though tiktoken use [Collection[str]](`095924e02c/tiktoken/core.py (L80)`) type annotation, but pydantic doesn't support Collection type, and [Iterable](https://docs.pydantic.dev/latest/usage/types/#typing-iterables) is the closest to Collection.	2023-05-16 18:35:19 -07:00
Harrison Chase	08df80bed6	console callback verbose (#4696 ) add verbose callback Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>	2023-05-17 01:28:43 +00:00
David Peterson	d5d4c0a172	Update summarize.ipynb (#4529 ) # Update order in which tasks are stated (logically correct) Fixes the order in which steps are placed under titles. @vowelparrot	2023-05-16 18:14:00 -07:00
Django	bcffc704c1	fix: agenerate miss run_manager args in llm.py (#4566 ) # fix: agenerate miss run_manager args in llm.py <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes # (issue) fix: agenerate miss run_manager args in llm.py <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-16 17:37:56 -07:00
Brendan Mannix	4e56d3119c	update qdrant docs to reflect the proper way to initialize Qdrant() constructor (#4596 ) # update qdrant docs to reflect the proper way to initialize Qdrant() constructor The [Qdrant docs](https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/qdrant.html) still contain an old reference for passing an `embedding_function` into the constructor. This is no longer supported. This PR updates the docs to reflect the proper way to initialize `Qdrant()` Old: ![Screenshot 2023-05-12 at 3 06 33 PM](https://github.com/hwchase17/langchain/assets/1552962/dd4063d2-2a07-4340-91bb-e305f7215ddd) New: ![Screenshot 2023-05-12 at 3 21 09 PM](https://github.com/hwchase17/langchain/assets/1552962/aebc3f63-1a8b-4ca3-93c0-a2ce30dcd282)	2023-05-16 17:30:38 -07:00
Sean Morgan	5372a06a8c	DOC: Fix SageMaker example (#4598 ) # Fix SageMaker example typing Since https://github.com/hwchase17/langchain/pull/3249 a new type `LLMContentHandler` is enforced for SageMaker Endpoints Fixes #4168	2023-05-16 17:28:16 -07:00
Steve Kim	e90654f39b	Added cleaning up the downloaded PDF files (#4601 ) ArxivAPIWrapper searches and downloads PDFs to get related information. But I found that it doesn't delete the downloaded file. The reason why this is a problem is that a lot of PDF files remain on the server. For example, one size is about 28M. So, I added a delete line because it's too big to maintain on the server. # Clean up downloaded PDF files - Changes: Added new line to delete downloaded file - Background: To get the information on arXiv's paper, ArxivAPIWrapper class downloads a PDF. It's a natural approach, but the wrapper retains a lot of PDF files on the server. - Problem: One size of PDFs is about 28M. It's too big to maintain on a small server like AWS. - Dependency: import os Thank you. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-16 17:26:56 -07:00
Quinn	6fbd5e837f	Update planner_prompt.py, change usery to user (#4623 ) # Fix misspell in planner_prompt.py before ``` Usery query: I want to buy a couch ``` after ``` User query: I want to buy a couch ```	2023-05-16 17:24:27 -07:00
Tony Zhang	432421ffa5	[Fix][GenerativeAgent] Get the memory importance score from regex matched group (#4636 ) # Get the memory importance score from regex matched group In `GenerativeAgentMemory`, the `_score_memory_importance()` will make a prompt to get a rating score. The prompt is: ``` prompt = PromptTemplate.from_template( "On the scale of 1 to 10, where 1 is purely mundane" + " (e.g., brushing teeth, making bed) and 10 is" + " extremely poignant (e.g., a break up, college" + " acceptance), rate the likely poignancy of the" + " following piece of memory. Respond with a single integer." + "\nMemory: {memory_content}" + "\nRating: " ) ``` For some LLM, it will respond with, for example, `Rating: 8`. Thus we might want to get the score from the matched regex group.	2023-05-16 16:59:50 -07:00
Daniel Maturana	be405ac139	Query_constructor.base.py function _get_prompt() not including passed examples. (#4680 ) The function _get_prompt() was returning the DEFAULT_EXAMPLES even if some custom examples were given. The return FewShotPromptTemplate was returnong DEFAULT_EXAMPLES and not examples	2023-05-16 16:31:10 -07:00
Anam Hira	3af448d72e	Update huggingface_tools.ipynb (#4700 )	2023-05-16 16:28:27 -07:00
rajib	e28f4a5f39	changed cohere.py to update the default model of embedding (#4709 ) # The cohere embedding model do not use large, small. It is deprecated. Changed the modules default model Fixes #4694 Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-16 16:27:23 -07:00
charosen	75fe9d3555	Add from_file method to message prompt template (#4713 ) Feature: This PR adds `from_template_file` class method to BaseStringMessagePromptTemplate. This is useful to help user to create message prompt templates directly from template files, including `ChatMessagePromptTemplate`, `HumanMessagePromptTemplate`, `AIMessagePromptTemplate` & `SystemMessagePromptTemplate`. Tests: Unit tests have been added in this PR. Co-authored-by: charosen <charosen@bupt.cn> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-16 16:25:17 -07:00
Chandan Routray	e8d46bdd9b	Replaced `SQLDatabaseChain` deprecated direct initialisation with `from_llm` method (#4778 ) # Removed usage of deprecated methods Replaced `SQLDatabaseChain` deprecated direct initialisation with `from_llm` method ## Who can review? @hwchase17 @agola11 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-16 15:59:06 -07:00
Chandan Routray	11341fcecb	Fixed query checker for SQLDatabaseChain (#4780 ) # Fixed query checker for SQLDatabaseChain When `SQLDatabaseChain`'s llm attribute was deprecated, the query checker stopped working if `SQLDatabaseChain` is initialised via `from_llm` method. With this fix, `SQLDatabaseChain`'s query checker would use the same `llm` as used in the `llm_chain` ## Who can review? @hwchase17 - project lead Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-05-16 15:58:58 -07:00
Yeong0228	08876ad066	Fix SelfQueryRetriever, passing new query to vector store (#4774 ) # Fix SelfQueryRetriever, passing new query to vector store	2023-05-16 15:46:22 -07:00
Mark Pors	8fd4d5d117	Added dependencies to make example executable (#4790 ) - Installation of non-colab packages - Get API keys # Added dependencies to make notebook executable on hosted notebooks ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @vowelparrot	2023-05-16 15:46:09 -07:00
Mark Pors	5bc7082e82	Cleanup and added dependencies to make example executable (#4795 ) - Installation of non-colab packages - Get API keys - Get rid of warnings # Cleanup and added dependencies to make notebook executable on hosted notebooks @hwchase17 @vowelparrot	2023-05-16 15:29:01 -07:00
keenangraham	bcce9a3a92	Fix age inconsistency in plan and execute Jupyter notebook example (#4814 ) The current example in https://python.langchain.com/en/latest/modules/agents/plan_and_execute.html has inconsistent reasoning step (observing 28 years and thinking it's 26 years): ``` Observation: 28 years Thought:Based on my search, Gigi Hadid's current age is 26 years old. Action: { "action": "Final Answer", "action_input": "Gigi Hadid's current age is 26 years old." } ``` Guessing this is model noise. Rerunning seems to give correct answer of 28 years.	2023-05-16 15:27:27 -07:00
Prateek K. Keshari	61f9c52fc7	Update twitter-the-algorithm-analysis-deeplake.ipynb (#4812 ) Changed model to model_name	2023-05-16 15:27:15 -07:00
yujiosaka	6561efebb7	Accept uuids kwargs for weaviate (#4800 ) # Accept uuids kwargs for weaviate Fixes #4791	2023-05-16 15:26:46 -07:00
Adam Quigley	e78c9be312	Add Confluence Loader unit tests (#3333 ) Adds some basic unit tests for the ConfluenceLoader that can be extended later. Ports this [PR from llama-hub](https://github.com/emptycrown/llama-hub/pull/208) and adapts it to `langchain`. @Jflick58 and @zywilliamli adding you here as potential reviewers --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-16 15:17:07 -07:00
Magnus Friberg	d126276693	Specify which data to return from chromadb (#4393 ) # Improve the Chroma get() method by adding the optional "include" parameter. The Chroma get() method excludes embeddings by default. You can customize the response by specifying the "include" parameter to selectively retrieve the desired data from the collection. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-16 14:43:09 -07:00
Raduan Al-Shedivat	00c6ec8a2d	fix(document_loaders/telegram): fix pandas calls + add tests (#4806 ) # Fix Telegram API loader + add tests. I was testing this integration and it was broken with next error: ```python message_threads = loader._get_message_threads(df) KeyError: False ``` Also, this particular loader didn't have any tests / related group in poetry, so I added those as well. @hwchase17 / @eyurtsev please take a look on this fix PR. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-16 14:35:25 -07:00
Zander Chase	206c87d525	Change server start name (#4811 ) to `langchain plus start/stop`	2023-05-16 20:04:09 +00:00
Eugene Yurtsev	255690d78e	Catch changes to test group (#4802 ) # Catch changes to test group Add test to catch changes to test group.	2023-05-16 14:48:56 -04:00
Eugene Yurtsev	c3b6129beb	Block sockets for unit-tests (#4803 ) # Block usage of sockets during unit tests Catch any tests that attempt to use the network.	2023-05-16 14:41:24 -04:00
了空	f7e3d97b19	Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619 ) - Remove unnecessary spaces from document object’s page_content of BiliBiliLoader - Fix BiliBiliLoader document and test file	2023-05-16 13:13:57 -04:00
Eugene Yurtsev	f47ec5b4b6	Docugami docs: First cell should be a title cell (#4735 ) # Make first cell a title in docugami docs This makes the first cell a title cell in docugami notebook	2023-05-16 13:12:14 -04:00
Eugene Yurtsev	d403f659ea	Update google protobuf dep (#4798 ) # Update google protobuf dep Resolve: https://github.com/hwchase17/langchain/security/dependabot/11	2023-05-16 12:25:07 -04:00
Eugene Yurtsev	3ecd7c9641	Add check to verify poetry.toml (#4794 ) # Add poetry check to github action Check poetry toml file during tests for errors	2023-05-16 11:53:06 -04:00
Ikko Eltociear Ashimine	f5a476fdd4	Fix typo in dataframe.py (#4786 ) # Fix typo in dataframe.py (#4786) Fixed typo. ``` yeild -> yield ```	2023-05-16 11:49:04 -04:00
Eugene Yurtsev	14bedf1cc5	Github Action: Fix poetry lock file checking (#4789 ) Fix how poetry lock file is checked to avoid skipping caches silently.	2023-05-16 11:40:28 -04:00
Davis Chase	7ce43372c3	Version 171 (#4788 )	2023-05-16 08:24:45 -07:00
Zander Chase	bee136efa4	Update Tracing Walkthrough (#4760 ) Add client methods to read / list runs and sessions. Update walkthrough to: - Let the user create a dataset from the runs without going to the UI - Use the new CLI command to start the server Improve the error message when `docker` isn't found	2023-05-16 13:26:43 +00:00
Zander Chase	fc0a3c8500	Persist Volume After Stop (#4763 ) Previously, the data would be removed after shutting down the server. This mounts a db volume that isn't erased between calls	2023-05-16 13:10:13 +00:00
Harrison Chase	a7af32c274	Cassandra support for chat history (#4378 ) (#4764 ) # Cassandra support for chat history ### Description - Store chat messages in cassandra ### Dependency - cassandra-driver - Python Module ## Before submitting - Added Integration Test ## Who can review? @hwchase17 @agola11 # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> Co-authored-by: Jinto Jose <129657162+jj701@users.noreply.github.com>	2023-05-15 23:43:09 -07:00
Harrison Chase	c4c7936caa	Harrison/wiki loader (#4765 ) Co-authored-by: Guillermo Segovia <T1b4lt@users.noreply.github.com>	2023-05-15 23:42:57 -07:00
Filip Haltmayer	c632f7fc4e	Add Milvus and Zilliz Retrievals (#4416 ) Adds the basic retrievers for Milvus and Zilliz. Hybrid search support will be added in the future. Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>	2023-05-15 21:22:54 -07:00
Bradley James	2e43954bc3	fixed on_llm issue (#4717 ) Fixes #4714	2023-05-16 01:36:21 +00:00
Zander Chase	bf0904b676	Add Server Command (#4695 ) Add Support for `langchain server {start\|stop}` commands, with support for using ngrok to tunnel to a remote notebook	2023-05-16 00:44:30 +00:00
Anirudh Suresh	03ac39368f	Fixing DeepLake Overwrite Flag (#4683 ) # Fix DeepLake Overwrite Flag Issue Fixes Issue #4682: essentially, setting overwrite to False in the DeepLake constructor still triggers an overwrite, because the logic is just checking for the presence of "overwrite" in kwargs. The fix is simple--just add some checks to inspect if "overwrite" in kwargs AND kwargs["overwrite"]==True. Added a new test in tests/integration_tests/vectorstores/test_deeplake.py to reflect the desired behavior. Co-authored-by: Anirudh Suresh <ani@Anirudhs-MBP.cable.rcn.com> Co-authored-by: Anirudh Suresh <ani@Anirudhs-MacBook-Pro.local> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-15 17:39:16 -07:00
d 3 n 7	8bb32d77d0	Update utils.py to make headless an optional argument (#4745 ) Making headless an optional argument for create_async_playwright_browser() and create_sync_playwright_browser() By default no functionality is changed. This allows for disabled people to use a web browser intelligently with their voice, for example, while still seeing the content on the screen. As well as many other use cases --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-15 17:29:06 -07:00
Mose Tronci	a9dbe90447	Exponential back-off support for Google PaLM api (#4001 ) This PR adds exponential back-off to the Google PaLM api to gracefully handle rate limiting errors. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-15 17:21:11 -07:00
Leonid Ganeline	a6f3ec94bc	docs: added `additional_resources` folder (#4748 ) # docs: added `additional_resources` folder The additional resource files were inside the doc top-level folder, which polluted the top-level folder. - added the `additional_resources` folder and moved correspondent files to this folder; - fixed a broken link to the "Model comparison" page (model_laboratory notebook) - fixed a broken link to one of the YouTube videos (sorry, it is not directly related to this PR) ## Who can review? @dev2049	2023-05-15 17:12:47 -07:00
Zander Chase	a128d95aeb	Fix Async Shared Resource Bug (#4751 ) Use an async queue to distribute tracers rather than inappropriately sharing a single one	2023-05-16 00:04:01 +00:00
whuwxl	3f0357f94a	Add summarization task type for HuggingFace APIs (#4721 ) # Add summarization task type for HuggingFace APIs Add summarization task type for HuggingFace APIs. This task type is described by [HuggingFace inference API](https://huggingface.co/docs/api-inference/detailed_parameters#summarization-task) My project utilizes LangChain to connect multiple LLMs, including various HuggingFace models that support the summarization task. Integrating this task type is highly convenient and beneficial. Fixes #4720	2023-05-15 16:26:17 -07:00
Zander Chase	580861e7f2	Revert "Make serpapi base url configurable via env (#4402 )" (#4750 ) This reverts commit `5111bec540`. This PR introduced a bug in the async API (the `url` param isn't bound); it also didn't update the synchronous API correctly, which makes it error-prone (the behavior of the async and sync endpoints would be different)	2023-05-15 16:17:16 -07:00
shiyu22	21b9397342	Update the milvus example (#4706 ) # Fix issue when running example - add the query content - update the `user` parameter with Zilliz Signed-off-by: shiyu22 <shiyu.chen@zilliz.com>	2023-05-15 16:16:57 -07:00
hilarious-viking	7d15669b41	llama-cpp: add gpu layers parameter (#4739 ) Adds gpu layers parameter to llama.cpp wrapper Co-authored-by: andrew.khvalenski <andrew.khvalenski@behavox.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-15 16:01:48 -07:00
Davis Chase	36c9fd1af7	Dev2049/docs edit0 (#4699 )	2023-05-15 15:20:37 -07:00
Jinto Jose	1e467d9fc4	Jupyter Notebook Example for using Mongodb to store Chat Message History (#4436 ) # Jupyter Notebook Example for using Mongodb Chat Message History @dev2049	2023-05-15 14:33:42 -07:00
Leonid Ganeline	6060505a9d	Add new links to `Tutorials` and `YouTube` pages (#4746 ) - added an official LangChain YouTube channel :) - added new tutorials and videos (only videos with enough subscriber or view numbers) - added a "New video" icon ## Who can review? @dev2049	2023-05-15 14:32:48 -07:00
Eduard van Valkenburg	47657fe01a	Tweaks to the PowerBI toolkit and utility (#4442 ) Fixes some bugs I found while testing with more advanced datasets and queries. Includes using the output of PowerBI to parse the error and give that back to the LLM.	2023-05-15 14:30:48 -07:00
mvhensbergen	e363e709cb	Add source field to metadata (#4462 ) This is needed if one want to use index.query_with_sources on git files. Without a source field, index.query_with_sources fails with an exception.	2023-05-15 14:30:12 -07:00
vinoyang	5111bec540	Make serpapi base url configurable via env (#4402 ) Fixes #4328 Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-15 14:25:25 -07:00
Roma	cb802edf75	[Feature] Add GraphQL Query Tool (#4409 ) # Add GraphQL Query Support This PR introduces a GraphQL API Wrapper tool that allows LLM agents to query GraphQL databases. The tool utilizes the httpx and gql Python packages to interact with GraphQL APIs and provides a simple interface for running queries with LLM agents. @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-15 14:06:12 -07:00
Eugene Yurtsev	49ce5ce1ca	Only run linkcheck against docs dir on PR (#4741 ) # Only run linkchecker on direct changes to docs This is a stop-gap that will speed up PRs. Some broken links can slip through if they're embedded in doc-strings inside the codebase. But we'll still be running the linkchecker on master.	2023-05-15 14:40:43 -04:00
Eugene Yurtsev	99cfe71cd0	Check poetry lock file (#4740 ) # Check poetry lock file on CI This PR checks that the lock file is up to date using poetry lock --check. As part of this PR, a new lock file was generated.	2023-05-15 14:38:01 -04:00
Eugene Yurtsev	09587a3201	Clean up tests for pdf parsers (#4595 ) # Organize tests for pdf parsers Clean up tests for pdf parsers, remove duplicate tests, convert to unit tests.	2023-05-15 14:21:05 -04:00
Leonid Ganeline	70fd7cda14	docs: `Concepts` (#4734 ) # glossary.md renamed as concepts.md and moved under the Getting Started small PR. `Concepts` looks right to the point. It is moved under Getting Started (typical place). Previously it was lost in the Additional Resources section. ## Who can review? @hwchase17	2023-05-15 11:09:25 -07:00
Harrison Chase	8de81d34a1	bump version to 170 (#4733 )	2023-05-15 09:21:00 -07:00
Harrison Chase	dd95f0892d	Harrison/add top k (#4707 ) Co-authored-by: blc16 <benlc@umich.edu>	2023-05-15 09:09:22 -07:00
Harrison Chase	0551594722	add async default (#4701 ) a spin on https://github.com/hwchase17/langchain/pull/4300/files#diff-4f16071d58cd34fb3ec5cd5089e9dbd6fb06574c25c76b4d573827f8a2f48e96	2023-05-15 08:57:30 -07:00
Zander Chase	97434a64c5	Add Environment Info to Run (#4691 ) Store the environment info within the `extra` fields of the Run	2023-05-15 15:38:49 +00:00
Eugene Yurtsev	d3300bd799	YouTube Loader: Replace regexp with built-in parsing (#4729 )	2023-05-15 08:34:41 -07:00
Daniel Barker	c70ae562b4	Added support for streaming output response to HuggingFaceTextgenInference LLM class (#4633 ) # Added support for streaming output response to HuggingFaceTextgenInference LLM class Current implementation does not support streaming output. Updated to incorporate this feature. Tagging @agola11 for visibility.	2023-05-15 14:59:12 +00:00
d 3 n 7	435b70da47	Update click.py to pass errors back to Agent (#4723 ) Instead of halting the entire program if this tool encounters an error, it should pass the error back to the agent to decide what to do. This may be best suited for @vowelparrot to review.	2023-05-15 14:54:08 +00:00
Eugene Yurtsev	3c490b5ba3	Docugami DataLoader (#4727 ) ### Adds a document loader for Docugami Specifically: 1. Adds a data loader that talks to the [Docugami](http://docugami.com) API to download processed documents as semantic XML 2. Parses the semantic XML into chunks, with additional metadata capturing chunk semantics 3. Adds a detailed notebook showing how you can use additional metadata returned by Docugami for techniques like the [self-querying retriever](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/self_query_retriever.html) 4. Adds an integration test, and related documentation Here is an example of a result that is not possible without the capabilities added by Docugami (from the notebook): <img width="1585" alt="image" src="https://github.com/hwchase17/langchain/assets/749277/bb6c1ce3-13dc-4349-a53b-de16681fdd5b"> --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com> Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>	2023-05-15 10:53:00 -04:00
KNiski	c2761aa8f4	Improve video_id extraction in YoutubeLoader (#4452 ) # Improve video_id extraction in `YoutubeLoader` `YoutubeLoader.from_youtube_url` can only deal with one specific url format. I've introduced `YoutubeLoader.extract_video_id` which can extract video id from common YT urls. Fixes #4451 @eyurtsev --------- Co-authored-by: Kamil Niski <kamil.niski@gmail.com>	2023-05-15 10:45:19 -04:00
sqr	8b42e8a510	Update Makefile (typo) (#4725 ) # Update minor typo in makefile	2023-05-15 10:34:44 -04:00
Lester Yang	cd3f9865f3	Feature: pdfplumber PDF loader with BaseBlobParser (#4552 ) # Feature: pdfplumber PDF loader with BaseBlobParser * Adds pdfplumber as a PDF loader * Adds pdfplumber as a blob parser.	2023-05-15 09:47:02 -04:00
Harrison Chase	b6e3ac17c4	Harrison/sitemap local (#4704 ) Co-authored-by: Lukas Bauer <lukas.bauer@mayflower.de>	2023-05-14 22:04:38 -07:00
Harrison Chase	12b4ee1fc7	Harrison/telegram chat loader (#4698 ) Co-authored-by: Akinwande Komolafe <47945512+Sensei-akin@users.noreply.github.com> Co-authored-by: Akinwande Komolafe <akhinoz@gmail.com>	2023-05-14 22:04:27 -07:00
Leonid Ganeline	2b181e5a6c	docs: tutorials are moved on the top-level of docs (#4464 ) # Added Tutorials section on the top-level of documentation Problem Statement: the Tutorials section in the documentation is top-priority. Not every project has resources to make tutorials. We have such a privilege. Community experts created several tutorials on YouTube. But the tutorial links are now hidden on the YouTube page and not easily discovered by first-time visitors. PR: I've created the `Tutorials` page (from the `Additional Resources/YouTube` page) and moved it to the top level of documentation in the `Getting Started` section. ## Who can review? @dev2049 NOTE: PR checks are randomly failing `3aefaafcdb` `258819eadf` `514d81b5b3`	2023-05-14 21:22:25 -07:00
Li Yuanzheng	3b6206af49	Respect User-Specified User-Agent in WebBaseLoader (#4579 ) # Respect User-Specified User-Agent in WebBaseLoader This pull request modifies the `WebBaseLoader` class initializer from the `langchain.document_loaders.web_base` module to preserve any User-Agent specified by the user in the `header_template` parameter. Previously, even if a User-Agent was specified in `header_template`, it would always be overridden by a random User-Agent generated by the `fake_useragent` library. With this change, if a User-Agent is specified in `header_template`, it will be used. Only in the case where no User-Agent is specified will a random User-Agent be generated and used. This provides additional flexibility when using the `WebBaseLoader` class, allowing users to specify their own User-Agent if they have a specific need or preference, while still providing a reasonable default for cases where no User-Agent is specified. This change has no impact on existing users who do not specify a User-Agent, as the behavior in this case remains the same. However, for users who do specify a User-Agent, their choice will now be respected and used for all subsequent requests made using the `WebBaseLoader` class. Fixes #4167 ## Before submitting ============================= test session starts ============================== collecting ... collected 1 item test_web_base.py::TestWebBaseLoader::test_respect_user_specified_user_agent ============================== 1 passed in 3.64s =============================== PASSED [100%] ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-05-14 23:09:27 -04:00
Ashish Talati	372a5113ff	Update gallery.rst with chatpdf opensource (#4342 )	2023-05-14 19:43:16 -07:00
Samuli Rauatmaa	66828ad231	add the existing OpenWeatherMap tool to the public api (#4292 ) [OpenWeatherMapAPIWrapper](`f70e18a5b3/docs/modules/agents/tools/examples/openweathermap.ipynb`) works wonderfully, but the _tool_ itself can't be used in master branch. - added OpenWeatherMap tool to the public api, to be loadable with `load_tools` by using "openweathermap-api" tool name (that name is used in the existing [docs](`aff33d52c5/docs/modules/agents/tools/getting_started.md`), at the bottom of the page) - updated OpenWeatherMap tool's description to make the input format match what the API expects (e.g. `London,GB` instead of `'London,GB'`) - added [ecosystem documentation page for OpenWeatherMap](`f9c41594fe/docs/ecosystem/openweathermap.md`) - added tool usage example to [OpenWeatherMap's notebook](`f9c41594fe/docs/modules/agents/tools/examples/openweathermap.ipynb`) Let me know if there's something I missed or something needs to be updated! Or feel free to make edits yourself if that makes it easier for you 🙂	2023-05-14 18:50:45 -07:00
Harrison Chase	6f47ab17a4	Harrison/param notion db (#4689 ) Co-authored-by: Edward Park <ed.sh.park@gmail.com>	2023-05-14 18:26:25 -07:00
Harrison Chase	5d63fc65e1	add warning for combined memory (#4688 )	2023-05-14 18:26:16 -07:00
Harrison Chase	a48810fb21	dont have openai_api_version by default (#4687 ) an alternative to https://github.com/hwchase17/langchain/pull/4234/files	2023-05-14 18:26:08 -07:00
Harrison Chase	cdc20d1203	Harrison/json loader fix (#4686 ) Co-authored-by: Triet Le <112841660+triet-lq-holistics@users.noreply.github.com>	2023-05-14 18:25:59 -07:00
Harrison Chase	ed8207b2fb	Harrison/typing of return (#4685 ) Co-authored-by: OlajideOgun <37077640+OlajideOgun@users.noreply.github.com>	2023-05-14 18:25:50 -07:00
Harrison Chase	c48f1301ee	oops remove api key, dont worried i cycled it	2023-05-14 17:40:31 -07:00
Harrison Chase	57b2f3ffe6	add rebuff (#4637 )	2023-05-14 17:38:43 -07:00
Zander Chase	d85b04be7f	Add RELLM and JSONFormer experimental LLM decoding (#4185 ) [RELLM](https://github.com/r2d4/rellm) is a library that wraps local HuggingFace pipeline models for structured decoding. RELLM works by generating tokens one at a time. At each step, it masks tokens that don't conform to the provided partial regular expression. [JSONFormer](https://github.com/1rgs/jsonformer) is a bit different, where it sequentially adds the keys then decodes each value directly	2023-05-14 22:40:03 +00:00
Harrison Chase	54f5523197	bump version to 169 (#4675 )	2023-05-14 14:18:29 -07:00
Harrison Chase	243886be93	Harrison/virtual time (#4658 ) Co-authored-by: ifsheldon <39153080+ifsheldon@users.noreply.github.com> Co-authored-by: maple.liang <maple.liang@gempoll.com>	2023-05-14 10:29:17 -07:00
Harrison Chase	f2f2aced6d	allow partials in from_template (#4638 )	2023-05-13 21:47:20 -07:00
Harrison Chase	fbfa49f2c1	agent serialization (#4642 )	2023-05-13 21:47:10 -07:00
Harrison Chase	ef49c659f6	add embedding router (#4644 )	2023-05-13 21:47:01 -07:00
Harrison Chase	5020094e3b	Harrison/azure content filter (#4645 ) Co-authored-by: Rob Kopel <R0bk@users.noreply.github.com>	2023-05-13 21:46:51 -07:00
Harrison Chase	f5e2f70115	Harrison/json new line (#4646 ) Co-authored-by: David Chen <davidchen@gliacloud.com>	2023-05-13 21:46:33 -07:00
Harrison Chase	87d8d221fb	Harrison/headers for openai (#4648 ) Co-authored-by: aakash.shah <aakash.shah@quintiles.com>	2023-05-13 21:46:20 -07:00
Harrison Chase	c09bb00959	Harrison/summary memory history (#4649 ) Co-authored-by: engkheng <60956360+outday29@users.noreply.github.com>	2023-05-13 21:46:11 -07:00
Harrison Chase	44ae673388	Harrison/multithreading directory loader (#4650 ) Co-authored-by: PawelFaron <42373772+PawelFaron@users.noreply.github.com> Co-authored-by: Pawel Faron <ext-pawel.faron@vaisala.com>	2023-05-13 21:46:02 -07:00
Harrison Chase	b0c733e327	list of messages (#4651 )	2023-05-13 21:45:53 -07:00
Harrison Chase	873b0c7eb6	Harrison/structured chat mem (#4652 ) Co-authored-by: d 3 n 7 <29033313+d3n7@users.noreply.github.com>	2023-05-13 21:45:42 -07:00
Harrison Chase	9ba3a798c4	Harrison/from keys redis (#4653 ) Co-authored-by: Christoph Kahl <christoph@zauberware.com>	2023-05-13 21:45:24 -07:00
Harrison Chase	e781ff9256	Harrison/chatopenaibase path (#4656 ) Co-authored-by: Dave <dave@gray101.com>	2023-05-13 21:45:14 -07:00
Harrison Chase	279605b4d3	Harrison/metaphor search (#4657 ) Co-authored-by: Jeffrey Wang <jeffreyzhiyuanwang@gmail.com>	2023-05-13 21:45:05 -07:00
Harrison Chase	9aa9fe7021	Harrison/spark connect example (#4659 ) Co-authored-by: Mike Wang <62768671+skcoirz@users.noreply.github.com>	2023-05-13 21:44:54 -07:00
Prerit Das	2747ccbcf1	Allow custom base Zapier prompt (#4213 ) Currently, all Zapier tools are built using the pre-written base Zapier prompt. These small changes (that retain default behavior) will allow a user to create a Zapier tool using the ZapierNLARunTool while providing their own base prompt. Their prompt must contain input fields for zapier_description and params, checked and enforced in the tool's root validator. An example of when this may be useful: user has several, say 10, Zapier tools enabled. Currently, the long generic default Zapier base prompt is attached to every single tool, using an extreme number of tokens for no real added benefit (repeated). User prompts LLM on how to use Zapier tools once, then overrides the base prompt. Or: user has a few specific Zapier tools and wants to maximize their success rate. So, user writes prompts/descriptions for those tools specific to their use case, and provides those to the ZapierNLARunTool. A consideration - this is the simplest way to implement this I could think of... though ideally custom prompting would be possible at the Toolkit level as well. For now, this should be sufficient in solving the concerns outlined above.	2023-05-13 21:08:18 -07:00
Paresh Mathur	e2bc836571	Fix #4087 by setting the correct csv dialect (#4103 ) The error in #4087 was happening because of the use of csv.Dialect.* which is just an empty base class. we need to make a choice on what is our base dialect. I usually use excel so I put it as excel, if maintainers have other preferences do let me know. Open Questions: 1. What should be the default dialect? 2. Should we rework all tests to mock the open function rather than the csv.DictReader? 3. Should we make a separate input for `dialect` like we have for `encoding`? --------- Co-authored-by: = <=>	2023-05-13 20:35:01 -07:00
Leonid Ganeline	3ce78ef6c4	docs: document_loaders classification (#4069 ) Problem statement: the [document_loaders](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html#) section is too long and hard to comprehend. Proposal: group document_loaders by 3 classes: (see `Files changed` tab) UPDATE: I've completely reworked the document_loader classification. Now this PR changes only one file! FYI @eyurtsev @hwchase17	2023-05-13 19:17:32 -07:00
Zander Chase	928cdd57a4	[Breaking] Refactor Base Tracer(#4549 ) ### Refactor the BaseTracer - Remove the 'session' abstraction from the BaseTracer - Rename 'RunV2' object(s) to be called 'Run' objects (Rename previous Run objects to be RunV1 objects) - Ditto for sessions: TracerSessionV2 -> TracerSession - Remove now deprecated conversion from v1 run objects to v2 run objects in LangChainTracerV2 - Add conversion from v2 run objects to v1 run objects in V1 tracer	2023-05-13 17:23:56 +00:00
Harrison Chase	1e322ffc1c	change heading	2023-05-13 09:52:23 -07:00
Harrison Chase	86c1f090fd	bump version to 168 (#4632 )	2023-05-13 09:50:22 -07:00
Davis Chase	9ab7101182	WIP: FLARE-inspired chain (#4612 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-13 09:28:28 -07:00
Harrison Chase	daa3e6dedb	Harrison/prompt constructor methods (#4616 )	2023-05-13 09:23:51 -07:00
Harrison Chase	6265cbfb11	Harrison/standard llm interface (#4615 )	2023-05-13 09:05:31 -07:00
Harrison Chase	485ecc3580	option for csv agent to not include df in prompt (#4610 )	2023-05-12 21:55:22 -07:00
Harrison Chase	7d425cbf38	improve sql prompt (#4611 ) Co-authored-by: Taqi Jaffri <tjaffri@docugami.com> Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>	2023-05-12 21:55:03 -07:00
Hans van Dam	01531cb16d	remove quotes from sql database prompts (caused syntax error) (#4101 ) fixes a syntax error mentioned in #2027 and #3305 another PR to remedy is in #3385, but I believe that is not tacking the core problem. Also #2027 mentions a solution that works: add to the prompt: 'The SQL query should be outputted plainly, do not surround it in quotes or anything else.' To me it seems strange to first ask for: SQLQuery: "SQL Query to run" and then to tell the LLM not to put the quotes around it. Other templates (than the sql one) do not use quotes in their steps. This PR changes that to: SQLQuery: SQL Query to run	2023-05-12 20:03:37 -07:00
Zander Chase	0c6ed657ef	Convert Chain to a Chain Factory (#4605 ) ## Change Chain argument in client to accept a chain factory The `run_over_dataset` functionality seeks to treat each iteration of an example as an independent trial. Chains have memory, so it's easier to permit this type of behavior if we accept a factory method rather than the chain object directly. There's still corner cases / UX pains people will likely run into, like: - Caching may cause issues - if memory is persisted to a shared object (e.g., same redis queue) , this could impact what is retrieved - If we're running the async methods with concurrency using local models, if someone naively instantiates the chain and loads each time, it could lead to tons of disk I/O or OOM	2023-05-13 02:13:21 +00:00
Tim Asp	ed0d557ede	docs: fix pdf docs hierarchy and formatting (#4593 ) # Fix pdf loader docs page ![image](https://github.com/hwchase17/langchain/assets/707699/4a11f379-00ed-4f7a-9870-71f74e0cadc6) Using h1's messes with hierarchy, this fixes that, and moves the PyPDFium2 loader out of the middle of PDFMiner docs	2023-05-12 15:03:01 -04:00
Davis Chase	36f9e9a0ba	Skip flaky unit test (#4591 )	2023-05-12 11:54:40 -07:00
Eugene Yurtsev	08ed927c32	Turn on extended tests (#4588 ) # Turn on strict extended tests This PR turns on strict testing for extended tests.	2023-05-12 14:50:08 -04:00
Zander Chase	d96f6a106b	Add Steamship Image Generation Tool (#4580 ) Co-authored-by: Enias Cailliau <enias@steamship.com>	2023-05-12 10:35:01 -07:00
Davis Chase	739c297c94	Release 167 (#4589 )	2023-05-12 10:24:59 -07:00
Davis Chase	a4a9d1f403	Improve vespa interface (#4546 ) ![Screenshot 2023-05-11 at 7 50 31 PM](https://github.com/hwchase17/langchain/assets/130488702/bc8ab4bb-8006-44fc-ba07-df54e84ee2c1)	2023-05-12 10:11:26 -07:00
vinoyang	72f18fd08b	Provide get current date function dialect for other DBs (#4576 ) # Provide get current date function dialect for other DBs <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-12 13:04:28 -04:00
Neil Ruaro	3a2855945b	added documentation on retrieving a PG vectorstore (#4578 ) This PR adds in documentation on querying an existing vectorstore in PG Fixes 3191 (issue)	2023-05-12 13:04:06 -04:00
Andrea Pinto	1e5d25b93c	Improve error messages formatting in doc loaders (#4586 ) # Cosmetic in errors formatting Added appropriate spacing to the `ImportError` message in a bunch of document loaders to enhance trace readability (including Google Drive, Youtube, Confluence and others). This change ensures that the error messages are not displayed as a single line block, and that the `pip install xyz` commands can be copied to clipboard from terminal easily. ## Who can review? @eyurtsev	2023-05-12 13:03:39 -04:00
kYLe	570d057db4	Expose AnyScale LLM in langchain.llms (#4585 ) # Expose AnyScale LLM in langchain.llms Fixes # update init.py so we can from langchain.llms import Anyscale	2023-05-12 12:48:38 -04:00
Eugene Yurtsev	a5371a0fa2	Add pytest --only-extended and --only-core options (#4494 ) # Adds testing options to pytest This PR adds the following options: * `--only-core` will skip all extended tests, running all core tests. * `--only-extended` will skip all core tests. Forcing alll extended tests to be run. Running `py.test` without specifying either option will remain unaffected. Run all tests that can be run within the unit_tests direction. Extended tests will run if required packages are installed. ## Before submitting ## Who can review?	2023-05-12 11:35:22 -04:00
Harrison Chase	5ad151ed44	Add constitutional principles from paper (#4554 ) Add constitutional principles from https://arxiv.org/pdf/2212.08073.pdf --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-12 07:34:03 -07:00
Sai Vinay G	cf4c1394a2	feat: Added class to support huggingface text generation inference server (#4447 ) [Text Generation Inference](https://github.com/huggingface/text-generation-inference) is a Rust, Python and gRPC server for generating text using LLMs. This pull request add support for self hosted Text Generation Inference servers. feature: #4280 --------- Co-authored-by: Your Name <you@example.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-12 07:32:37 -07:00
Zander Chase	258c319855	Dereference Messages (#4557 ) Update how we parse the messages now that the server splits prompts / messages up	2023-05-12 00:12:43 -07:00
Leonid Ganeline	e17d0319d5	Add `arxiv` retriever (#4538 )	2023-05-11 22:48:38 -07:00
vinoyang	25cd6e060a	Enhance the prompt to make the LLM generate right date for real today (#4505 ) # Enhance the prompt to make the LLM generate right date for real today Fixes # (issue) Currently, if the user's question contains `today`, the clickhouse always points to an old date. This may be related to the fact that the GPT training data is relatively old.	2023-05-11 22:11:14 -04:00
vinoyang	e942db3e78	Add prestodb prompt (#4516 ) Add a PrestoDB prompt	2023-05-11 22:09:48 -04:00
SimFG	7bcf238a1a	Optimize the initialization method of GPTCache (#4522 ) Optimize the initialization method of GPTCache, so that users can use GPTCache more quickly.	2023-05-11 16:15:23 -07:00
Zander Chase	f4d3cf2dfb	Add Invocation Params (#4509 ) ### Add Invocation Params to Logged Run Adds an llm type to each chat model as well as an override of the dict() method to log the invocation parameters for each call --------- Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-05-11 15:34:06 -07:00
Ankush Gola	59853fc876	add invocation params as extra params in llm callbacks (#4506 ) # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoader Abstractions - @eyurtsev LLM/Chat Wrappers - @hwchase17 - @agola11 Tools / Toolkits - @vowelparrot -->	2023-05-11 15:33:52 -07:00
Ofey Chan	1c0ec26e40	[pyproject.toml] add `tiktoken` when install `langchain[openai]` (#4514 ) # Add `tiktoken` as dependency when installed as `langchain[openai]` Fixes #4513 (issue) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-11 12:21:06 -07:00
Zander Chase	4ee47926ca	Add on_chat_message_start (#4499 ) ### Add on_chat_message_start to callback manager and base tracer Goal: trace messages directly to permit reloading as chat messages (store in an integration-agnostic way) Add an `on_chat_message_start` method. Fall back to `on_llm_start()` for handlers that don't have it implemented. Does so in a non-backwards-compat breaking way (for now)	2023-05-11 11:06:39 -07:00
Yu Le	bbf76dbb52	fix typos in the prompts of LLMSummarizationCheckerChain (#4518 )	2023-05-11 10:32:34 -07:00
Jonas Nelle	97e7dc1502	Make BaseStringMessagePromptTemplate.from_template return type generic (#4523 ) # Make BaseStringMessagePromptTemplate.from_template return type generic I use mypy to check type on my code that uses langchain. Currently after I load a prompt and convert it to a system prompt I have to explicitly cast it which is quite ugly (and not necessary): ``` prompt_template = load_prompt("prompt.yaml") system_prompt_template = cast( SystemMessagePromptTemplate, SystemMessagePromptTemplate.from_template(prompt_template.template), ) ``` With this PR, the code would simply be: ``` prompt_template = load_prompt("prompt.yaml") system_prompt_template = SystemMessagePromptTemplate.from_template(prompt_template.template) ``` Given how much langchain uses inheritance, I think this type hinting could be applied in a bunch more places, e.g. load_prompt also return a `FewShotPromptTemplate` or a `PromptTemplate` but without typing the type checkers aren't able to infer that. Let me know if you agree and I can take a look at implementing that as well. @hwchase17 - project lead DataLoaders - @eyurtsev	2023-05-11 10:24:50 -07:00
kYLe	446b60d803	Fix a typo in langchain/docs/modules/models/llms/integrations/anyscale.ipynb (#4526 )	2023-05-11 09:03:04 -07:00

1701 changed files with 97454 additions and 41262 deletions

									
										37

.devcontainer/README.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,37 @@

				# Dev container

				This project includes a [dev container](https://containers.dev/), which lets you use a container as a full-featured dev environment.

				You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).

				## GitHub Codespaces

				[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)

				You may use the button above, or follow these steps to open this repo in a Codespace:

				1. Click the **Code** drop-down menu at the top of https://github.com/hwchase17/langchain.

				1. Click on the **Codespaces** tab.

				1. Click **Create codespace on master** .

				For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).

				## VS Code Dev Containers

				[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)

				If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.

				You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:

				1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).

				2. Open a locally cloned copy of the code:

				   - Clone this repository to your local filesystem.

				   - Press <kbd>F1</kbd> and select the **Dev Containers: Open Folder in Container...** command.

				   - Select the cloned copy of this folder, wait for the container to start, and try things out!

				You can learn more in the [Dev Containers documentation](https://code.visualstudio.com/docs/devcontainers/containers).

				## Tips and tricks

				* If you are working with the same repository folder in a container and Windows, you'll want consistent line endings (otherwise you may see hundreds of changes in the SCM view). The `.gitattributes` file in the root of this repo will disable line ending conversion and should prevent this. See [tips and tricks](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files) for more info.

				* If you'd like to review the contents of the image used in this dev container, you can check it out in the [devcontainers/images](https://github.com/devcontainers/images/tree/main/src/python) repo.

									
										45

.devcontainer/devcontainer.json
									
												View File
												
				@@ -1,24 +1,26 @@

				// For format details, see https://aka.ms/devcontainer.json. For config options, see the

				// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-dockerfile

				// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-docker-compose

				{

					"dockerComposeFile": "./docker-compose.yaml",

					"service": "langchain",

					"workspaceFolder": "/workspaces/langchain",

					// Name for the dev container

					"name": "langchain",

					"customizations": {

						"vscode": {

							"extensions": [   

								"ms-python.python"

							],

							"settings": {

								"python.defaultInterpreterPath": "/home/vscode/langchain-py-env/bin/python3.11"

							}

						}

					},

					// Features to add to the dev container. More info: https://containers.dev/features.

					"features": {},

					// Point to a Docker Compose file

					"dockerComposeFile": "./docker-compose.yaml",

					// Required when using Docker Compose. The name of the service to connect to once running

					"service": "langchain",

					// The optional 'workspaceFolder' property is the path VS Code should open by default when

					// connected. This is typically a file mount in .devcontainer/docker-compose.yml

					"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",

					// Prevent the container from shutting down

					"overrideCommand": true

					// Features to add to the dev container. More info: https://containers.dev/features

					// "features": {

					// 	"ghcr.io/devcontainers-contrib/features/poetry:2": {}

					// }

					// Use 'forwardPorts' to make a list of ports inside the container available locally.

					// "forwardPorts": [],

				@@ -26,8 +28,9 @@

					// Uncomment the next line to run commands after the container is created.

					// "postCreateCommand": "cat /etc/os-release",

					// Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.

					// "remoteUser": "devcontainer"

					"remoteUser": "vscode",

					"overrideCommand": true

					// Configure tool-specific properties.

					// "customizations": {},

					// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.

					// "remoteUser": "root"

				}

									
										7

.devcontainer/docker-compose.yaml
									
												View File
												
				@@ -2,10 +2,11 @@ version: '3'

				services:

				  langchain:

				    build:

				      dockerfile: .devcontainer/Dockerfile

				      context: ../ 

				      dockerfile: dev.Dockerfile

				      context: ..

				    volumes:

				      - ../:/workspaces/langchain

				   # Update this to wherever you want VS Code to mount the folder of your project

				      - ..:/workspaces:cached

				    networks:

				      - langchain-network 

				  #   environment:

3

.gitattributes vendored Normal file

View File

@@ -0,0 +1,3 @@
 * text=auto eol=lf
 *.{cmd,[cC][mM][dD]} text eol=crlf
 *.{bat,[bB][aA][tT]} text eol=crlf

									
										43

.github/CONTRIBUTING.md
									
										vendored
									
												View File
												
				@@ -59,6 +59,8 @@ we do not want these to get in the way of getting good code into the codebase.

				## 🚀 Quick Start

				> **Note:** You can run this repository locally (which is described below) or in a [development container](https://containers.dev/) (which is described in the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer)).

				This project uses [Poetry](https://python-poetry.org/) as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.

				❗Note: If you use `Conda` or `Pyenv` as your environment / package manager, avoid dependency conflicts by doing the following first:

				@@ -115,8 +117,37 @@ To get a report of current coverage, run the following:

				make coverage

				```

				### Working with Optional Dependencies

				Langchain relies heavily on optional dependencies to keep the Langchain package lightweight.

				If you're adding a new dependency to Langchain, assume that it will be an optional dependency, and

				that most users won't have it installed.

				Users that do not have the dependency installed should be able to **import** your code without

				any side effects (no warnings, no errors, no exceptions). 

				To introduce the dependency to the pyproject.toml file correctly, please do the following: 

				1. Add the dependency to the main group as an optional dependency

				  ```bash

				  poetry add --optional [package_name]

				  ```

				2. Open pyproject.toml and add the dependency to the `extended_testing` extra

				3. Relock the poetry file to update the extra.

				  ```bash

				  poetry lock --no-update

				  ```

				4. Add a unit test that the very least attempts to import the new code. Ideally the unit

				test makes use of lightweight fixtures to test the logic of the code.

				5. Please use the `@pytest.mark.requires(package_name)` decorator for any tests that require the dependency.

				### Testing

				See section about optional dependencies.

				#### Unit Tests

				Unit tests cover modular logic that does not require calls to outside APIs.

				To run unit tests:

				@@ -133,8 +164,20 @@ make docker_tests

				If you add new logic, please add a unit test.

				#### Integration Tests

				Integration tests cover logic that requires making calls to outside APIs (often integration with other services).

				**warning** Almost no tests should be integration tests. 

				  Tests that require making network connections make it difficult for other

				  developers to test the code.

				  Instead favor relying on `responses` library and/or mock.patch to mock

				  requests using small fixtures.

				To run integration tests:

				```bash

									
										2

.github/ISSUE_TEMPLATE/bug-report.yml
									
										vendored
									
												View File
												
				@@ -46,7 +46,7 @@ body:

				        - @agola11

				        Tools / Toolkits

				        - @vowelparrot

				        - ...

				      placeholder: "@Username ..."

									
										56

.github/PULL_REQUEST_TEMPLATE.md
									
										vendored
									
												View File
												
				@@ -1,46 +1,56 @@

				# Your PR Title (What it does)

				<!--

				Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution.

				Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution.

				Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change.

				After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost.

				Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle!

				-->

				<!-- Remove if not applicable -->

				Fixes # (issue)

				## Before submitting

				#### Before submitting

				<!-- If you're adding a new integration, include an integration test and an example notebook showing its use! -->

				<!-- If you're adding a new integration, please include:

				## Who can review?

				1. a test for the integration - favor unit tests that does not rely on network access.

				2. an example notebook showing its use

				Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

				See contribution guidelines for more information on how to write tests, lint

				etc:

				https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

				-->

				#### Who can review?

				Tag maintainers/contributors who might be interested:

				<!-- For a quicker response, figure out the right person to tag with @

				        @hwchase17 - project lead

				  @hwchase17 - project lead

				        Tracing / Callbacks

				        - @agola11

				  Tracing / Callbacks

				  - @agola11

				        Async

				        - @agola11

				  Async

				  - @agola11

				        DataLoaders

				        - @eyurtsev

				  DataLoaders

				  - @eyurtsev

				        Models

				        - @hwchase17

				        - @agola11

				  Models

				  - @hwchase17

				  - @agola11

				  Agents / Tools / Toolkits

				  - @hwchase17

				  VectorStores / Retrievers / Memory

				  - @dev2049

				        Agents / Tools / Toolkits

				        - @vowelparrot

				        VectorStores / Retrievers / Memory

				        - @dev2049

				 -->

									
										12

.github/actions/poetry_setup/action.yml
									
										vendored
									
												View File
												
				@@ -33,11 +33,13 @@ runs:

				  using: composite

				  steps:

				    - uses: actions/setup-python@v4

				      name: Setup python $${ inputs.python-version }}

				      with:

				        python-version: ${{ inputs.python-version }}

				    - uses: actions/cache@v3

				      id: cache-pip

				      name: Cache Pip ${{ inputs.python-version }}

				      env:

				        SEGMENT_DOWNLOAD_TIMEOUT_MIN: "15"

				      with:

				@@ -48,6 +50,16 @@ runs:

				    - run: pipx install poetry==${{ inputs.poetry-version }} --python python${{ inputs.python-version }}

				      shell: bash

				    - name: Check Poetry File

				      shell: bash

				      run: |

				        poetry check

				    - name: Check lock file

				      shell: bash

				      run: |

				        poetry lock --check

				    - uses: actions/cache@v3

				      id: cache-poetry

				      env:

									
										36

.github/workflows/linkcheck.yml
									
										vendored
									
												View File
											
				@@ -1,36 +0,0 @@

				name: linkcheck

				on:

				  push:

				    branches: [master]

				  pull_request:

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  build:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        python-version:

				          - "3.11"

				    steps:

				      - uses: actions/checkout@v3

				      - name: Install poetry

				        run: |

				          pipx install poetry==$POETRY_VERSION

				      - name: Set up Python ${{ matrix.python-version }}

				        uses: actions/setup-python@v4

				        with:

				          python-version: ${{ matrix.python-version }}

				          cache: poetry

				      - name: Install dependencies

				        run: |

				          poetry install --with docs

				      - name: Build the docs

				        run: |

				          make docs_build

				      - name: Analyzing the docs with linkcheck

				        run: |

				          make docs_linkcheck

									
										7

.github/workflows/test.yml
									
										vendored
									
												View File
												
				@@ -4,6 +4,7 @@ on:

				  push:

				    branches: [master]

				  pull_request:

				  workflow_dispatch:

				env:

				  POETRY_VERSION: "1.4.2"

				@@ -40,5 +41,9 @@ jobs:

				              fi

				      - name: Run ${{matrix.test_type}} tests

				        run: |

				          make test

				          if [ "${{ matrix.test_type }}" == "core" ]; then

				            make test

				          else

				            make extended_tests

				          fi

				        shell: bash

17

.gitignore vendored

View File

@@ -73,6 +73,7 @@ instance/
 # Sphinx documentation
 docs/_build/
 docs/docs/_build/
 # PyBuilder
 target/
@@ -149,4 +150,18 @@ wandb/
 # integration test artifacts
 data_map*
 \[('_type', 'fake'), ('stop', None)]
 \[('_type', 'fake'), ('stop', None)]
 # Replit files
 *replit*
 node_modules
 docs/.yarn/
 docs/node_modules/
 docs/.docusaurus/
 docs/.cache-loader/
 docs/_dist
 docs/api_reference/_build
 docs/docs_skeleton/build
 docs/docs_skeleton/node_modules
 docs/docs_skeleton/yarn.lock

4

.gitmodules vendored Normal file

View File

@@ -0,0 +1,4 @@
 [submodule "docs/_docs_skeleton"]
 	path = docs/_docs_skeleton
 	url = https://github.com/langchain-ai/langchain-shared-docs
 	branch = main

									
										4

.readthedocs.yaml
									
												View File
												
				@@ -12,7 +12,7 @@ build:

				# Build documentation in the docs/ directory with Sphinx

				sphinx:

				   configuration: docs/conf.py

				   configuration: docs/api_reference/conf.py

				# If using Sphinx, optionally build your docs in additional formats such as PDF

				# formats:

				@@ -23,4 +23,4 @@ python:

				   install:

				   - requirements: docs/requirements.txt

				   - method: pip

				     path: .

				     path: .

									
										16

Makefile
									
												View File
												
				@@ -1,4 +1,4 @@

				.PHONY: all clean format lint test tests test_watch integration_tests docker_tests help

				.PHONY: all clean format lint test tests test_watch integration_tests docker_tests help extended_tests

				all: help

				@@ -10,6 +10,9 @@ coverage:

				clean: docs_clean

				docs_compile:

					poetry run nbdoc_build --srcdir $(srcdir)

				docs_build:

					cd docs && poetry run make html

				@@ -35,10 +38,13 @@ lint lint_diff:

				TEST_FILE ?= tests/unit_tests/

				test:

					poetry run pytest $(TEST_FILE)

					poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)

				tests:

					poetry run pytest $(TEST_FILE)

				tests: 

					poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)

				extended_tests:

					poetry run pytest --disable-socket --allow-unix-socket --only-extended tests/unit_tests

				test_watch:

					poetry run ptw --now . -- tests/unit_tests

				@@ -59,7 +65,9 @@ help:

					@echo 'format                       - run code formatters'

					@echo 'lint                         - run linters'

					@echo 'test                         - run unit tests'

					@echo 'tests                        - run unit tests'

					@echo 'test TEST_FILE=<test_file>   - run all tests in file'

					@echo 'extended_tests               - run only extended unit tests'

					@echo 'test_watch                   - run unit tests in watch mode'

					@echo 'integration_tests            - run integration tests'

					@echo 'docker_tests                 - run unit tests in docker'

									
										14

README.md
									
												View File
												
				@@ -2,9 +2,9 @@

				⚡ Building applications with LLMs through composability ⚡

				[![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)

				[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml)

				[![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml)

				[![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml)

				[![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)

				[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

				[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)

				@@ -12,6 +12,8 @@

				[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)

				[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)

				[![GitHub star chart](https://img.shields.io/github/stars/hwchase17/langchain?style=social)](https://star-history.com/#hwchase17/langchain)

				[![Dependency Status](https://img.shields.io/librariesio/github/hwchase17/langchain)](https://libraries.io/github/hwchase17/langchain)

				[![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues)

				Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).

				@@ -33,22 +35,22 @@ This library aims to assist in the development of those types of applications. C

				**❓ Question Answering over specific documents**

				- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html)

				- [Documentation](https://python.langchain.com/docs/use_cases/question_answering/)

				- End-to-end Example: [Question Answering over Notion Database](https://github.com/hwchase17/notion-qa)

				**💬 Chatbots**

				- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/chatbots.html)

				- [Documentation](https://python.langchain.com/docs/use_cases/chatbots/)

				- End-to-end Example: [Chat-LangChain](https://github.com/hwchase17/chat-langchain)

				**🤖 Agents**

				- [Documentation](https://langchain.readthedocs.io/en/latest/modules/agents.html)

				- [Documentation](https://python.langchain.com/docs/modules/agents/)

				- End-to-end Example: [GPT+WolframAlpha](https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain)

				## 📖 Documentation

				Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:

				Please see [here](https://python.langchain.com) for full documentation on:

				- Getting started (installation, setting up the environment, simple examples)

				- How-To examples (demos, integrations, helper functions)

				@@ -84,7 +86,7 @@ Memory refers to persisting state between calls of a chain/agent. LangChain prov

				[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.

				For more information on these concepts, please see our [full documentation](https://langchain.readthedocs.io/en/latest/).

				For more information on these concepts, please see our [full documentation](https://python.langchain.com).

				## 💁 Contributing

									
										11

.devcontainer/Dockerfile → dev.Dockerfile
									
												View File
												
				@@ -1,15 +1,15 @@

				# This is a Dockerfile for Developer Container

				# This is a Dockerfile for the Development Container

				# Use the Python base image

				ARG VARIANT="3.11-bullseye"

				FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT} AS langchain-dev-base

				FROM mcr.microsoft.com/devcontainers/python:0-${VARIANT} AS langchain-dev-base

				USER vscode

				# Define the version of Poetry to install (default is 1.4.2)

				# Define the directory of python virtual environment

				ARG PYTHON_VIRTUALENV_HOME=/home/vscode/langchain-py-env \

				    POETRY_VERSION=1.4.2 

				    POETRY_VERSION=1.3.2

				ENV POETRY_VIRTUALENVS_IN_PROJECT=false \

				    POETRY_NO_INTERACTION=true 

				@@ -35,8 +35,7 @@ FROM langchain-dev-base AS langchain-dev-dependencies

				ARG PYTHON_VIRTUALENV_HOME

				# Copy only the dependency files for installation

				COPY pyproject.toml poetry.lock poetry.toml ./

				COPY pyproject.toml poetry.toml ./

				# Install the Poetry dependencies (this layer will be cached as long as the dependencies don't change)

				RUN poetry install --no-interaction --no-ansi --with dev,test,docs

				RUN poetry install --no-interaction --no-ansi --with dev,test,docs

									
										12

docs/.local_build.sh
									
										Executable file
									
												View File
												
				@@ -0,0 +1,12 @@

				mkdir _dist

				cp -r {docs_skeleton,snippets} _dist

				mkdir -p _dist/docs_skeleton/static/api_reference

				cd api_reference

				poetry run make html

				cp -r _build/* ../_dist/docs_skeleton/static/api_reference

				cd ..

				cp -r extras/* _dist/docs_skeleton/docs

				cd _dist/docs_skeleton

				poetry run nbdoc_build

				yarn install

				yarn start

0

docs/Makefile → docs/api_reference/Makefile

View File

									
										2

docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css
									
												View File
												
				@@ -13,5 +13,5 @@ pre {

				}

				#my-component-root *, #headlessui-portal-root * {

				  z-index: 1000000000000;

				  z-index: 10000;

				}

									
										57

docs/api_reference/_static/js/mendablesearch.js
									
										Normal file
									
												View File
												
				@@ -0,0 +1,57 @@

				document.addEventListener('DOMContentLoaded', () => {

				  // Load the external dependencies

				  function loadScript(src, onLoadCallback) {

				    const script = document.createElement('script');

				    script.src = src;

				    script.onload = onLoadCallback;

				    document.head.appendChild(script);

				  }

				  function createRootElement() {

				    const rootElement = document.createElement('div');

				    rootElement.id = 'my-component-root';

				    document.body.appendChild(rootElement);

				    return rootElement;

				  }

				  function initializeMendable() {

				    const rootElement = createRootElement();

				    const { MendableFloatingButton } = Mendable;

				    const iconSpan1 = React.createElement('span', {

				    }, '🦜');

				    const iconSpan2 = React.createElement('span', {

				    }, '🔗');

				    const icon = React.createElement('p', {

				      style: { color: '#ffffff', fontSize: '22px',width: '48px', height: '48px', margin: '0px', padding: '0px', display: 'flex', alignItems: 'center', justifyContent: 'center', textAlign: 'center' },

				    }, [iconSpan1, iconSpan2]);

				    const mendableFloatingButton = React.createElement(

				      MendableFloatingButton,

				      {

				        style: { darkMode: false, accentColor: '#010810' },

				        floatingButtonStyle: { color: '#ffffff', backgroundColor: '#010810' },

				        anon_key: '82842b36-3ea6-49b2-9fb8-52cfc4bde6bf', // Mendable Search Public ANON key, ok to be public

				        cmdShortcutKey:'j',

				        messageSettings: {

				          openSourcesInNewTab: false,

				          prettySources: true // Prettify the sources displayed now

				        },

				        icon: icon,

				      }

				    );

				    ReactDOM.render(mendableFloatingButton, rootElement);

				  }

				  loadScript('https://unpkg.com/react@17/umd/react.production.min.js', () => {

				    loadScript('https://unpkg.com/react-dom@17/umd/react-dom.production.min.js', () => {

				      loadScript('https://unpkg.com/@mendable/search@0.0.102/dist/umd/mendable.min.js', initializeMendable);

				    });

				  });

				});

0

docs/reference/agents.rst → docs/api_reference/agents.rst

View File

									
										25

docs/conf.py → docs/api_reference/conf.py
									
												View File
												
				@@ -17,7 +17,7 @@

				import toml

				with open("../pyproject.toml") as f:

				with open("../../pyproject.toml") as f:

				    data = toml.load(f)

				# -- Project information -----------------------------------------------------

				@@ -49,19 +49,31 @@ extensions = [

				    "sphinx_copybutton",

				    "sphinx_panels",

				    "IPython.sphinxext.ipython_console_highlighting",

				    "sphinx_tabs.tabs",

				]

				source_suffix = [".ipynb", ".html", ".md", ".rst"]

				source_suffix = [".rst"]

				autodoc_pydantic_model_show_json = False

				autodoc_pydantic_field_list_validators = False

				autodoc_pydantic_config_members = False

				autodoc_pydantic_model_show_config_summary = False

				autodoc_pydantic_model_show_validator_members = False

				autodoc_pydantic_model_show_validator_summary = False

				autodoc_pydantic_model_show_field_summary = False

				autodoc_pydantic_model_members = False

				autodoc_pydantic_model_undoc_members = False

				# autodoc_typehints = "signature"

				# autodoc_typehints = "description"

				autodoc_pydantic_model_hide_paramlist = False

				autodoc_pydantic_model_signature_prefix = "class"

				autodoc_pydantic_field_signature_prefix = "attribute"

				autodoc_pydantic_model_summary_list_order = "bysource"

				autodoc_member_order = "bysource"

				autodoc_default_options = {

				    "members": True,

				    "show-inheritance": True,

				    "undoc_members": True,

				    "inherited_members": "BaseModel",

				}

				autodoc_typehints = "description"

				# Add any paths that contain templates here, relative to this directory.

				templates_path = ["_templates"]

				@@ -77,12 +89,13 @@ exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

				# The theme to use for HTML and HTML Help pages.  See the documentation for

				# a list of builtin themes.

				#

				html_theme = "sphinx_book_theme"

				html_theme = "sphinx_rtd_theme"

				html_theme_options = {

				    "path_to_docs": "docs",

				    "repository_url": "https://github.com/hwchase17/langchain",

				    "use_repository_button": True,

				    # "style_nav_header_background": "white"

				}

				html_context = {

				@@ -90,7 +103,7 @@ html_context = {

				    "github_user": "hwchase17",  # Username

				    "github_repo": "langchain",  # Repo name

				    "github_version": "master",  # Version

				    "conf_py_path": "/docs/",  # Path in the checkout to the docs root

				    "conf_py_path": "/docs/api_reference",  # Path in the checkout to the docs root

				}

				# Add any paths that contain custom static files (such as style sheets) here,

									
										9

docs/reference/indexes.rst → docs/api_reference/data_connection.rst
									
												View File
												
				@@ -1,16 +1,13 @@

				Indexes

				Data connection

				==============

				Indexes refer to ways to structure documents so that LLMs can best interact with them.

				LangChain has a number of modules that help you load, structure, store, and retrieve documents.

				.. toctree::

				   :maxdepth: 1

				   :glob:

				   modules/docstore

				   modules/text_splitter

				   modules/document_loaders

				   modules/document_transformers

				   modules/embeddings

				   modules/vectorstores

				   modules/retrievers

				   modules/document_compressors

				   modules/document_transformers

									
										29

docs/api_reference/index.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,29 @@

				API Reference

				==========================

				| Full documentation on all methods, classes, and APIs in the LangChain Python package.

				.. toctree::

				   :maxdepth: 1

				   :caption: Abstractions

				   ./modules/base_classes.rst

				.. toctree::

				   :maxdepth: 1

				   :caption: Core

				   ./model_io.rst

				   ./data_connection.rst

				   ./modules/chains.rst

				   ./agents.rst

				   ./modules/memory.rst

				   ./modules/callbacks.rst

				.. toctree::

				   :maxdepth: 1

				   :caption: Additional

				   ./modules/utilities.rst

				   ./modules/experimental.rst

0

docs/make.bat → docs/api_reference/make.bat

View File

									
										12

docs/api_reference/model_io.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,12 @@

				Model I/O

				==============

				LangChain provides interfaces and integrations for working with language models.

				.. toctree::

				   :maxdepth: 1

				   :glob:

				   ./prompts.rst

				   ./models.rst

				   ./modules/output_parsers.rst

1

docs/reference/models.rst → docs/api_reference/models.rst

View File

@@ -9,4 +9,3 @@ LangChain provides interfaces and integrations for a number of different types o
    modules/llms
    modules/chat_models
    modules/embeddings

0

docs/reference/modules/agent_toolkits.rst → docs/api_reference/modules/agent_toolkits.rst

View File

0

docs/reference/modules/agents.rst → docs/api_reference/modules/agents.rst

View File

									
										5

docs/api_reference/modules/base_classes.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,5 @@

				Base classes

				========================

				.. automodule:: langchain.schema

				   :inherited-members:

									
										7

docs/api_reference/modules/callbacks.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,7 @@

				Callbacks

				=======================

				.. automodule:: langchain.callbacks

				   :members:

				   :undoc-members:

									
										1

docs/reference/modules/chains.rst → docs/api_reference/modules/chains.rst
									
												View File
												
				@@ -4,4 +4,5 @@ Chains

				.. automodule:: langchain.chains

				   :members:

				   :undoc-members:

				   :inherited-members: BaseModel

0

docs/reference/modules/chat_models.rst → docs/api_reference/modules/chat_models.rst

View File

0

docs/reference/modules/document_loaders.rst → docs/api_reference/modules/document_loaders.rst

View File

									
										6

docs/reference/modules/document_transformers.rst → docs/api_reference/modules/document_transformers.rst
									
												View File
												
				@@ -5,3 +5,9 @@ Document Transformers

				   :members:

				   :undoc-members:

				Text Splitters

				------------------------------

				.. automodule:: langchain.text_splitter

				   :members:

				   :undoc-members:

0

docs/reference/modules/embeddings.rst → docs/api_reference/modules/embeddings.rst

View File

0

docs/reference/modules/example_selector.rst → docs/api_reference/modules/example_selector.rst

View File

10

docs/reference/modules/experimental.rst → docs/api_reference/modules/experimental.rst

View File

@@ -1,10 +1,10 @@
 ==========
 Experimental Modules
 ==========
 ====================
 Experimental
 ====================
 This module contains experimental modules and reproductions of existing work using LangChain primitives.
 Autonomous Agents
 Autonomous agents
 ------------------
 Here, we document the BabyAGI and AutoGPT classes from the langchain.experimental module.
@@ -16,7 +16,7 @@ Here, we document the BabyAGI and AutoGPT classes from the langchain.experimenta
    :members:
 Generative Agents
 Generative agents
 ------------------
 Here, we document the GenerativeAgent and GenerativeAgentMemory classes from the langchain.experimental module.

0

docs/reference/modules/llms.rst → docs/api_reference/modules/llms.rst

View File

0

docs/reference/modules/memory.rst → docs/api_reference/modules/memory.rst

View File

0

docs/reference/modules/output_parsers.rst → docs/api_reference/modules/output_parsers.rst

View File

									
										3

docs/reference/modules/prompts.rst → docs/api_reference/modules/prompts.rst
									
												View File
												
				@@ -1,5 +1,6 @@

				PromptTemplates

				Prompt Templates

				========================

				.. automodule:: langchain.prompts

				   :members:

				   :undoc-members:

									
										14

docs/api_reference/modules/retrievers.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,14 @@

				Retrievers

				===============================

				.. automodule:: langchain.retrievers

				   :members:

				   :undoc-members:

				Document compressors

				-------------------------------

				.. automodule:: langchain.retrievers.document_compressors

				   :members:

				   :undoc-members:

0

docs/reference/modules/tools.rst → docs/api_reference/modules/tools.rst

View File

0

docs/reference/modules/utilities.rst → docs/api_reference/modules/utilities.rst

View File

0

docs/reference/modules/vectorstores.rst → docs/api_reference/modules/vectorstores.rst

View File

1

docs/reference/prompts.rst → docs/api_reference/prompts.rst

View File

@@ -9,4 +9,3 @@ The reference guides here all relate to objects for working with Prompts.
    modules/prompts
    modules/example_selector
    modules/output_parsers

7

docs/docs_skeleton/.gitignore vendored Normal file

View File

@@ -0,0 +1,7 @@
 .yarn/
 node_modules/
 .docusaurus
 .cache-loader
 docs/api

									
										49

docs/docs_skeleton/README.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,49 @@

				# Website

				This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.

				### Installation

				```

				$ yarn

				```

				### Local Development

				```

				$ yarn start

				```

				This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

				### Build

				```

				$ yarn build

				```

				This command generates static content into the `build` directory and can be served using any static contents hosting service.

				### Deployment

				Using SSH:

				```

				$ USE_SSH=true yarn deploy

				```

				Not using SSH:

				```

				$ GIT_USER=<Your GitHub username> yarn deploy

				```

				If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.

				### Continuous Integration

				Some common defaults for linting/formatting have been set for you. If you integrate your project with an open source Continuous Integration system (e.g. Travis CI, CircleCI), you may check for issues using the following command.

				```

				$ yarn ci

				```

									
										12

docs/docs_skeleton/babel.config.js
									
										Normal file
									
												View File
												
				@@ -0,0 +1,12 @@

				/**

				 * Copyright (c) Meta Platforms, Inc. and affiliates.

				 *

				 * This source code is licensed under the MIT license found in the

				 * LICENSE file in the root directory of this source tree.

				 *

				 * @format

				 */

				module.exports = {

				  presets: [require.resolve("@docusaurus/core/lib/babel/preset")],

				};

									
										76

docs/docs_skeleton/code-block-loader.js
									
										Normal file
									
												View File
												
				@@ -0,0 +1,76 @@

				/* eslint-disable prefer-template */

				/* eslint-disable no-param-reassign */

				// eslint-disable-next-line import/no-extraneous-dependencies

				const babel = require("@babel/core");

				const path = require("path");

				const fs = require("fs");

				/**

				 *

				 * @param {string|Buffer} content Content of the resource file

				 * @param {object} [map] SourceMap data consumable by https://github.com/mozilla/source-map

				 * @param {any} [meta] Meta data, could be anything

				 */

				async function webpackLoader(content, map, meta) {

				  const cb = this.async();

				  if (!this.resourcePath.endsWith(".ts")) {

				    cb(null, JSON.stringify({ content, imports: [] }), map, meta);

				    return;

				  }

				  try {

				    const result = await babel.parseAsync(content, {

				      sourceType: "module",

				      filename: this.resourcePath,

				    });

				    const imports = [];

				    result.program.body.forEach((node) => {

				      if (node.type === "ImportDeclaration") {

				        const source = node.source.value;

				        if (!source.startsWith("langchain")) {

				          return;

				        }

				        node.specifiers.forEach((specifier) => {

				          if (specifier.type === "ImportSpecifier") {

				            const local = specifier.local.name;

				            const imported = specifier.imported.name;

				            imports.push({ local, imported, source });

				          } else {

				            throw new Error("Unsupported import type");

				          }

				        });

				      }

				    });

				    imports.forEach((imp) => {

				      const { imported, source } = imp;

				      const moduleName = source.split("/").slice(1).join("_");

				      const docsPath = path.resolve(__dirname, "docs", "api", moduleName);

				      const available = fs.readdirSync(docsPath, { withFileTypes: true });

				      const found = available.find(

				        (dirent) =>

				          dirent.isDirectory() &&

				          fs.existsSync(path.resolve(docsPath, dirent.name, imported + ".md"))

				      );

				      if (found) {

				        imp.docs =

				          "/" + path.join("docs", "api", moduleName, found.name, imported);

				      } else {

				        throw new Error(

				          `Could not find docs for ${source}.${imported} in docs/api/`

				        );

				      }

				    });

				    cb(null, JSON.stringify({ content, imports }), map, meta);

				  } catch (err) {

				    cb(err);

				  }

				}

				module.exports = webpackLoader;

0

docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored

View File

Before

Width: | Height: | Size: 559 KiB

After

Width: | Height: | Size: 559 KiB

0

docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/DataberryDashboard.png vendored

View File

Before

Width: | Height: | Size: 157 KiB

After

Width: | Height: | Size: 157 KiB

0

docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored

View File

Before

Width: | Height: | Size: 235 KiB

After

Width: | Height: | Size: 235 KiB

0

docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored

View File

Before

Width: | Height: | Size: 148 KiB

After

Width: | Height: | Size: 148 KiB

0

docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored

View File

Before

Width: | Height: | Size: 3.5 MiB

After

Width: | Height: | Size: 3.5 MiB

BIN
docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 18 KiB

BIN
docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 85 KiB

BIN
docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 16 KiB

									
										21

docs/docs_skeleton/docs/_static/css/custom.css
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,21 @@

				pre {

				  white-space: break-spaces;

				}

				@media (min-width: 1200px) {

				  .container,

				  .container-lg,

				  .container-md,

				  .container-sm,

				  .container-xl {

				    max-width: 2560px !important;

				  }

				}

				#my-component-root *, #headlessui-portal-root * {

				  z-index: 10000;

				}

				.content-container p {

				    margin: revert;

				}

BIN
docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 542 B

BIN
docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 1.2 KiB

BIN
docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 15 KiB

									
										6

docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js
									
										vendored
									
												View File
												
				@@ -30,10 +30,7 @@ document.addEventListener('DOMContentLoaded', () => {

				    const icon = React.createElement('p', {

				      style: { color: '#ffffff', fontSize: '22px',width: '48px', height: '48px', margin: '0px', padding: '0px', display: 'flex', alignItems: 'center', justifyContent: 'center', textAlign: 'center' },

				    }, [iconSpan1, iconSpan2]);

				    const mendableFloatingButton = React.createElement(

				      MendableFloatingButton,

				      {

				@@ -42,6 +39,7 @@ document.addEventListener('DOMContentLoaded', () => {

				        anon_key: '82842b36-3ea6-49b2-9fb8-52cfc4bde6bf', // Mendable Search Public ANON key, ok to be public

				        messageSettings: {

				          openSourcesInNewTab: false,

				          prettySources: true // Prettify the sources displayed now

				        },

				        icon: icon,

				      }

				@@ -52,7 +50,7 @@ document.addEventListener('DOMContentLoaded', () => {

				  loadScript('https://unpkg.com/react@17/umd/react.production.min.js', () => {

				    loadScript('https://unpkg.com/react-dom@17/umd/react-dom.production.min.js', () => {

				      loadScript('https://unpkg.com/@mendable/search@0.0.93/dist/umd/mendable.min.js', initializeMendable);

				      loadScript('https://unpkg.com/@mendable/search@0.0.102/dist/umd/mendable.min.js', initializeMendable);

				    });

				  });

				});

BIN
docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 103 KiB

BIN
docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 136 KiB

BIN
docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 34 KiB

8

docs/docs_skeleton/docs/ecosystem/integrations/index.mdx Normal file

View File

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 0
 ---
 # Integrations
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

5

docs/docs_skeleton/docs/get_started/installation.mdx Normal file

View File

@@ -0,0 +1,5 @@
 # Installation
 import Installation from "@snippets/get_started/installation.mdx"
 <Installation/>

65

docs/docs_skeleton/docs/get_started/introduction.mdx Normal file

View File

@@ -0,0 +1,65 @@
 ---
 sidebar_position: 0
 ---
 # Introduction
 **LangChain** is a framework for developing applications powered by language models. It enables applications that are:
 - **Data-aware**: connect a language model to other sources of data
 - **Agentic**: allow a language model to interact with its environment
 The main value props of LangChain are:
 . **Components**: abstractions for working with language models, along with a collection of implementations for each abstraction. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
 . **Off-the-shelf chains**: a structured assembly of components for accomplishing specific higher-level tasks
 Off-the-shelf chains make it easy to get started. For more complex applications and nuanced use-cases, components make it easy to customize existing chains or build new ones.
 ## Get started
 [Here’s](/docs/get_started/installation.html) how to install LangChain, set up your environment, and start building.
 We recommend following our [Quickstart](/docs/get_started/quickstart.html) guide to familiarize yourself with the framework by building your first LangChain application.
 _**Note**: These docs are for the LangChain [Python package](https://github.com/hwchase17/langchain). For documentation on [LangChain.js](https://github.com/hwchase17/langchainjs), the JS/TS version, [head here](https://js.langchain.com/docs)._
 ## Modules
 LangChain provides standard, extendable interfaces and external integrations for the following modules, listed from least to most complex:
 #### [Model I/O](/docs/modules/model_io/)
 Interface with language models
 #### [Data connection](/docs/modules/data_connection/)
 Interface with application-specific data
 #### [Chains](/docs/modules/chains/)
 Construct sequences of calls
 #### [Agents](/docs/modules/agents/)
 Let chains choose which tools to use given high-level directives
 #### [Memory](/docs/modules/memory/)
 Persist application state between runs of a chain
 #### [Callbacks](/docs/modules/callbacks/)
 Log and stream intermediate steps of any chain
 ## Examples, ecosystem, and resources
 ### [Use cases](/docs/use_cases/)
 Walkthroughs and best-practices for common end-to-end use cases, like:
 - [Chatbots](/docs/use_cases/chatbots/)
 - [Answering questions using sources](/docs/use_cases/question_answering/)
 - [Analyzing structured data](/docs/use_cases/tabular.html)
 - and much more...
 ### [Guides](/docs/guides/)
 Learn best practices for developing with LangChain.
 ### [Ecosystem](/docs/ecosystem/)
 LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/ecosystem/integrations/) and [dependent repos](/docs/ecosystem/dependents.html).
 ### [Additional resources](/docs/additional_resources/)
 Our community is full of prolific developers, creative builders, and fantastic teachers. Check out [YouTube tutorials](/docs/ecosystem/youtube.html) for great tutorials from folks in the community, and [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).
 <h3><span style={{color:"#2e8555"}}> Support </span></h3>
 Join us on [GitHub](https://github.com/hwchase17/langchain) or [Discord](https://discord.gg/6adMQxSpJS) to ask questions, share feedback, meet other developers building with LangChain, and dream about the future of LLM’s.
 ## API reference
 Head to the [reference](https://api.python.langchain.com) section for full documentation of all classes and methods in the LangChain Python package.

158

docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file

View File

@@ -0,0 +1,158 @@
 # Quickstart
 ## Installation
 To install LangChain run:
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import Install from "@snippets/get_started/quickstart/installation.mdx"
 <Install/>
 For more details, see our [Installation guide](/docs/get_started/installation.html).
 ## Environment setup
 Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs.
 import OpenAISetup from "@snippets/get_started/quickstart/openai_setup.mdx"
 <OpenAISetup/>
 ## Building an application
 Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications. Modules can be used as stand-alones in simple applications and they can be combined for more complex use cases.
 ## LLMs
 #### Get predictions from a language model
 The basic building block of LangChain is the LLM, which takes in text and generates more text.
 As an example, suppose we're building an application that generates a company name based on a company description. In order to do this, we need to initialize an OpenAI model wrapper. In this case, since we want the outputs to be MORE random, we'll initialize our model with a HIGH temperature.
 import LLM from "@snippets/get_started/quickstart/llm.mdx"
 <LLM/>
 ## Chat models
 Chat models are a variation on language models. While chat models use language models under the hood, the interface they expose is a bit different: rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
 You can get chat completions by passing one or more messages to the chat model. The response will be a message. The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage` -- `ChatMessage` takes in an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`.
 import ChatModel from "@snippets/get_started/quickstart/chat_model.mdx"
 <ChatModel/>
 ## Prompt templates
 Most LLM applications do not pass user input directly into to an LLM. Usually they will add the user input to a larger piece of text, called a prompt template, that provides additional context on the specific task at hand.
 In the previous example, the text we passed to the model contained instructions to generate a company name. For our application, it'd be great if the user only had to provide the description of a company/product, without having to worry about giving the model instructions.
 import PromptTemplateLLM from "@snippets/get_started/quickstart/prompt_templates_llms.mdx"
 import PromptTemplateChatModel from "@snippets/get_started/quickstart/prompt_templates_chat_models.mdx"
 <Tabs>
     <TabItem value="llms" label="LLMs" default>
 With PromptTemplates this is easy! In this case our template would be very simple:
 <PromptTemplateLLM/>
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 Similar to LLMs, you can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplate`s. You can use `ChatPromptTemplate`'s `format_messages` method to generate the formatted messages.
 Because this is generating a list of messages, it is slightly more complex than the normal prompt template which is generating only a string. Please see the detailed guides on prompts to understand more options available to you here.
 <PromptTemplateChatModel/>
     </TabItem>
 </Tabs>
 ## Chains
 Now that we've got a model and a prompt template, we'll want to combine the two. Chains give us a way to link (or chain) together multiple primitives, like models, prompts, and other chains.
 import ChainLLM from "@snippets/get_started/quickstart/chains_llms.mdx"
 import ChainChatModel from "@snippets/get_started/quickstart/chains_chat_models.mdx"
 <Tabs>
 <TabItem value="llms" label="LLMs" default>
 The simplest and most common type of chain is an LLMChain, which passes an input first to a PromptTemplate and then to an LLM. We can construct an LLM chain from our existing model and prompt template.
 <ChainLLM/>
 There we go, our first chain! Understanding how this simple chain works will set you up well for working with more complex chains.
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 The `LLMChain` can be used with chat models as well:
 <ChainChatModel/>
 </TabItem>
 </Tabs>
 ## Agents
 import AgentLLM from "@snippets/get_started/quickstart/agents_llms.mdx"
 import AgentChatModel from "@snippets/get_started/quickstart/agents_chat_models.mdx"
 Our first chain ran a pre-determined sequence of steps. To handle complex workflows, we need to be able to dynamically choose actions based on inputs.
 Agents do just this: they use a language model to determine which actions to take and in what order. Agents are given access to tools, and they repeatedly choose a tool, run the tool, and observe the output until they come up with a final answer.
 To load an agent, you need to choose a(n):
 - LLM/Chat model: The language model powering the agent.
 - Tool(s): A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. For a list of predefined tools and their specifications, see the [Tools documentation](/docs/modules/agents/tools/).
 - Agent name: A string that references a supported agent class. An agent class is largely parameterized by the prompt the language model uses to determine which action to take. Because this notebook focuses on the simplest, highest level API, this only covers using the standard supported agents. If you want to implement a custom agent, see [here](/docs/modules/agents/how_to/custom_agent.html). For a list of supported agents and their specifications, see [here](/docs/modules/agents/agent_types/).
 For this example, we'll be using SerpAPI to query a search engine.
 You'll need to install the SerpAPI Python package:
 ```bash
 pip install google-search-results
 ```
 And set the `SERPAPI_API_KEY` environment variable.
 <Tabs>
 <TabItem value="llms" label="LLMs" default>
 <AgentLLM/>
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 Agents can also be used with chat models, you can initialize one using `AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION` as the agent type.
 <AgentChatModel/>
 </TabItem>
 </Tabs>
 ## Memory
 The chains and agents we've looked at so far have been stateless, but for many applications it's necessary to reference past interactions. This is clearly the case with a chatbot for example, where you want it to understand new messages in the context of past messages.
 The Memory module gives you a way to maintain application state. The base Memory interface is simple: it lets you update state given the latest run inputs and outputs and it lets you modify (or contextualize) the next input using the stored state.
 There are a number of built-in memory systems. The simplest of these are is a buffer memory which just prepends the last few inputs/outputs to the current input - we will use this in the example below.
 import MemoryLLM from "@snippets/get_started/quickstart/memory_llms.mdx"
 import MemoryChatModel from "@snippets/get_started/quickstart/memory_chat_models.mdx"
 <Tabs>
 <TabItem value="llms" label="LLMs" default>
 <MemoryLLM/>
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 You can use Memory with chains and agents initialized with chat models. The main difference between this and Memory for LLMs is that rather than trying to condense all previous messages into a string, we can keep them as their own unique memory object.
 <MemoryChatModel/>
 </TabItem>
 </Tabs>

13

docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx Normal file

View File

@@ -0,0 +1,13 @@
 # Conversational
 This walkthrough demonstrates how to use an agent optimized for conversation. Other agents are often optimized for using tools to figure out the best response, which is not ideal in a conversational setting where you may want the agent to be able to chat with the user as well.
 import Example from "@snippets/modules/agents/agent_types/conversational_agent.mdx"
 <Example/>
 import ChatExample from "@snippets/modules/agents/agent_types/chat_conversation_agent.mdx"
 ## Using a chat model
 <ChatExample/>

57

docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx Normal file

View File

@@ -0,0 +1,57 @@
 ---
 sidebar_position: 0
 ---
 # Agent types
 ## Action agents
 Agents use an LLM to determine which actions to take and in what order.
 An action can either be using a tool and observing its output, or returning a response to the user.
 Here are the agents available in LangChain.
 ### [Zero-shot ReAct](/docs/modules/agents/agent_types/react.html)
 This agent uses the [ReAct](https://arxiv.org/pdf/2205.00445.pdf) framework to determine which tool to use
 based solely on the tool's description. Any number of tools can be provided.
 This agent requires that a description is provided for each tool.
 **Note**: This is the most general purpose action agent.
 ### [Structured input ReAct](/docs/modules/agents/agent_types/structured_chat.html)
 The structured tool chat agent is capable of using multi-input tools.
 Older agents are configured to specify an action input as a single string, but this agent can use a tools' argument
 schema to create a structured action input. This is useful for more complex tool usage, like precisely
 navigating around a browser.
 ### [OpenAI Functions](/docs/modules/agents/agent_types/openai_functions_agent.html)
 Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been explicitly fine-tuned to detect when a
 function should to be called and respond with the inputs that should be passed to the function.
 The OpenAI Functions Agent is designed to work with these models.
 ### [Conversational](/docs/modules/agents/agent_types/chat_conversation_agent.html)
 This agent is designed to be used in conversational settings.
 The prompt is designed to make the agent helpful and conversational.
 It uses the ReAct framework to decide which tool to use, and uses memory to remember the previous conversation interactions.
 ### [Self ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)
 This agent utilizes a single tool that should be named `Intermediate Answer`.
 This tool should be able to lookup factual answers to questions. This agent
 is equivalent to the original [self ask with search paper](https://ofir.io/self-ask.pdf),
 where a Google search API was provided as the tool.
 ### [ReAct document store](/docs/modules/agents/agent_types/react_docstore.html)
 This agent uses the ReAct framework to interact with a docstore. Two tools must
 be provided: a `Search` tool and a `Lookup` tool (they must be named exactly as so).
 The `Search` tool should search for a document, while the `Lookup` tool should lookup
 a term in the most recently found document.
 This agent is equivalent to the
 original [ReAct paper](https://arxiv.org/pdf/2210.03629.pdf), specifically the Wikipedia example.
 ## [Plan-and-execute agents](/docs/modules/agents/agent_types/plan_and_execute.html)
 Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).

11

docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx Normal file

View File

@@ -0,0 +1,11 @@
 # OpenAI functions
 Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should to be called and respond with the inputs that should be passed to the function.
 In an API call, you can describe functions and have the model intelligently choose to output a JSON object containing arguments to call those functions.
 The goal of the OpenAI Function APIs is to more reliably return valid and useful function calls than a generic text completion or chat API.
 The OpenAI Functions Agent is designed to work with these models.
 import Example from "@snippets/modules/agents/agent_types/openai_functions_agent.mdx";
 <Example/>

11

docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx Normal file

View File

@@ -0,0 +1,11 @@
 # Plan and execute
 Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
 The planning is almost always done by an LLM.
 The execution is usually done by a separate agent (equipped with tools).
 import Example from "@snippets/modules/agents/agent_types/plan_and_execute.mdx"
 <Example/>

15

docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx Normal file

View File

@@ -0,0 +1,15 @@
 # ReAct
 This walkthrough showcases using an agent to implement the [ReAct](https://react-lm.github.io/) logic.
 import Example from "@snippets/modules/agents/agent_types/react.mdx"
 <Example/>
 ## Using chat models
 You can also create ReAct agents that use chat models instead of LLMs as the agent driver.
 import ChatExample from "@snippets/modules/agents/agent_types/react_chat.mdx"
 <ChatExample/>

10

docs/docs_skeleton/docs/modules/agents/agent_types/structured_chat.mdx Normal file

View File

@@ -0,0 +1,10 @@
 # Structured tool chat
 The structured tool chat agent is capable of using multi-input tools.
 Older agents are configured to specify an action input as a single string, but this agent can use the provided tools' `args_schema` to populate the action input.
 import Example from "@snippets/modules/agents/agent_types/structured_chat.mdx"
 <Example/>

									
										2

docs/docs_skeleton/docs/modules/agents/how_to/_category_.yml
									
										Normal file
									
												View File
												
				@@ -0,0 +1,2 @@

				label: 'How-to'

				position: 1

14

docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_agent.mdx Normal file

View File

@@ -0,0 +1,14 @@
 # Custom LLM Agent
 This notebook goes through how to create your own custom LLM agent.
 An LLM agent consists of three parts:
 - PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
 - LLM: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
 - OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
 import Example from "@snippets/modules/agents/how_to/custom_llm_agent.mdx"
 <Example/>

14

docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_chat_agent.mdx Normal file

View File

@@ -0,0 +1,14 @@
 # Custom LLM Agent (with a ChatModel)
 This notebook goes through how to create your own custom agent based on a chat model.
 An LLM chat agent consists of three parts:
 - PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
 - ChatModel: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
 - OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
 import Example from "@snippets/modules/agents/how_to/custom_llm_chat_agent.mdx"
 <Example/>

16

docs/docs_skeleton/docs/modules/agents/how_to/mrkl.mdx Normal file

View File

@@ -0,0 +1,16 @@
 # Replicating MRKL
 This walkthrough demonstrates how to replicate the [MRKL](https://arxiv.org/pdf/2205.00445.pdf) system using agents.
 This uses the example Chinook database.
 To set it up follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository.
 import Example from "@snippets/modules/agents/how_to/mrkl.mdx"
 <Example/>
 ## With a chat model
 import ChatExample from "@snippets/modules/agents/how_to/mrkl_chat.mdx"
 <ChatExample/>

51

docs/docs_skeleton/docs/modules/agents/index.mdx Normal file

View File

@@ -0,0 +1,51 @@
 ---
 sidebar_position: 4
 ---
 # Agents
 Some applications require a flexible chain of calls to LLMs and other tools based on user input. The **Agent** interface provides the flexibility for such applications. An agent has access to a suite of tools, and determines which ones to use depending on the user input. Agents can use multiple tools, and use the output of one tool as the input to the next.
 There are two main types of agents:
 - **Action agents**: at each timestep, decide on the next action using the outputs of all previous actions
 - **Plan-and-execute agents**: decide on the full sequence of actions up front, then execute them all without updating the plan
 Action agents are suitable for small tasks, while plan-and-execute agents are better for complex or long-running tasks that require maintaining long-term objectives and focus. Often the best approach is to combine the dynamism of an action agent with the planning abilities of a plan-and-execute agent by letting the plan-and-execute agent use action agents to execute plans.
 For a full list of agent types see [agent types](/docs/modules/agents/agent_types/). Additional abstractions involved in agents are:
 - [**Tools**](/docs/modules/agents/tools/): the actions an agent can take. What tools you give an agent highly depend on what you want the agent to do
 - [**Toolkits**](/docs/modules/agents/toolkits/): wrappers around collections of tools that can be used together a specific use case. For example, in order for an agent to
   interact with a SQL database it will likely need one tool to execute queries and another to inspect tables
 ## Action agents
 At a high-level an action agent:
 . Receives user input
 . Decides which tool, if any, to use and the tool input
 . Calls the tool and records the output (also known as an "observation")
 . Decides the next step using the history of tools, tool inputs, and observations
 . Repeats 3-4 until it determines it can respond directly to the user
 Action agents are wrapped in **agent executors**, which are responsible for calling the agent, getting back an action and action input, calling the tool that the action references with the generated input, getting the output of the tool, and then passing all that information back into the agent to get the next action it should take.
 Although an agent can be constructed in many ways, it typically involves these components:
 - **Prompt template**: Responsible for taking the user input and previous steps and constructing a prompt
   to send to the language model
 - **Language model**: Takes the prompt with use input and action history and decides what to do next
 - **Output parser**: Takes the output of the language model and parses it into the next action or a final answer
 ## Plan-and-execute agents
 At a high-level a plan-and-execute agent:
 . Receives user input
 . Plans the full sequence of steps to take
 . Executes the steps in order, passing the outputs of past steps as inputs to future steps
 The most typical implementation is to have the planner be a language model, and the executor be an action agent. Read more [here](/docs/modules/agents/agent_types/plan_and_execute.html).
 ## Get started
 import GetStarted from "@snippets/modules/agents/get_started.mdx"
 <GetStarted/>

10

docs/docs_skeleton/docs/modules/agents/toolkits/index.mdx Normal file

View File

@@ -0,0 +1,10 @@
 ---
 sidebar_position: 3
 ---
 # Toolkits
 Toolkits are collections of tools that are designed to be used together for specific tasks and have convenience loading methods.
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

									
										2

docs/docs_skeleton/docs/modules/agents/tools/how_to/_category_.yml
									
										Normal file
									
												View File
												
				@@ -0,0 +1,2 @@

				label: 'How-to'

				position: 0

17

docs/docs_skeleton/docs/modules/agents/tools/index.mdx Normal file

View File

@@ -0,0 +1,17 @@
 ---
 sidebar_position: 2
 ---
 # Tools
 Tools are interfaces that an agent can use to interact with the world.
 ## Get started
 Tools are functions that agents can use to interact with the world.
 These tools can be generic utilities (e.g. search), other chains, or even other agents.
 Currently, tools can be loaded with the following snippet:
 import GetStarted from "@snippets/modules/agents/tools/get_started.mdx"
 <GetStarted/>

									
										1

docs/docs_skeleton/docs/modules/agents/tools/integrations/_category_.yml
									
										Normal file
									
												View File
												
				@@ -0,0 +1 @@

				label: 'Integrations'

									
										2

docs/docs_skeleton/docs/modules/callbacks/how_to/_category_.yml
									
										Normal file
									
												View File
												
				@@ -0,0 +1,2 @@

				label: 'How-to'

				position: 0

10

docs/docs_skeleton/docs/modules/callbacks/index.mdx Normal file

View File

@@ -0,0 +1,10 @@
 ---
 sidebar_position: 5
 ---
 # Callbacks
 LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
 import GetStarted from "@snippets/modules/callbacks/get_started.mdx"
 <GetStarted/>

									
										1

docs/docs_skeleton/docs/modules/callbacks/integrations/_category_.yml
									
										Normal file
									
												View File
												
				@@ -0,0 +1 @@

				label: 'Integrations'

7

docs/docs_skeleton/docs/modules/chains/additional/analyze_document.mdx Normal file

View File

@@ -0,0 +1,7 @@
 # Analyze Document
 The AnalyzeDocumentChain can be used as an end-to-end to chain. This chain takes in a single document, splits it up, and then runs it through a CombineDocumentsChain.
 import Example from "@snippets/modules/chains/additional/analyze_document.mdx"
 <Example/>

7

docs/docs_skeleton/docs/modules/chains/additional/constitutional_chain.mdx Normal file

View File

@@ -0,0 +1,7 @@
 # Self-critique chain with constitutional AI
 The ConstitutionalChain is a chain that ensures the output of a language model adheres to a predefined set of constitutional principles. By incorporating specific rules and guidelines, the ConstitutionalChain filters and modifies the generated content to align with these principles, thus providing more controlled, ethical, and contextually appropriate responses. This mechanism helps maintain the integrity of the output while minimizing the risk of generating content that may violate guidelines, be offensive, or deviate from the desired context.
 import Example from "@snippets/modules/chains/additional/constitutional_chain.mdx"
 <Example/>

8

docs/docs_skeleton/docs/modules/chains/additional/index.mdx Normal file

View File

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 4
 ---
 # Additional
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

8

docs/docs_skeleton/docs/modules/chains/additional/moderation.mdx Normal file

View File

@@ -0,0 +1,8 @@
 # Moderation
 This notebook walks through examples of how to use a moderation chain, and several common ways for doing so. Moderation chains are useful for detecting text that could be hateful, violent, etc. This can be useful to apply on both user input, but also on the output of a Language Model. Some API providers, like OpenAI, [specifically prohibit](https://beta.openai.com/docs/usage-policies/use-case-policy) you, or your end users, from generating some types of harmful content. To comply with this (and to just generally prevent your application from being harmful) you may often want to append a moderation chain to any LLMChains, in order to make sure any output the LLM generates is not harmful.
 If the content passed into the moderation chain is harmful, there is not one best way to handle it, it probably depends on your application. Sometimes you may want to throw an error in the Chain (and have your application handle that). Other times, you may want to return something to the user explaining that the text was harmful. There could even be other ways to handle it! We will cover all these ways in this walkthrough.
 import Example from "@snippets/modules/chains/additional/moderation.mdx"
 <Example/>

7

docs/docs_skeleton/docs/modules/chains/additional/multi_prompt_router.mdx Normal file

View File

@@ -0,0 +1,7 @@
 # Dynamically selecting from multiple prompts
 This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects the prompt to use for a given input. Specifically we show how to use the `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt.
 import Example from "@snippets/modules/chains/additional/multi_prompt_router.mdx"
 <Example/>

7

docs/docs_skeleton/docs/modules/chains/additional/multi_retrieval_qa_router.mdx Normal file

View File

@@ -0,0 +1,7 @@
 # Dynamically selecting from multiple retrievers
 This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects which Retrieval system to use. Specifically we show how to use the `MultiRetrievalQAChain` to create a question-answering chain that selects the retrieval QA chain which is most relevant for a given question, and then answers the question using it.
 import Example from "@snippets/modules/chains/additional/multi_retrieval_qa_router.mdx"
 <Example/>

13

docs/docs_skeleton/docs/modules/chains/additional/question_answering.mdx Normal file

View File

@@ -0,0 +1,13 @@
 # Document QA
 Here we walk through how to use LangChain for question answering over a list of documents. Under the hood we'll be using our [Document chains](../document.html).
 import Example from "@snippets/modules/chains/additional/question_answering.mdx"
 <Example/>
 ## Document QA with sources
 import ExampleWithSources from "@snippets/modules/chains/additional/qa_with_sources.mdx"
 <ExampleWithSources/>

16

docs/docs_skeleton/docs/modules/chains/document/index.mdx Normal file

View File

@@ -0,0 +1,16 @@
 ---
 sidebar_position: 2
 ---
 # Documents
 These are the core chains for working with Documents. They are useful for summarizing documents, answering questions over documents, extracting information from documents, and more.
 These chains all implement a common interface:
 import Interface from "@snippets/modules/chains/document/combine_docs.mdx"
 <Interface/>
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

5

docs/docs_skeleton/docs/modules/chains/document/map_reduce.mdx Normal file

View File

@@ -0,0 +1,5 @@
 # Map reduce
 The map reduce documents chain first applies an LLM chain to each document individually (the Map step), treating the chain output as a new document. It then passes all the new documents to a separate combine documents chain to get a single output (the Reduce step). It can optionally first compress, or collapse, the mapped documents to make sure that they fit in the combine documents chain (which will often pass them to an LLM). This compression step is performed recursively if necessary.
 ![map_reduce_diagram](/img/map_reduce.jpg)

5

docs/docs_skeleton/docs/modules/chains/document/map_rerank.mdx Normal file

View File

@@ -0,0 +1,5 @@
 # Map re-rank
 The map re-rank documents chain runs an initial prompt on each document, that not only tries to complete a task but also gives a score for how certain it is in its answer. The highest scoring response is returned.
 ![map_rerank_diagram](/img/map_rerank.jpg)

12

docs/docs_skeleton/docs/modules/chains/document/refine.mdx Normal file

View File

@@ -0,0 +1,12 @@
 ---
 sidebar_position: 1
 ---
 # Refine
 The refine documents chain constructs a response by looping over the input documents and iteratively updating its answer. For each document, it passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to get a new answer.
 Since the Refine chain only passes a single document to the LLM at a time, it is well-suited for tasks that require analyzing more documents than can fit in the model's context.
 The obvious tradeoff is that this chain will make far more LLM calls than, for example, the Stuff documents chain.
 There are also certain tasks which are difficult to accomplish iteratively. For example, the Refine chain can perform poorly when documents frequently cross-reference one another or when a task requires detailed information from many documents.
 ![refine_diagram](/img/refine.jpg)

Compare commits

757 Commits v0.0.166 ... v0.0.207

37 .devcontainer/README.md Normal file Unescape Escape View File

45 .devcontainer/devcontainer.json Unescape Escape View File

7 .devcontainer/docker-compose.yaml Unescape Escape View File

3 .gitattributes vendored Normal file Unescape Escape View File

43 .github/CONTRIBUTING.md vendored Unescape Escape View File

2 .github/ISSUE_TEMPLATE/bug-report.yml vendored Unescape Escape View File

56 .github/PULL_REQUEST_TEMPLATE.md vendored Unescape Escape View File

12 .github/actions/poetry_setup/action.yml vendored Unescape Escape View File

36 .github/workflows/linkcheck.yml vendored Unescape Escape View File

7 .github/workflows/test.yml vendored Unescape Escape View File

17 .gitignore vendored Unescape Escape View File

4 .gitmodules vendored Normal file Unescape Escape View File

4 .readthedocs.yaml Unescape Escape View File

16 Makefile Unescape Escape View File

14 README.md Unescape Escape View File

11 .devcontainer/Dockerfile → dev.Dockerfile Unescape Escape View File

12 docs/.local_build.sh Executable file Unescape Escape View File

0 docs/Makefile → docs/api_reference/Makefile Unescape Escape View File

2 docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css Unescape Escape View File

57 docs/api_reference/_static/js/mendablesearch.js Normal file Unescape Escape View File

0 docs/reference/agents.rst → docs/api_reference/agents.rst Unescape Escape View File

25 docs/conf.py → docs/api_reference/conf.py Unescape Escape View File

9 docs/reference/indexes.rst → docs/api_reference/data_connection.rst Unescape Escape View File

29 docs/api_reference/index.rst Normal file Unescape Escape View File

0 docs/make.bat → docs/api_reference/make.bat Unescape Escape View File

12 docs/api_reference/model_io.rst Normal file Unescape Escape View File

1 docs/reference/models.rst → docs/api_reference/models.rst Unescape Escape View File

0 docs/reference/modules/agent_toolkits.rst → docs/api_reference/modules/agent_toolkits.rst Unescape Escape View File

0 docs/reference/modules/agents.rst → docs/api_reference/modules/agents.rst Unescape Escape View File

5 docs/api_reference/modules/base_classes.rst Normal file Unescape Escape View File

7 docs/api_reference/modules/callbacks.rst Normal file Unescape Escape View File

1 docs/reference/modules/chains.rst → docs/api_reference/modules/chains.rst Unescape Escape View File

0 docs/reference/modules/chat_models.rst → docs/api_reference/modules/chat_models.rst Unescape Escape View File

0 docs/reference/modules/document_loaders.rst → docs/api_reference/modules/document_loaders.rst Unescape Escape View File

6 docs/reference/modules/document_transformers.rst → docs/api_reference/modules/document_transformers.rst Unescape Escape View File

0 docs/reference/modules/embeddings.rst → docs/api_reference/modules/embeddings.rst Unescape Escape View File

0 docs/reference/modules/example_selector.rst → docs/api_reference/modules/example_selector.rst Unescape Escape View File

10 docs/reference/modules/experimental.rst → docs/api_reference/modules/experimental.rst Unescape Escape View File

0 docs/reference/modules/llms.rst → docs/api_reference/modules/llms.rst Unescape Escape View File

0 docs/reference/modules/memory.rst → docs/api_reference/modules/memory.rst Unescape Escape View File

0 docs/reference/modules/output_parsers.rst → docs/api_reference/modules/output_parsers.rst Unescape Escape View File

3 docs/reference/modules/prompts.rst → docs/api_reference/modules/prompts.rst Unescape Escape View File

14 docs/api_reference/modules/retrievers.rst Normal file Unescape Escape View File

0 docs/reference/modules/tools.rst → docs/api_reference/modules/tools.rst Unescape Escape View File

0 docs/reference/modules/utilities.rst → docs/api_reference/modules/utilities.rst Unescape Escape View File

0 docs/reference/modules/vectorstores.rst → docs/api_reference/modules/vectorstores.rst Unescape Escape View File

1 docs/reference/prompts.rst → docs/api_reference/prompts.rst Unescape Escape View File

7 docs/docs_skeleton/.gitignore vendored Normal file Unescape Escape View File

49 docs/docs_skeleton/README.md Normal file Unescape Escape View File

12 docs/docs_skeleton/babel.config.js Normal file Unescape Escape View File

76 docs/docs_skeleton/code-block-loader.js Normal file Unescape Escape View File

0 docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored Unescape Escape View File

0 docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/DataberryDashboard.png vendored Unescape Escape View File

0 docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored Unescape Escape View File

0 docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored Unescape Escape View File

0 docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored Unescape Escape View File

BIN docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file View File

21 docs/docs_skeleton/docs/_static/css/custom.css vendored Normal file Unescape Escape View File

BIN docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file View File

6 docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js vendored Unescape Escape View File

BIN docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file View File

8 docs/docs_skeleton/docs/ecosystem/integrations/index.mdx Normal file Unescape Escape View File

5 docs/docs_skeleton/docs/get_started/installation.mdx Normal file Unescape Escape View File

65 docs/docs_skeleton/docs/get_started/introduction.mdx Normal file Unescape Escape View File

158 docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file Unescape Escape View File

13 docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx Normal file Unescape Escape View File

57 docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx Normal file Unescape Escape View File

11 docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx Normal file Unescape Escape View File

11 docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx Normal file Unescape Escape View File

15 docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx Normal file Unescape Escape View File

10 docs/docs_skeleton/docs/modules/agents/agent_types/structured_chat.mdx Normal file Unescape Escape View File

2 docs/docs_skeleton/docs/modules/agents/how_to/_category_.yml Normal file Unescape Escape View File

757 Commits

v0.0.166 ... v0.0.207

37

.devcontainer/README.md Normal file

View File

45

.devcontainer/devcontainer.json

View File

7

.devcontainer/docker-compose.yaml

View File

3

.gitattributes vendored Normal file

View File

43

.github/CONTRIBUTING.md vendored

View File

2

.github/ISSUE_TEMPLATE/bug-report.yml vendored

View File

56

.github/PULL_REQUEST_TEMPLATE.md vendored

View File

12

.github/actions/poetry_setup/action.yml vendored

View File

36

.github/workflows/linkcheck.yml vendored

View File

7

.github/workflows/test.yml vendored

View File

17

.gitignore vendored

View File

4

.gitmodules vendored Normal file

View File

4

.readthedocs.yaml

View File

16

Makefile

View File

14

README.md

View File

11

.devcontainer/Dockerfile → dev.Dockerfile

View File

12

docs/.local_build.sh Executable file

View File

0

docs/Makefile → docs/api_reference/Makefile

View File

2

docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css

View File

57

docs/api_reference/_static/js/mendablesearch.js Normal file

View File

0

docs/reference/agents.rst → docs/api_reference/agents.rst

View File

25

docs/conf.py → docs/api_reference/conf.py

View File

9

docs/reference/indexes.rst → docs/api_reference/data_connection.rst

View File

29

docs/api_reference/index.rst Normal file

View File

0

docs/make.bat → docs/api_reference/make.bat

View File

12

docs/api_reference/model_io.rst Normal file

View File

1

docs/reference/models.rst → docs/api_reference/models.rst

View File

0

docs/reference/modules/agent_toolkits.rst → docs/api_reference/modules/agent_toolkits.rst

View File

0

docs/reference/modules/agents.rst → docs/api_reference/modules/agents.rst

View File

5

docs/api_reference/modules/base_classes.rst Normal file

View File

7

docs/api_reference/modules/callbacks.rst Normal file

View File

1

docs/reference/modules/chains.rst → docs/api_reference/modules/chains.rst

View File

0

docs/reference/modules/chat_models.rst → docs/api_reference/modules/chat_models.rst

View File

0

docs/reference/modules/document_loaders.rst → docs/api_reference/modules/document_loaders.rst

View File

6

docs/reference/modules/document_transformers.rst → docs/api_reference/modules/document_transformers.rst

View File

0

docs/reference/modules/embeddings.rst → docs/api_reference/modules/embeddings.rst

View File

0

docs/reference/modules/example_selector.rst → docs/api_reference/modules/example_selector.rst

View File

10

docs/reference/modules/experimental.rst → docs/api_reference/modules/experimental.rst

View File

0

docs/reference/modules/llms.rst → docs/api_reference/modules/llms.rst

View File

0

docs/reference/modules/memory.rst → docs/api_reference/modules/memory.rst

View File

0

docs/reference/modules/output_parsers.rst → docs/api_reference/modules/output_parsers.rst

View File

3

docs/reference/modules/prompts.rst → docs/api_reference/modules/prompts.rst

View File

14

docs/api_reference/modules/retrievers.rst Normal file

View File

0

docs/reference/modules/tools.rst → docs/api_reference/modules/tools.rst

View File

0

docs/reference/modules/utilities.rst → docs/api_reference/modules/utilities.rst

View File

0

docs/reference/modules/vectorstores.rst → docs/api_reference/modules/vectorstores.rst

View File

1

docs/reference/prompts.rst → docs/api_reference/prompts.rst

View File

7

docs/docs_skeleton/.gitignore vendored Normal file

View File

49

docs/docs_skeleton/README.md Normal file

View File

12

docs/docs_skeleton/babel.config.js Normal file

View File

76

docs/docs_skeleton/code-block-loader.js Normal file

View File

0

docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored

View File

0

docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/DataberryDashboard.png vendored

View File

0

docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored

View File

0

docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored

View File

0

docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored

View File

BIN
docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file

View File

21

docs/docs_skeleton/docs/_static/css/custom.css vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file

View File

6

docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js vendored

View File

BIN
docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file

View File

8

docs/docs_skeleton/docs/ecosystem/integrations/index.mdx Normal file

View File

5

docs/docs_skeleton/docs/get_started/installation.mdx Normal file

View File

65

docs/docs_skeleton/docs/get_started/introduction.mdx Normal file

View File

158

docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file

View File

13

docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx Normal file

View File

57

docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx Normal file

View File

11

docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx Normal file

View File

11

docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx Normal file

View File

15

docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx Normal file

View File

10

docs/docs_skeleton/docs/modules/agents/agent_types/structured_chat.mdx Normal file

View File

2

docs/docs_skeleton/docs/modules/agents/how_to/_category_.yml Normal file

View File

14

docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_agent.mdx Normal file

View File