Mirror of https://github.com/hwchase17/langchain.git (synced 2026-02-11 19:49:54 +00:00)

Comparing vwp/simila...vwp/make_n (25 commits):

cddfe05073, e5611565b7, 9d1bd18596, a435a436c1, d6cd0deaef, 1db266b20d, 3f9900a864,
3ca1a387c2, f92ccf70fd, f3d178f600, dd2a151543, d6664af0ee, efe0d39c6a, b4c196f785,
f1070de038, ef72a7cf26, a980095efc, 74848aafea, b24472eae3, e53995836a, e494b0a09f,
da462d9dd4, 24e4ae95ba, 8392ca602c, fcb3a64799
@@ -23,11 +23,15 @@ its dependencies running locally.

If you want to get up and running with less set up, you can
simply run `pip install unstructured` and use `UnstructuredAPIFileLoader` or
`UnstructuredAPIFileIOLoader`. That will process your document using the hosted Unstructured API.
Note that currently (as of 1 May 2023) the Unstructured API is open, but it will soon require
an API key. The [Unstructured documentation page](https://unstructured-io.github.io/) will have
instructions on how to generate an API key once they're available. Check out the instructions
[here](https://github.com/Unstructured-IO/unstructured-api#dizzy-instructions-for-using-the-docker-image)
if you'd like to self-host the Unstructured API or run it locally.

The Unstructured API requires API keys to make requests.
You can generate a free API key [here](https://www.unstructured.io/api-key) and start using it today!
Check out the README [here](https://github.com/Unstructured-IO/unstructured-api) to get started making API calls.
We'd love to hear your feedback, let us know how it goes in our [community slack](https://join.slack.com/t/unstructuredw-kbe4326/shared_invite/zt-1x7cgo0pg-PTptXWylzPQF9xZolzCnwQ).
And stay tuned for improvements to both quality and performance!
Check out the instructions
[here](https://github.com/Unstructured-IO/unstructured-api#dizzy-instructions-for-using-the-docker-image) if you'd like to self-host the Unstructured API or run it locally.

## Wrappers
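For reference, a minimal sketch of calling the hosted API through the loader (the file name and the `api_key` keyword are assumptions; confirm against the loader's signature in your installed version):

```python
import os

from langchain.document_loaders import UnstructuredAPIFileLoader

# "example.pdf" is a placeholder; UNSTRUCTURED_API_KEY holds a key from unstructured.io/api-key.
loader = UnstructuredAPIFileLoader(
    "example.pdf",
    api_key=os.environ.get("UNSTRUCTURED_API_KEY", ""),
    mode="elements",  # return one Document per detected element
)
docs = loader.load()
print(docs[0].metadata)
```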
@@ -7,7 +7,7 @@
"source": [
"# Zapier Natural Language Actions API\n",
"\\\n",
"Full docs here: https://nla.zapier.com/api/v1/docs\n",
"Full docs here: https://nla.zapier.com/start/\n",
"\n",
"**Zapier Natural Language Actions** gives you access to the 5k+ apps, 20k+ actions on Zapier's platform through a natural language API interface.\n",
"\n",
@@ -21,7 +21,7 @@
"\n",
"2. User-facing (Oauth): for production scenarios where you are deploying an end-user facing application and LangChain needs access to end-user's exposed actions and connected accounts on Zapier.com\n",
"\n",
"This quick start will focus on the server-side use case for brevity. Review [full docs](https://nla.zapier.com/api/v1/docs) or reach out to nla@zapier.com for user-facing oauth developer support.\n",
"This quick start will focus on the server-side use case for brevity. Review [full docs](https://nla.zapier.com/start/) for user-facing oauth developer support.\n",
"\n",
"This example goes over how to use the Zapier integration with a `SimpleSequentialChain`, then an `Agent`.\n",
"In code, below:"
@@ -39,7 +39,7 @@
"# get from https://platform.openai.com/\n",
"os.environ[\"OPENAI_API_KEY\"] = os.environ.get(\"OPENAI_API_KEY\", \"\")\n",
"\n",
"# get from https://nla.zapier.com/demo/provider/debug (under User Information, after logging in):\n",
"# get from https://nla.zapier.com/docs/authentication/ after logging in):\n",
"os.environ[\"ZAPIER_NLA_API_KEY\"] = os.environ.get(\"ZAPIER_NLA_API_KEY\", \"\")"
]
},
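For orientation, a minimal sketch of the server-side setup this notebook builds up to (the instruction string is made up, and the actions available depend on what you have enabled in Zapier NLA):

```python
from langchain.agents import AgentType, initialize_agent
from langchain.agents.agent_toolkits import ZapierToolkit
from langchain.llms import OpenAI
from langchain.utilities.zapier import ZapierNLAWrapper

llm = OpenAI(temperature=0)
zapier = ZapierNLAWrapper()  # reads ZAPIER_NLA_API_KEY from the environment
toolkit = ZapierToolkit.from_zapier_nla_wrapper(zapier)
agent = initialize_agent(
    toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
# Example instruction (made up); it only works if matching actions are exposed in Zapier NLA.
agent.run("Summarize the last email I received and send the summary to Slack.")
```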
docs/extras/modules/callbacks/integrations/streamlit.md (new file, 73 lines)
@@ -0,0 +1,73 @@

# Streamlit

> **[Streamlit](https://streamlit.io/) is a faster way to build and share data apps.**
> Streamlit turns data scripts into shareable web apps in minutes. All in pure Python. No front‑end experience required.
> See more examples at [streamlit.io/generative-ai](https://streamlit.io/generative-ai).

[](https://codespaces.new/langchain-ai/streamlit-agent?quickstart=1)

In this guide we will demonstrate how to use `StreamlitCallbackHandler` to display the thoughts and actions of an agent in an
interactive Streamlit app. Try it out with the running app below using the [MRKL agent](/docs/modules/agents/how_to/mrkl/):

<iframe loading="lazy" src="https://mrkl-minimal.streamlit.app/?embed=true&embed_options=light_theme"
    style={{ width: 100 + '%', border: 'none', marginBottom: 1 + 'rem', height: 600 }}
    allow="camera;clipboard-read;clipboard-write;"
></iframe>

## Installation and Setup

```bash
pip install langchain streamlit
```

You can run `streamlit hello` to load a sample app and validate your install succeeded. See full instructions in Streamlit's
[Getting started documentation](https://docs.streamlit.io/library/get-started).

## Display thoughts and actions

To create a `StreamlitCallbackHandler`, you just need to provide a parent container to render the output.

```python
from langchain.callbacks import StreamlitCallbackHandler
import streamlit as st

st_callback = StreamlitCallbackHandler(st.container())
```

Additional keyword arguments to customize the display behavior are described in the
[API reference](https://api.python.langchain.com/en/latest/modules/callbacks.html#langchain.callbacks.StreamlitCallbackHandler).
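As a hedged illustration of those keyword arguments (the parameter names below are assumptions based on the handler at the time of writing; confirm them against the API reference for your version):

```python
import streamlit as st
from langchain.callbacks import StreamlitCallbackHandler

# Assumed keyword arguments; verify the exact names in the API reference.
st_callback = StreamlitCallbackHandler(
    st.container(),
    max_thought_containers=4,          # how many thought containers to show at once
    expand_new_thoughts=True,          # expand each new thought while it is streaming
    collapse_completed_thoughts=True,  # collapse a thought once it finishes
)
```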
### Scenario 1: Using an Agent with Tools

The primary supported use case today is visualizing the actions of an Agent with Tools (or Agent Executor). You can create an
agent in your Streamlit app and simply pass the `StreamlitCallbackHandler` to `agent.run()` in order to visualize the
thoughts and actions live in your app.

```python
from langchain.llms import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.callbacks import StreamlitCallbackHandler
import streamlit as st

llm = OpenAI(temperature=0, streaming=True)
tools = load_tools(["ddg-search"])
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

if prompt := st.chat_input():
    st.chat_message("user").write(prompt)
    with st.chat_message("assistant"):
        st_callback = StreamlitCallbackHandler(st.container())
        response = agent.run(prompt, callbacks=[st_callback])
        st.write(response)
```

**Note:** You will need to set `OPENAI_API_KEY` for the above app code to run successfully.
The easiest way to do this is via [Streamlit secrets.toml](https://docs.streamlit.io/library/advanced-features/secrets-management),
or any other local ENV management tool.

### Additional scenarios

Currently `StreamlitCallbackHandler` is geared towards use with a LangChain Agent Executor. Support for additional agent types,
use directly with Chains, etc. will be added in the future.
@@ -0,0 +1,27 @@

* Example Docs

The sample docs directory contains the following files:

- ~example-10k.html~ - A 10-K SEC filing in HTML format
- ~layout-parser-paper.pdf~ - A PDF copy of the layout parser paper
- ~factbook.xml~ / ~factbook.xsl~ - Example XML/XSL files that you
  can use to test stylesheets

These documents can be used to test out the parsers in the library. In
addition, here are instructions for pulling in some sample docs that are
too big to store in the repo.

** XBRL 10-K

You can get an example 10-K in inline XBRL format using the following
~curl~. Note, you need to have the user agent set in the header or the
SEC site will reject your request.

#+BEGIN_SRC bash

  curl -O \
    -A '${organization} ${email}' \
    https://www.sec.gov/Archives/edgar/data/311094/000117184321001344/0001171843-21-001344.txt
#+END_SRC

You can parse this document using the HTML parser.
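As a sketch of that last step (the function comes from the ~unstructured~ library; the local filename is an assumption based on what ~curl -O~ saves for the URL above):

```python
# Parse the downloaded SEC submission with unstructured's HTML partitioner.
from unstructured.partition.html import partition_html

elements = partition_html(filename="0001171843-21-001344.txt")
print(elements[:5])
```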
@@ -0,0 +1,17 @@
class MyClass {
  constructor(name) {
    this.name = name;
  }

  greet() {
    console.log(`Hello, ${this.name}!`);
  }
}

function main() {
  const name = prompt("Enter your name:");
  const obj = new MyClass(name);
  obj.greet();
}

main();
@@ -0,0 +1,16 @@
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")


def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()


if __name__ == "__main__":
    main()
@@ -0,0 +1,103 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"id": "33205b12",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# LarkSuite (FeiShu)\n",
|
||||
"\n",
|
||||
">[LarkSuite](https://www.larksuite.com/) is an enterprise collaboration platform developed by ByteDance.\n",
|
||||
"\n",
|
||||
"This notebook covers how to load data from the `LarkSuite` REST API into a format that can be ingested into LangChain, along with example usage for text summarization.\n",
|
||||
"\n",
|
||||
"The LarkSuite API requires an access token (tenant_access_token or user_access_token), checkout [LarkSuite open platform document](https://open.larksuite.com/document) for API details."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "90b69c94",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2023-06-19T10:05:03.645161Z",
|
||||
"start_time": "2023-06-19T10:04:49.541968Z"
|
||||
},
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from getpass import getpass\n",
|
||||
"from langchain.document_loaders.larksuite import LarkSuiteDocLoader\n",
|
||||
"\n",
|
||||
"DOMAIN = input(\"larksuite domain\")\n",
|
||||
"ACCESS_TOKEN = getpass(\"larksuite tenant_access_token or user_access_token\")\n",
|
||||
"DOCUMENT_ID = input(\"larksuite document id\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "13deb0f5",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2023-06-19T10:05:36.016495Z",
|
||||
"start_time": "2023-06-19T10:05:35.360884Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[Document(page_content='Test Doc\\nThis is a Test Doc\\n\\n1\\n2\\n3\\n\\n', metadata={'document_id': 'V76kdbd2HoBbYJxdiNNccajunPf', 'revision_id': 11, 'title': 'Test Doc'})]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from pprint import pprint\n",
|
||||
"\n",
|
||||
"larksuite_loader = LarkSuiteDocLoader(DOMAIN, ACCESS_TOKEN, DOCUMENT_ID)\n",
|
||||
"docs = larksuite_loader.load()\n",
|
||||
"\n",
|
||||
"pprint(docs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "9ccc1e2f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# see https://python.langchain.com/docs/use_cases/summarization for more details\n",
|
||||
"from langchain.chains.summarize import load_summarize_chain\n",
|
||||
"\n",
|
||||
"chain = load_summarize_chain(llm, chain_type=\"map_reduce\")\n",
|
||||
"chain.run(docs)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -0,0 +1,88 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Org-mode\n",
|
||||
"\n",
|
||||
">A [Org Mode document](https://en.wikipedia.org/wiki/Org-mode) is a document editing, formatting, and organizing mode, designed for notes, planning, and authoring within the free software text editor Emacs."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## `UnstructuredOrgModeLoader`\n",
|
||||
"\n",
|
||||
"You can load data from Org-mode files with `UnstructuredOrgModeLoader` using the following workflow."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.document_loaders import UnstructuredOrgModeLoader"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = UnstructuredOrgModeLoader(\n",
|
||||
" file_path=\"example_data/README.org\", mode=\"elements\"\n",
|
||||
")\n",
|
||||
"docs = loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"page_content='Example Docs' metadata={'source': 'example_data/README.org', 'filename': 'README.org', 'file_directory': 'example_data', 'filetype': 'text/org', 'page_number': 1, 'category': 'Title'}\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(docs[0])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.8.13"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 4
|
||||
}
|
||||
@@ -0,0 +1,419 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "213a38a2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Source Code\n",
|
||||
"\n",
|
||||
"This notebook covers how to load source code files using a special approach with language parsing: each top-level function and class in the code is loaded into separate documents. Any remaining code top-level code outside the already loaded functions and classes will be loaded into a seperate document.\n",
|
||||
"\n",
|
||||
"This approach can potentially improve the accuracy of QA models over source code. Currently, the supported languages for code parsing are Python and JavaScript. The language used for parsing can be configured, along with the minimum number of lines required to activate the splitting based on syntax."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "7fa47b2e",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! pip install esprima"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "beb55c2f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import warnings\n",
|
||||
"warnings.filterwarnings('ignore')\n",
|
||||
"from pprint import pprint\n",
|
||||
"from langchain.text_splitter import Language\n",
|
||||
"from langchain.document_loaders.generic import GenericLoader\n",
|
||||
"from langchain.document_loaders.parsers import LanguageParser"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "64056e07",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = GenericLoader.from_filesystem(\n",
|
||||
" \"./example_data/source_code\",\n",
|
||||
" glob=\"*\",\n",
|
||||
" suffixes=[\".py\", \".js\"],\n",
|
||||
" parser=LanguageParser()\n",
|
||||
")\n",
|
||||
"docs = loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "8af79bd7",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"6"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"len(docs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "85edf3fc",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"{'content_type': 'functions_classes',\n",
|
||||
" 'language': <Language.PYTHON: 'python'>,\n",
|
||||
" 'source': 'example_data/source_code/example.py'}\n",
|
||||
"{'content_type': 'functions_classes',\n",
|
||||
" 'language': <Language.PYTHON: 'python'>,\n",
|
||||
" 'source': 'example_data/source_code/example.py'}\n",
|
||||
"{'content_type': 'simplified_code',\n",
|
||||
" 'language': <Language.PYTHON: 'python'>,\n",
|
||||
" 'source': 'example_data/source_code/example.py'}\n",
|
||||
"{'content_type': 'functions_classes',\n",
|
||||
" 'language': <Language.JS: 'js'>,\n",
|
||||
" 'source': 'example_data/source_code/example.js'}\n",
|
||||
"{'content_type': 'functions_classes',\n",
|
||||
" 'language': <Language.JS: 'js'>,\n",
|
||||
" 'source': 'example_data/source_code/example.js'}\n",
|
||||
"{'content_type': 'simplified_code',\n",
|
||||
" 'language': <Language.JS: 'js'>,\n",
|
||||
" 'source': 'example_data/source_code/example.js'}\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for document in docs:\n",
|
||||
" pprint(document.metadata)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "f44e3e37",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"class MyClass:\n",
|
||||
" def __init__(self, name):\n",
|
||||
" self.name = name\n",
|
||||
"\n",
|
||||
" def greet(self):\n",
|
||||
" print(f\"Hello, {self.name}!\")\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"def main():\n",
|
||||
" name = input(\"Enter your name: \")\n",
|
||||
" obj = MyClass(name)\n",
|
||||
" obj.greet()\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"# Code for: class MyClass:\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Code for: def main():\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"if __name__ == \"__main__\":\n",
|
||||
" main()\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"class MyClass {\n",
|
||||
" constructor(name) {\n",
|
||||
" this.name = name;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" greet() {\n",
|
||||
" console.log(`Hello, ${this.name}!`);\n",
|
||||
" }\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"function main() {\n",
|
||||
" const name = prompt(\"Enter your name:\");\n",
|
||||
" const obj = new MyClass(name);\n",
|
||||
" obj.greet();\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"// Code for: class MyClass {\n",
|
||||
"\n",
|
||||
"// Code for: function main() {\n",
|
||||
"\n",
|
||||
"main();\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(\"\\n\\n--8<--\\n\\n\".join([document.page_content for document in docs]))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "69aad0ed",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The parser can be disabled for small files. \n",
|
||||
"\n",
|
||||
"The parameter `parser_threshold` indicates the minimum number of lines that the source code file must have to be segmented using the parser."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "ae024794",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = GenericLoader.from_filesystem(\n",
|
||||
" \"./example_data/source_code\",\n",
|
||||
" glob=\"*\",\n",
|
||||
" suffixes=[\".py\"],\n",
|
||||
" parser=LanguageParser(language=Language.PYTHON, parser_threshold=1000)\n",
|
||||
")\n",
|
||||
"docs = loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "5d3b372a",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"1"
|
||||
]
|
||||
},
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"len(docs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "89e546ad",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"class MyClass:\n",
|
||||
" def __init__(self, name):\n",
|
||||
" self.name = name\n",
|
||||
"\n",
|
||||
" def greet(self):\n",
|
||||
" print(f\"Hello, {self.name}!\")\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def main():\n",
|
||||
" name = input(\"Enter your name: \")\n",
|
||||
" obj = MyClass(name)\n",
|
||||
" obj.greet()\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"if __name__ == \"__main__\":\n",
|
||||
" main()\n",
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(docs[0].page_content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c9c71e61",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Splitting\n",
|
||||
"\n",
|
||||
"Additional splitting could be needed for those functions, classes, or scripts that are too big."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "adbaa79f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = GenericLoader.from_filesystem(\n",
|
||||
" \"./example_data/source_code\",\n",
|
||||
" glob=\"*\",\n",
|
||||
" suffixes=[\".js\"],\n",
|
||||
" parser=LanguageParser(language=Language.JS)\n",
|
||||
")\n",
|
||||
"docs = loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"id": "c44c0d3f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.text_splitter import (\n",
|
||||
" RecursiveCharacterTextSplitter,\n",
|
||||
" Language,\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"id": "b1e0053d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"js_splitter = RecursiveCharacterTextSplitter.from_language(\n",
|
||||
" language=Language.JS, chunk_size=60, chunk_overlap=0\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"id": "7dbe6188",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"result = js_splitter.split_documents(docs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "8a80d089",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"7"
|
||||
]
|
||||
},
|
||||
"execution_count": 15,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"len(result)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"id": "000a6011",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"class MyClass {\n",
|
||||
" constructor(name) {\n",
|
||||
" this.name = name;\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"greet() {\n",
|
||||
" console.log(`Hello, ${this.name}!`);\n",
|
||||
" }\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"function main() {\n",
|
||||
" const name = prompt(\"Enter your name:\");\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"const obj = new MyClass(name);\n",
|
||||
" obj.greet();\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"// Code for: class MyClass {\n",
|
||||
"\n",
|
||||
"// Code for: function main() {\n",
|
||||
"\n",
|
||||
"--8<--\n",
|
||||
"\n",
|
||||
"main();\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(\"\\n\\n--8<--\\n\\n\".join([document.page_content for document in result]))"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.16"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -0,0 +1,116 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a634365e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Tencent COS Directory\n",
|
||||
"\n",
|
||||
"This covers how to load document objects from a `Tencent COS Directory`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "85e97267",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#! pip install cos-python-sdk-v5"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "2f0cd6a5",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.document_loaders import TencentCOSDirectoryLoader\n",
|
||||
"from qcloud_cos import CosConfig"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "321cc7f1",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"conf = CosConfig(\n",
|
||||
" Region=\"your cos region\",\n",
|
||||
" SecretId=\"your cos secret_id\",\n",
|
||||
" SecretKey=\"your cos secret_key\",\n",
|
||||
" )\n",
|
||||
"loader = TencentCOSDirectoryLoader(conf=conf, bucket=\"you_cos_bucket\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4c50d2c7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0690c40a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Specifying a prefix\n",
|
||||
"You can also specify a prefix for more finegrained control over what files to load."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "72d44781",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = TencentCOSDirectoryLoader(conf=conf, bucket=\"you_cos_bucket\", prefix=\"fake\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2d3c32db",
|
||||
"metadata": {
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader.load()"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.6"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -0,0 +1,91 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a634365e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Tencent COS File\n",
|
||||
"\n",
|
||||
"This covers how to load document object from a `Tencent COS File`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "85e97267",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#! pip install cos-python-sdk-v5"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "2f0cd6a5",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.document_loaders import TencentCOSFileLoader\n",
|
||||
"from qcloud_cos import CosConfig"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "321cc7f1",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"conf = CosConfig(\n",
|
||||
" Region=\"your cos region\",\n",
|
||||
" SecretId=\"your cos secret_id\",\n",
|
||||
" SecretKey=\"your cos secret_key\",\n",
|
||||
" )\n",
|
||||
"loader = TencentCOSFileLoader(conf=conf, bucket=\"you_cos_bucket\", key=\"fake.docx\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4c50d2c7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0690c40a",
|
||||
"metadata": {},
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.6"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -226,7 +226,6 @@
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"id": "8de9ef16",
|
||||
"metadata": {},
|
||||
@@ -303,7 +302,7 @@
|
||||
"source": [
|
||||
"## Unstructured API\n",
|
||||
"\n",
|
||||
"If you want to get up and running with less set up, you can simply run `pip install unstructured` and use `UnstructuredAPIFileLoader` or `UnstructuredAPIFileIOLoader`. That will process your document using the hosted Unstructured API. Note that currently (as of 11 May 2023) the Unstructured API is open, but it will soon require an API. The [Unstructured documentation](https://unstructured-io.github.io/) page will have instructions on how to generate an API key once they’re available. Check out the instructions [here](https://github.com/Unstructured-IO/unstructured-api#dizzy-instructions-for-using-the-docker-image) if you’d like to self-host the Unstructured API or run it locally."
|
||||
"If you want to get up and running with less set up, you can simply run `pip install unstructured` and use `UnstructuredAPIFileLoader` or `UnstructuredAPIFileIOLoader`. That will process your document using the hosted Unstructured API. You can generate a free Unstructured API key [here](https://www.unstructured.io/api-key/). The [Unstructured documentation](https://unstructured-io.github.io/) page will have instructions on how to generate an API key once they’re available. Check out the instructions [here](https://github.com/Unstructured-IO/unstructured-api#dizzy-instructions-for-using-the-docker-image) if you’d like to self-host the Unstructured API or run it locally."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -224,13 +224,33 @@
|
||||
"docs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"## Using proxies\n",
|
||||
"\n",
|
||||
"Sometimes you might need to use proxies to get around IP blocks. You can pass in a dictionary of proxies to the loader (and `requests` underneath) to use them."
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1dd8ab23",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
"source": [
|
||||
"loader = WebBaseLoader(\n",
|
||||
" \"https://www.walmart.com/search?q=parrots\", proxies={\n",
|
||||
" \"http\": \"http://{username}:{password}:@proxy.service.com:6666/\",\n",
|
||||
" \"https\": \"https://{username}:{password}:@proxy.service.com:6666/\"\n",
|
||||
" }\n",
|
||||
")\n",
|
||||
"docs = loader.load()\n"
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
|
||||
@@ -0,0 +1,214 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "8cc82b48",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# MultiQueryRetriever\n",
|
||||
"\n",
|
||||
"Distance-based vector database retrieval embeds (represents) queries in high-dimensional space and finds similar embedded documents based on \"distance\". But, retrieval may produce difference results with subtle changes in query wording or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious.\n",
|
||||
"\n",
|
||||
"The `MultiQueryRetriever` automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` might be able to overcome some of the limitations of the distance-based retrieval and get a richer set of results."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "c2f3f5f2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Build a sample vectorDB\n",
|
||||
"from langchain.vectorstores import Chroma\n",
|
||||
"from langchain.document_loaders import PyPDFLoader\n",
|
||||
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
|
||||
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
|
||||
"\n",
|
||||
"# Load PDF\n",
|
||||
"path=\"path-to-files\"\n",
|
||||
"loaders = [\n",
|
||||
" PyPDFLoader(path+\"docs/cs229_lectures/MachineLearning-Lecture01.pdf\"),\n",
|
||||
" PyPDFLoader(path+\"docs/cs229_lectures/MachineLearning-Lecture02.pdf\"),\n",
|
||||
" PyPDFLoader(path+\"docs/cs229_lectures/MachineLearning-Lecture03.pdf\")\n",
|
||||
"]\n",
|
||||
"docs = []\n",
|
||||
"for loader in loaders:\n",
|
||||
" docs.extend(loader.load())\n",
|
||||
" \n",
|
||||
"# Split\n",
|
||||
"text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1500,chunk_overlap = 150)\n",
|
||||
"splits = text_splitter.split_documents(docs)\n",
|
||||
"\n",
|
||||
"# VectorDB\n",
|
||||
"embedding = OpenAIEmbeddings()\n",
|
||||
"vectordb = Chroma.from_documents(documents=splits,embedding=embedding)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "cca8f56c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"`Simple usage`\n",
|
||||
"\n",
|
||||
"Specify the LLM to use for query generation, and the retriver will do the rest."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "edbca101",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.chat_models import ChatOpenAI\n",
|
||||
"from langchain.retrievers.multi_query import MultiQueryRetriever\n",
|
||||
"question=\"What does the course say about regression?\"\n",
|
||||
"num_queries=3\n",
|
||||
"llm = ChatOpenAI(temperature=0)\n",
|
||||
"retriever_from_llm = MultiQueryRetriever.from_llm(retriever=vectordb.as_retriever(),llm=llm)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "e5203612",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:Generated queries: [\"1. What is the course's perspective on regression?\", '2. How does the course discuss regression?', '3. What information does the course provide about regression?']\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"6"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"unique_docs = retriever_from_llm.get_relevant_documents(question=\"What does the course say about regression?\")\n",
|
||||
"len(unique_docs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c54a282f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"`Supplying your own prompt`\n",
|
||||
"\n",
|
||||
"You can also supply a prompt along with an output parser to split the results into a list of queries."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "d9afb0ca",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from typing import List\n",
|
||||
"from langchain import LLMChain\n",
|
||||
"from pydantic import BaseModel, Field\n",
|
||||
"from langchain.prompts import PromptTemplate\n",
|
||||
"from langchain.output_parsers import PydanticOutputParser\n",
|
||||
"\n",
|
||||
"# Output parser will split the LLM result into a list of queries\n",
|
||||
"class LineList(BaseModel):\n",
|
||||
" # \"lines\" is the key (attribute name) of the parsed output\n",
|
||||
" lines: List[str] = Field(description=\"Lines of text\")\n",
|
||||
"\n",
|
||||
"class LineListOutputParser(PydanticOutputParser):\n",
|
||||
" def __init__(self) -> None:\n",
|
||||
" super().__init__(pydantic_object=LineList)\n",
|
||||
" def parse(self, text: str) -> LineList:\n",
|
||||
" lines = text.strip().split(\"\\n\")\n",
|
||||
" return LineList(lines=lines)\n",
|
||||
"\n",
|
||||
"output_parser = LineListOutputParser()\n",
|
||||
" \n",
|
||||
"QUERY_PROMPT = PromptTemplate(\n",
|
||||
" input_variables=[\"question\"],\n",
|
||||
" template=\"\"\"You are an AI language model assistant. Your task is to generate five \n",
|
||||
" different versions of the given user question to retrieve relevant documents from a vector \n",
|
||||
" database. By generating multiple perspectives on the user question, your goal is to help\n",
|
||||
" the user overcome some of the limitations of the distance-based similarity search. \n",
|
||||
" Provide these alternative questions seperated by newlines.\n",
|
||||
" Original question: {question}\"\"\",\n",
|
||||
")\n",
|
||||
"llm = ChatOpenAI(temperature=0)\n",
|
||||
"\n",
|
||||
"# Chain\n",
|
||||
"llm_chain = LLMChain(llm=llm,prompt=QUERY_PROMPT,output_parser=output_parser)\n",
|
||||
" \n",
|
||||
"# Other inputs\n",
|
||||
"question=\"What does the course say about regression?\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "6660d7ee",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:Generated queries: [\"1. What is the course's perspective on regression?\", '2. Can you provide information on regression as discussed in the course?', '3. How does the course cover the topic of regression?', \"4. What are the course's teachings on regression?\", '5. In relation to the course, what is mentioned about regression?']\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"8"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Run\n",
|
||||
"retriever = MultiQueryRetriever(retriever=vectordb.as_retriever(), \n",
|
||||
" llm_chain=llm_chain,\n",
|
||||
" parser_key=\"lines\") # \"lines\" is the key (attribute name) of the parsed output\n",
|
||||
"\n",
|
||||
"# Results\n",
|
||||
"unique_docs = retriever.get_relevant_documents(question=\"What does the course say about regression?\")\n",
|
||||
"len(unique_docs)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.16"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -9,7 +9,7 @@ If you are just getting started, and you have relatively simple apis, you should
Chains are a sequence of predetermined steps, so they are good to get started with as they give you more control and let you
understand what is happening better.

- [API Chain](/docs/modules/chains/how_to/api.html)
- [API Chain](/docs/modules/chains/popular/api.html)

## Agents
@@ -29,6 +29,23 @@ class ZapierToolkit(BaseToolkit):
            ]
        return cls(tools=tools)

    @classmethod
    async def async_from_zapier_nla_wrapper(
        cls, zapier_nla_wrapper: ZapierNLAWrapper
    ) -> "ZapierToolkit":
        """Create a toolkit from a ZapierNLAWrapper."""
        actions = await zapier_nla_wrapper.alist()
        tools = [
            ZapierNLARunAction(
                action_id=action["id"],
                zapier_description=action["description"],
                params_schema=action["params"],
                api_wrapper=zapier_nla_wrapper,
            )
            for action in actions
        ]
        return cls(tools=tools)

    def get_tools(self) -> List[BaseTool]:
        """Get the tools in the toolkit."""
        return self.tools
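A brief sketch of how the new asynchronous constructor might be used (the surrounding setup mirrors the synchronous path and is an assumption, not part of this change):

```python
import asyncio

from langchain.agents.agent_toolkits import ZapierToolkit
from langchain.utilities.zapier import ZapierNLAWrapper


async def build_toolkit() -> ZapierToolkit:
    wrapper = ZapierNLAWrapper()  # reads ZAPIER_NLA_API_KEY from the environment
    # List the exposed actions and build the tools without blocking the event loop.
    return await ZapierToolkit.async_from_zapier_nla_wrapper(wrapper)


toolkit = asyncio.run(build_toolkit())
print([tool.name for tool in toolkit.get_tools()])
```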
@@ -96,7 +96,7 @@ def get_openai_token_cost_for_model(
            f"Unknown model: {model_name}. Please provide a valid OpenAI model name."
            "Known models are: " + ", ".join(MODEL_COST_PER_1K_TOKENS.keys())
        )
    return MODEL_COST_PER_1K_TOKENS[model_name] * num_tokens / 1000
    return MODEL_COST_PER_1K_TOKENS[model_name] * (num_tokens / 1000)


class OpenAICallbackHandler(BaseCallbackHandler):
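The two return expressions are numerically equivalent; the parentheses only make the per-1K-token pricing explicit. A quick sanity check with an assumed price entry:

```python
# Assumed price of $0.03 per 1K tokens, for illustration only.
MODEL_COST_PER_1K_TOKENS = {"gpt-4": 0.03}

def cost(model_name: str, num_tokens: int) -> float:
    return MODEL_COST_PER_1K_TOKENS[model_name] * (num_tokens / 1000)

print(cost("gpt-4", 500))   # 0.015 -> 500 tokens at $0.03 per 1K tokens
print(cost("gpt-4", 2000))  # 0.06
```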
langchain/callbacks/streaming_aiter_final_only.py (new file, 88 lines)
@@ -0,0 +1,88 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
from langchain.callbacks.streaming_aiter import AsyncIteratorCallbackHandler
|
||||
from langchain.schema import LLMResult
|
||||
|
||||
DEFAULT_ANSWER_PREFIX_TOKENS = ["Final", "Answer", ":"]
|
||||
|
||||
|
||||
class AsyncFinalIteratorCallbackHandler(AsyncIteratorCallbackHandler):
|
||||
"""Callback handler that returns an async iterator.
|
||||
Only the final output of the agent will be iterated.
|
||||
"""
|
||||
|
||||
def append_to_last_tokens(self, token: str) -> None:
|
||||
self.last_tokens.append(token)
|
||||
self.last_tokens_stripped.append(token.strip())
|
||||
if len(self.last_tokens) > len(self.answer_prefix_tokens):
|
||||
self.last_tokens.pop(0)
|
||||
self.last_tokens_stripped.pop(0)
|
||||
|
||||
def check_if_answer_reached(self) -> bool:
|
||||
if self.strip_tokens:
|
||||
return self.last_tokens_stripped == self.answer_prefix_tokens_stripped
|
||||
else:
|
||||
return self.last_tokens == self.answer_prefix_tokens
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
answer_prefix_tokens: Optional[List[str]] = None,
|
||||
strip_tokens: bool = True,
|
||||
stream_prefix: bool = False,
|
||||
) -> None:
|
||||
"""Instantiate AsyncFinalIteratorCallbackHandler.
|
||||
|
||||
Args:
|
||||
answer_prefix_tokens: Token sequence that prefixes the answer.
|
||||
Default is ["Final", "Answer", ":"]
|
||||
strip_tokens: Ignore white spaces and new lines when comparing
|
||||
answer_prefix_tokens to last tokens? (to determine if answer has been
|
||||
reached)
|
||||
stream_prefix: Should answer prefix itself also be streamed?
|
||||
"""
|
||||
super().__init__()
|
||||
if answer_prefix_tokens is None:
|
||||
self.answer_prefix_tokens = DEFAULT_ANSWER_PREFIX_TOKENS
|
||||
else:
|
||||
self.answer_prefix_tokens = answer_prefix_tokens
|
||||
if strip_tokens:
|
||||
self.answer_prefix_tokens_stripped = [
|
||||
token.strip() for token in self.answer_prefix_tokens
|
||||
]
|
||||
else:
|
||||
self.answer_prefix_tokens_stripped = self.answer_prefix_tokens
|
||||
self.last_tokens = [""] * len(self.answer_prefix_tokens)
|
||||
self.last_tokens_stripped = [""] * len(self.answer_prefix_tokens)
|
||||
self.strip_tokens = strip_tokens
|
||||
self.stream_prefix = stream_prefix
|
||||
self.answer_reached = False
|
||||
|
||||
async def on_llm_start(
|
||||
self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
|
||||
) -> None:
|
||||
# If two calls are made in a row, this resets the state
|
||||
self.done.clear()
|
||||
self.answer_reached = False
|
||||
|
||||
async def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
|
||||
if self.answer_reached:
|
||||
self.done.set()
|
||||
|
||||
async def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
|
||||
# Remember the last n tokens, where n = len(answer_prefix_tokens)
|
||||
self.append_to_last_tokens(token)
|
||||
|
||||
# Check if the last n tokens match the answer_prefix_tokens list ...
|
||||
if self.check_if_answer_reached():
|
||||
self.answer_reached = True
|
||||
if self.stream_prefix:
|
||||
for t in self.last_tokens:
|
||||
self.queue.put_nowait(t)
|
||||
return
|
||||
|
||||
# If yes, then put tokens from now on
|
||||
if self.answer_reached:
|
||||
self.queue.put_nowait(token)
|
||||
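For context, a sketch of how a final-answer-only async iterator handler like this is typically wired up (the agent and LLM setup here are assumptions for illustration, not part of the diff):

```python
import asyncio

from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.callbacks.streaming_aiter_final_only import (
    AsyncFinalIteratorCallbackHandler,
)
from langchain.llms import OpenAI


async def main() -> None:
    handler = AsyncFinalIteratorCallbackHandler()
    llm = OpenAI(temperature=0, streaming=True, callbacks=[handler])
    agent = initialize_agent(
        load_tools(["llm-math"], llm=llm),
        llm,
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    )
    task = asyncio.create_task(agent.arun("What is 2 raised to the 10th power?"))
    # Tokens are only yielded once the "Final Answer:" prefix has been seen.
    async for token in handler.aiter():
        print(token, end="", flush=True)
    await task


asyncio.run(main())
```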
@@ -5,6 +5,7 @@ from uuid import UUID
|
||||
|
||||
from langchainplus_sdk import LangChainPlusClient, RunEvaluator
|
||||
|
||||
from langchain.callbacks.manager import tracing_v2_enabled
|
||||
from langchain.callbacks.tracers.base import BaseTracer
|
||||
from langchain.callbacks.tracers.schemas import Run
|
||||
|
||||
@@ -47,6 +48,7 @@ class EvaluatorCallbackHandler(BaseTracer):
|
||||
max_workers: Optional[int] = None,
|
||||
client: Optional[LangChainPlusClient] = None,
|
||||
example_id: Optional[Union[UUID, str]] = None,
|
||||
project_name: Optional[str] = None,
|
||||
**kwargs: Any
|
||||
) -> None:
|
||||
super().__init__(**kwargs)
|
||||
@@ -59,6 +61,23 @@ class EvaluatorCallbackHandler(BaseTracer):
|
||||
max_workers=max(max_workers or len(evaluators), 1)
|
||||
)
|
||||
self.futures: Set[Future] = set()
|
||||
self.project_name = project_name
|
||||
|
||||
def _evaluate_in_project(self, run: Run, evaluator: RunEvaluator) -> None:
|
||||
"""Evaluate the run in the project.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
run : Run
|
||||
The run to be evaluated.
|
||||
evaluator : RunEvaluator
|
||||
The evaluator to use for evaluating the run.
|
||||
|
||||
"""
|
||||
if self.project_name is None:
|
||||
return self.client.evaluate_run(run, evaluator)
|
||||
with tracing_v2_enabled(project_name=self.project_name):
|
||||
return self.client.evaluate_run(run, evaluator)
|
||||
|
||||
def _persist_run(self, run: Run) -> None:
|
||||
"""Run the evaluator on the run.
|
||||
@@ -73,7 +92,7 @@ class EvaluatorCallbackHandler(BaseTracer):
|
||||
run_.reference_example_id = self.example_id
|
||||
for evaluator in self.evaluators:
|
||||
self.futures.add(
|
||||
self.executor.submit(self.client.evaluate_run, run_, evaluator)
|
||||
self.executor.submit(self._evaluate_in_project, run_, evaluator)
|
||||
)
|
||||
|
||||
def wait_for_futures(self) -> None:
|
||||
|
||||
@@ -157,7 +157,13 @@ def openapi_spec_to_openai_fn(
|
||||
"url": api_op.base_url + api_op.path,
|
||||
}
|
||||
|
||||
def default_call_api(name: str, fn_args: dict, **kwargs: Any) -> Any:
|
||||
def default_call_api(
|
||||
name: str,
|
||||
fn_args: dict,
|
||||
headers: Optional[dict] = None,
|
||||
params: Optional[dict] = None,
|
||||
**kwargs: Any,
|
||||
) -> Any:
|
||||
method = _name_to_call_map[name]["method"]
|
||||
url = _name_to_call_map[name]["url"]
|
||||
path_params = fn_args.pop("path_params", {})
|
||||
@@ -165,6 +171,16 @@ def openapi_spec_to_openai_fn(
|
||||
if "data" in fn_args and isinstance(fn_args["data"], dict):
|
||||
fn_args["data"] = json.dumps(fn_args["data"])
|
||||
_kwargs = {**fn_args, **kwargs}
|
||||
if headers is not None:
|
||||
if "headers" in _kwargs:
|
||||
_kwargs["headers"].update(headers)
|
||||
else:
|
||||
_kwargs["headers"] = headers
|
||||
if params is not None:
|
||||
if "params" in _kwargs:
|
||||
_kwargs["params"].update(params)
|
||||
else:
|
||||
_kwargs["params"] = params
|
||||
return requests.request(method, url, **_kwargs)
|
||||
|
||||
return functions, default_call_api
|
||||
@@ -218,6 +234,8 @@ def get_openapi_chain(
|
||||
request_chain: Optional[Chain] = None,
|
||||
llm_kwargs: Optional[Dict] = None,
|
||||
verbose: bool = False,
|
||||
headers: Optional[Dict] = None,
|
||||
params: Optional[Dict] = None,
|
||||
**kwargs: Any,
|
||||
) -> SequentialChain:
|
||||
"""Create a chain for querying an API from a OpenAPI spec.
|
||||
@@ -259,7 +277,10 @@ def get_openapi_chain(
|
||||
**(llm_kwargs or {}),
|
||||
)
|
||||
request_chain = request_chain or SimpleRequestChain(
|
||||
request_method=call_api_fn, verbose=verbose
|
||||
request_method=lambda name, args: call_api_fn(
|
||||
name, args, headers=headers, params=params
|
||||
),
|
||||
verbose=verbose,
|
||||
)
|
||||
return SequentialChain(
|
||||
chains=[llm_chain, request_chain],
|
||||
|
||||
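A minimal sketch of how the new `headers` and `params` arguments might be passed through (the spec URL, header value, and question are placeholders; an OpenAI functions-capable model and `OPENAI_API_KEY` are assumed):

```python
from langchain.chains.openai_functions.openapi import get_openapi_chain

# Placeholder spec URL and credentials; the chain forwards headers/params to every API call.
chain = get_openapi_chain(
    "https://example.com/openapi.yaml",
    headers={"Authorization": "Bearer <token>"},   # sent with each request
    params={"api-version": "2023-06-01"},          # merged into each request's query string
)
chain.run("List the first five widgets.")
```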
@@ -296,12 +296,14 @@ async def _callbacks_initializer(
|
||||
project_name: Optional[str],
|
||||
client: LangChainPlusClient,
|
||||
run_evaluators: Sequence[RunEvaluator],
|
||||
evaluation_handler_collector: List[EvaluatorCallbackHandler],
|
||||
) -> List[BaseTracer]:
|
||||
"""
|
||||
Initialize a tracer to share across tasks.
|
||||
|
||||
Args:
|
||||
project_name: The project name for the tracer.
|
||||
client: The client to use for the tracer.
|
||||
|
||||
Returns:
|
||||
A LangChainTracer instance with an active project.
|
||||
@@ -309,15 +311,17 @@ async def _callbacks_initializer(
|
||||
callbacks: List[BaseTracer] = []
|
||||
if project_name:
|
||||
callbacks.append(LangChainTracer(project_name=project_name))
|
||||
evaluator_project_name = f"{project_name}-evaluators" if project_name else None
|
||||
if run_evaluators:
|
||||
callbacks.append(
|
||||
EvaluatorCallbackHandler(
|
||||
client=client,
|
||||
evaluators=run_evaluators,
|
||||
# We already have concurrency, don't want to overload the machine
|
||||
max_workers=1,
|
||||
)
|
||||
callback = EvaluatorCallbackHandler(
|
||||
client=client,
|
||||
evaluators=run_evaluators,
|
||||
# We already have concurrency, don't want to overload the machine
|
||||
max_workers=1,
|
||||
project_name=evaluator_project_name,
|
||||
)
|
||||
callbacks.append(callback)
|
||||
evaluation_handler_collector.append(callback)
|
||||
return callbacks
|
||||
|
||||
|
||||
@@ -362,9 +366,6 @@ async def arun_on_examples(
|
||||
client_.create_project(project_name, mode="eval")
|
||||
|
||||
results: Dict[str, List[Any]] = {}
|
||||
evaluation_handler = EvaluatorCallbackHandler(
|
||||
evaluators=run_evaluators or [], client=client_
|
||||
)
|
||||
|
||||
async def process_example(
|
||||
example: Example, callbacks: List[BaseCallbackHandler], job_state: dict
|
||||
@@ -386,17 +387,20 @@ async def arun_on_examples(
|
||||
flush=True,
|
||||
)
|
||||
|
||||
evaluation_handlers: List[EvaluatorCallbackHandler] = []
|
||||
await _gather_with_concurrency(
|
||||
concurrency_level,
|
||||
functools.partial(
|
||||
_callbacks_initializer,
|
||||
project_name=project_name,
|
||||
client=client_,
|
||||
evaluation_handler_collector=evaluation_handlers,
|
||||
run_evaluators=run_evaluators or [],
|
||||
),
|
||||
*(functools.partial(process_example, e) for e in examples),
|
||||
)
|
||||
evaluation_handler.wait_for_futures()
|
||||
for handler in evaluation_handlers:
|
||||
handler.wait_for_futures()
|
||||
return results
|
||||
|
||||
|
||||
@@ -537,8 +541,11 @@ def run_on_examples(
|
||||
client_ = client or LangChainPlusClient()
|
||||
client_.create_project(project_name, mode="eval")
|
||||
tracer = LangChainTracer(project_name=project_name)
|
||||
evaluator_project_name = f"{project_name}-evaluators"
|
||||
evalution_handler = EvaluatorCallbackHandler(
|
||||
evaluators=run_evaluators or [], client=client_
|
||||
evaluators=run_evaluators or [],
|
||||
client=client_,
|
||||
project_name=evaluator_project_name,
|
||||
)
|
||||
callbacks: List[BaseCallbackHandler] = [tracer, evalution_handler]
|
||||
for i, example in enumerate(examples):
|
||||
|
||||
@@ -63,6 +63,7 @@ from langchain.document_loaders.imsdb import IMSDbLoader
|
||||
from langchain.document_loaders.iugu import IuguLoader
|
||||
from langchain.document_loaders.joplin import JoplinLoader
|
||||
from langchain.document_loaders.json_loader import JSONLoader
|
||||
from langchain.document_loaders.larksuite import LarkSuiteDocLoader
|
||||
from langchain.document_loaders.markdown import UnstructuredMarkdownLoader
|
||||
from langchain.document_loaders.mastodon import MastodonTootsLoader
|
||||
from langchain.document_loaders.max_compute import MaxComputeLoader
|
||||
@@ -78,6 +79,7 @@ from langchain.document_loaders.odt import UnstructuredODTLoader
|
||||
from langchain.document_loaders.onedrive import OneDriveLoader
|
||||
from langchain.document_loaders.onedrive_file import OneDriveFileLoader
|
||||
from langchain.document_loaders.open_city_data import OpenCityDataLoader
|
||||
from langchain.document_loaders.org_mode import UnstructuredOrgModeLoader
|
||||
from langchain.document_loaders.pdf import (
|
||||
MathpixPDFLoader,
|
||||
OnlinePDFLoader,
|
||||
@@ -112,6 +114,8 @@ from langchain.document_loaders.telegram import (
|
||||
TelegramChatApiLoader,
|
||||
TelegramChatFileLoader,
|
||||
)
|
||||
from langchain.document_loaders.tencent_cos_directory import TencentCOSDirectoryLoader
|
||||
from langchain.document_loaders.tencent_cos_file import TencentCOSFileLoader
|
||||
from langchain.document_loaders.text import TextLoader
|
||||
from langchain.document_loaders.tomarkdown import ToMarkdownLoader
|
||||
from langchain.document_loaders.toml import TomlLoader
|
||||
@@ -201,6 +205,7 @@ __all__ = [
|
||||
"IuguLoader",
|
||||
"JSONLoader",
|
||||
"JoplinLoader",
|
||||
"LarkSuiteDocLoader",
|
||||
"MWDumpLoader",
|
||||
"MastodonTootsLoader",
|
||||
"MathpixPDFLoader",
|
||||
@@ -242,6 +247,8 @@ __all__ = [
|
||||
"SnowflakeLoader",
|
||||
"SpreedlyLoader",
|
||||
"StripeLoader",
|
||||
"TencentCOSDirectoryLoader",
|
||||
"TencentCOSFileLoader",
|
||||
"TelegramChatApiLoader",
|
||||
"TelegramChatFileLoader",
|
||||
"TelegramChatLoader",
|
||||
@@ -262,6 +269,7 @@ __all__ = [
|
||||
"UnstructuredImageLoader",
|
||||
"UnstructuredMarkdownLoader",
|
||||
"UnstructuredODTLoader",
|
||||
"UnstructuredOrgModeLoader",
|
||||
"UnstructuredPDFLoader",
|
||||
"UnstructuredPowerPointLoader",
|
||||
"UnstructuredRSTLoader",
|
||||
|
||||
langchain/document_loaders/larksuite.py (new file, 46 lines)
@@ -0,0 +1,46 @@
|
||||
"""Loader that loads LarkSuite (FeiShu) document json dump."""
|
||||
import json
|
||||
import urllib.request
|
||||
from typing import Any, Iterator, List
|
||||
|
||||
from langchain.docstore.document import Document
|
||||
from langchain.document_loaders.base import BaseLoader
|
||||
|
||||
|
||||
class LarkSuiteDocLoader(BaseLoader):
|
||||
"""Loader that loads LarkSuite (FeiShu) document."""
|
||||
|
||||
def __init__(self, domain: str, access_token: str, document_id: str):
|
||||
"""Initialize with domain, access_token (tenant / user), and document_id."""
|
||||
self.domain = domain
|
||||
self.access_token = access_token
|
||||
self.document_id = document_id
|
||||
|
||||
def _get_larksuite_api_json_data(self, api_url: str) -> Any:
|
||||
"""Get LarkSuite (FeiShu) API response json data."""
|
||||
headers = {"Authorization": f"Bearer {self.access_token}"}
|
||||
request = urllib.request.Request(api_url, headers=headers)
|
||||
with urllib.request.urlopen(request) as response:
|
||||
json_data = json.loads(response.read().decode())
|
||||
return json_data
|
||||
|
||||
def lazy_load(self) -> Iterator[Document]:
|
||||
"""Lazy load LarkSuite (FeiShu) document."""
|
||||
api_url_prefix = f"{self.domain}/open-apis/docx/v1/documents"
|
||||
metadata_json = self._get_larksuite_api_json_data(
|
||||
f"{api_url_prefix}/{self.document_id}"
|
||||
)
|
||||
raw_content_json = self._get_larksuite_api_json_data(
|
||||
f"{api_url_prefix}/{self.document_id}/raw_content"
|
||||
)
|
||||
text = raw_content_json["data"]["content"]
|
||||
metadata = {
|
||||
"document_id": self.document_id,
|
||||
"revision_id": metadata_json["data"]["document"]["revision_id"],
|
||||
"title": metadata_json["data"]["document"]["title"],
|
||||
}
|
||||
yield Document(page_content=text, metadata=metadata)
|
||||
|
||||
def load(self) -> List[Document]:
|
||||
"""Load LarkSuite (FeiShu) document."""
|
||||
return list(self.lazy_load())
|
||||
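For reference, a minimal usage sketch of the LarkSuiteDocLoader added above; the domain, access token, and document id are placeholders, not values taken from this change:

from langchain.document_loaders import LarkSuiteDocLoader

# Placeholder values; supply your own tenant domain, token, and document id.
loader = LarkSuiteDocLoader(
    domain="https://open.larksuite.com",
    access_token="<tenant-or-user-access-token>",
    document_id="<document-id>",
)
docs = loader.load()  # one Document; metadata carries document_id, revision_id, title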
22
langchain/document_loaders/org_mode.py
Normal file
@@ -0,0 +1,22 @@
|
||||
"""Loader that loads Org-Mode files."""
|
||||
from typing import Any, List
|
||||
|
||||
from langchain.document_loaders.unstructured import (
|
||||
UnstructuredFileLoader,
|
||||
validate_unstructured_version,
|
||||
)
|
||||
|
||||
|
||||
class UnstructuredOrgModeLoader(UnstructuredFileLoader):
|
||||
"""Loader that uses unstructured to load Org-Mode files."""
|
||||
|
||||
def __init__(
|
||||
self, file_path: str, mode: str = "single", **unstructured_kwargs: Any
|
||||
):
|
||||
validate_unstructured_version(min_unstructured_version="0.7.9")
|
||||
super().__init__(file_path=file_path, mode=mode, **unstructured_kwargs)
|
||||
|
||||
def _get_elements(self) -> List:
|
||||
from unstructured.partition.org import partition_org
|
||||
|
||||
return partition_org(filename=self.file_path, **self.unstructured_kwargs)
|
||||
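A minimal usage sketch for the new Org-Mode loader, assuming an example file named README.org and unstructured>=0.7.9 installed:

from langchain.document_loaders import UnstructuredOrgModeLoader

loader = UnstructuredOrgModeLoader("README.org", mode="elements")  # path is illustrative
docs = loader.load()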
@@ -1,5 +1,6 @@
|
||||
from langchain.document_loaders.parsers.audio import OpenAIWhisperParser
|
||||
from langchain.document_loaders.parsers.html import BS4HTMLParser
|
||||
from langchain.document_loaders.parsers.language import LanguageParser
|
||||
from langchain.document_loaders.parsers.pdf import (
|
||||
PDFMinerParser,
|
||||
PDFPlumberParser,
|
||||
@@ -10,6 +11,7 @@ from langchain.document_loaders.parsers.pdf import (
|
||||
|
||||
__all__ = [
|
||||
"BS4HTMLParser",
|
||||
"LanguageParser",
|
||||
"OpenAIWhisperParser",
|
||||
"PDFMinerParser",
|
||||
"PDFPlumberParser",
|
||||
|
||||
3
langchain/document_loaders/parsers/language/__init__.py
Normal file
@@ -0,0 +1,3 @@
|
||||
from langchain.document_loaders.parsers.language.language_parser import LanguageParser
|
||||
|
||||
__all__ = ["LanguageParser"]
|
||||
@@ -0,0 +1,18 @@
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import List
|
||||
|
||||
|
||||
class CodeSegmenter(ABC):
|
||||
def __init__(self, code: str):
|
||||
self.code = code
|
||||
|
||||
def is_valid(self) -> bool:
|
||||
return True
|
||||
|
||||
@abstractmethod
|
||||
def simplify_code(self) -> str:
|
||||
raise NotImplementedError # pragma: no cover
|
||||
|
||||
@abstractmethod
|
||||
def extract_functions_classes(self) -> List[str]:
|
||||
raise NotImplementedError # pragma: no cover
|
||||
65
langchain/document_loaders/parsers/language/javascript.py
Normal file
@@ -0,0 +1,65 @@
|
||||
from typing import Any, List
|
||||
|
||||
from langchain.document_loaders.parsers.language.code_segmenter import CodeSegmenter
|
||||
|
||||
|
||||
class JavaScriptSegmenter(CodeSegmenter):
|
||||
def __init__(self, code: str):
|
||||
super().__init__(code)
|
||||
self.source_lines = self.code.splitlines()
|
||||
|
||||
try:
|
||||
import esprima # noqa: F401
|
||||
except ImportError:
|
||||
raise ImportError(
|
||||
"Could not import esprima Python package. "
|
||||
"Please install it with `pip install esprima`."
|
||||
)
|
||||
|
||||
def is_valid(self) -> bool:
|
||||
import esprima
|
||||
|
||||
try:
|
||||
esprima.parseScript(self.code)
|
||||
return True
|
||||
except esprima.Error:
|
||||
return False
|
||||
|
||||
def _extract_code(self, node: Any) -> str:
|
||||
start = node.loc.start.line - 1
|
||||
end = node.loc.end.line
|
||||
return "\n".join(self.source_lines[start:end])
|
||||
|
||||
def extract_functions_classes(self) -> List[str]:
|
||||
import esprima
|
||||
|
||||
tree = esprima.parseScript(self.code, loc=True)
|
||||
functions_classes = []
|
||||
|
||||
for node in tree.body:
|
||||
if isinstance(
|
||||
node,
|
||||
(esprima.nodes.FunctionDeclaration, esprima.nodes.ClassDeclaration),
|
||||
):
|
||||
functions_classes.append(self._extract_code(node))
|
||||
|
||||
return functions_classes
|
||||
|
||||
def simplify_code(self) -> str:
|
||||
import esprima
|
||||
|
||||
tree = esprima.parseScript(self.code, loc=True)
|
||||
simplified_lines = self.source_lines[:]
|
||||
|
||||
for node in tree.body:
|
||||
if isinstance(
|
||||
node,
|
||||
(esprima.nodes.FunctionDeclaration, esprima.nodes.ClassDeclaration),
|
||||
):
|
||||
start = node.loc.start.line - 1
|
||||
simplified_lines[start] = f"// Code for: {simplified_lines[start]}"
|
||||
|
||||
for line_num in range(start + 1, node.loc.end.line):
|
||||
simplified_lines[line_num] = None # type: ignore
|
||||
|
||||
return "\n".join(line for line in simplified_lines if line is not None)
|
||||
143
langchain/document_loaders/parsers/language/language_parser.py
Normal file
@@ -0,0 +1,143 @@
|
||||
from typing import Any, Dict, Iterator, Optional
|
||||
|
||||
from langchain.docstore.document import Document
|
||||
from langchain.document_loaders.base import BaseBlobParser
|
||||
from langchain.document_loaders.blob_loaders import Blob
|
||||
from langchain.document_loaders.parsers.language.javascript import JavaScriptSegmenter
|
||||
from langchain.document_loaders.parsers.language.python import PythonSegmenter
|
||||
from langchain.text_splitter import Language
|
||||
|
||||
LANGUAGE_EXTENSIONS: Dict[str, str] = {
|
||||
"py": Language.PYTHON,
|
||||
"js": Language.JS,
|
||||
}
|
||||
|
||||
LANGUAGE_SEGMENTERS: Dict[str, Any] = {
|
||||
Language.PYTHON: PythonSegmenter,
|
||||
Language.JS: JavaScriptSegmenter,
|
||||
}
|
||||
|
||||
|
||||
class LanguageParser(BaseBlobParser):
|
||||
"""
|
||||
Language parser that splits code using the respective language syntax.
|
||||
|
||||
Each top-level function and class in the code is loaded into separate documents.
|
||||
Furthermore, an extra document is generated, containing the remaining top-level code
|
||||
that excludes the already segmented functions and classes.
|
||||
|
||||
This approach can potentially improve the accuracy of QA models over source code.
|
||||
|
||||
Currently, the supported languages for code parsing are Python and JavaScript.
|
||||
|
||||
The language used for parsing can be configured, along with the minimum number of
|
||||
lines required to activate the splitting based on syntax.
|
||||
|
||||
Examples:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from langchain.text_splitter import Language
|
||||
from langchain.document_loaders.generic import GenericLoader
|
||||
from langchain.document_loaders.parsers import LanguageParser
|
||||
|
||||
loader = GenericLoader.from_filesystem(
|
||||
"./code",
|
||||
glob="**/*",
|
||||
suffixes=[".py", ".js"],
|
||||
parser=LanguageParser()
|
||||
)
|
||||
docs = loader.load()
|
||||
|
||||
Example instantiations to manually select the language:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from langchain.text_splitter import Language
|
||||
|
||||
loader = GenericLoader.from_filesystem(
|
||||
"./code",
|
||||
glob="**/*",
|
||||
suffixes=[".py"],
|
||||
parser=LanguageParser(language=Language.PYTHON)
|
||||
)
|
||||
|
||||
Example instantiations to set number of lines threshold:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
loader = GenericLoader.from_filesystem(
|
||||
"./code",
|
||||
glob="**/*",
|
||||
suffixes=[".py"],
|
||||
parser=LanguageParser(parser_threshold=200)
|
||||
)
|
||||
"""
|
||||
|
||||
def __init__(self, language: Optional[Language] = None, parser_threshold: int = 0):
|
||||
"""
|
||||
Language parser that splits code using the respective language syntax.
|
||||
|
||||
Args:
|
||||
language: If None (default), it will try to infer language from source.
|
||||
parser_threshold: Minimum lines needed to activate parsing (0 by default).
|
||||
"""
|
||||
self.language = language
|
||||
self.parser_threshold = parser_threshold
|
||||
|
||||
def lazy_parse(self, blob: Blob) -> Iterator[Document]:
|
||||
code = blob.as_string()
|
||||
|
||||
language = self.language or (
|
||||
LANGUAGE_EXTENSIONS.get(blob.source.rsplit(".", 1)[-1])
|
||||
if isinstance(blob.source, str)
|
||||
else None
|
||||
)
|
||||
|
||||
if language is None:
|
||||
yield Document(
|
||||
page_content=code,
|
||||
metadata={
|
||||
"source": blob.source,
|
||||
},
|
||||
)
|
||||
return
|
||||
|
||||
if self.parser_threshold >= len(code.splitlines()):
|
||||
yield Document(
|
||||
page_content=code,
|
||||
metadata={
|
||||
"source": blob.source,
|
||||
"language": language,
|
||||
},
|
||||
)
|
||||
return
|
||||
|
||||
self.Segmenter = LANGUAGE_SEGMENTERS[language]
|
||||
segmenter = self.Segmenter(blob.as_string())
|
||||
if not segmenter.is_valid():
|
||||
yield Document(
|
||||
page_content=code,
|
||||
metadata={
|
||||
"source": blob.source,
|
||||
},
|
||||
)
|
||||
return
|
||||
|
||||
for functions_classes in segmenter.extract_functions_classes():
|
||||
yield Document(
|
||||
page_content=functions_classes,
|
||||
metadata={
|
||||
"source": blob.source,
|
||||
"content_type": "functions_classes",
|
||||
"language": language,
|
||||
},
|
||||
)
|
||||
yield Document(
|
||||
page_content=segmenter.simplify_code(),
|
||||
metadata={
|
||||
"source": blob.source,
|
||||
"content_type": "simplified_code",
|
||||
"language": language,
|
||||
},
|
||||
)
|
||||
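In addition to the instantiation examples in the docstring above, here is a sketch of what the parser yields; the ./code directory is assumed to exist:

from langchain.document_loaders.generic import GenericLoader
from langchain.document_loaders.parsers import LanguageParser

loader = GenericLoader.from_filesystem(
    "./code", glob="**/*", suffixes=[".py"], parser=LanguageParser()
)
for doc in loader.load():
    # one "functions_classes" document per top-level def/class, plus a single
    # "simplified_code" document holding the remaining module-level code
    print(doc.metadata.get("content_type"), doc.metadata["source"])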
47
langchain/document_loaders/parsers/language/python.py
Normal file
@@ -0,0 +1,47 @@
|
||||
import ast
|
||||
from typing import Any, List
|
||||
|
||||
from langchain.document_loaders.parsers.language.code_segmenter import CodeSegmenter
|
||||
|
||||
|
||||
class PythonSegmenter(CodeSegmenter):
|
||||
def __init__(self, code: str):
|
||||
super().__init__(code)
|
||||
self.source_lines = self.code.splitlines()
|
||||
|
||||
def is_valid(self) -> bool:
|
||||
try:
|
||||
ast.parse(self.code)
|
||||
return True
|
||||
except SyntaxError:
|
||||
return False
|
||||
|
||||
def _extract_code(self, node: Any) -> str:
|
||||
start = node.lineno - 1
|
||||
end = node.end_lineno
|
||||
return "\n".join(self.source_lines[start:end])
|
||||
|
||||
def extract_functions_classes(self) -> List[str]:
|
||||
tree = ast.parse(self.code)
|
||||
functions_classes = []
|
||||
|
||||
for node in ast.iter_child_nodes(tree):
|
||||
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
|
||||
functions_classes.append(self._extract_code(node))
|
||||
|
||||
return functions_classes
|
||||
|
||||
def simplify_code(self) -> str:
|
||||
tree = ast.parse(self.code)
|
||||
simplified_lines = self.source_lines[:]
|
||||
|
||||
for node in ast.iter_child_nodes(tree):
|
||||
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
|
||||
start = node.lineno - 1
|
||||
simplified_lines[start] = f"# Code for: {simplified_lines[start]}"
|
||||
|
||||
assert isinstance(node.end_lineno, int)
|
||||
for line_num in range(start + 1, node.end_lineno):
|
||||
simplified_lines[line_num] = None # type: ignore
|
||||
|
||||
return "\n".join(line for line in simplified_lines if line is not None)
|
||||
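A small sketch showing what the segmenter above produces for a trivial module:

from langchain.document_loaders.parsers.language.python import PythonSegmenter

code = "def greet(name):\n    return 'Hello ' + name\n\nprint(greet('World'))\n"
seg = PythonSegmenter(code)
seg.extract_functions_classes()  # ["def greet(name):\n    return 'Hello ' + name"]
seg.simplify_code()              # "# Code for: def greet(name):\n\nprint(greet('World'))"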
@@ -1,5 +1,5 @@
|
||||
"""Loader that loads documents from Psychic.dev."""
|
||||
from typing import List
|
||||
from typing import List, Optional
|
||||
|
||||
from langchain.docstore.document import Document
|
||||
from langchain.document_loaders.base import BaseLoader
|
||||
@@ -8,8 +8,10 @@ from langchain.document_loaders.base import BaseLoader
|
||||
class PsychicLoader(BaseLoader):
|
||||
"""Loader that loads documents from Psychic.dev."""
|
||||
|
||||
def __init__(self, api_key: str, connector_id: str, connection_id: str):
|
||||
"""Initialize with API key, connector id, and connection id."""
|
||||
def __init__(
|
||||
self, api_key: str, account_id: str, connector_id: Optional[str] = None
|
||||
):
|
||||
"""Initialize with API key, connector id, and account id."""
|
||||
|
||||
try:
|
||||
from psychicapi import ConnectorId, Psychic # noqa: F401
|
||||
@@ -19,16 +21,18 @@ class PsychicLoader(BaseLoader):
|
||||
)
|
||||
self.psychic = Psychic(secret_key=api_key)
|
||||
self.connector_id = ConnectorId(connector_id)
|
||||
self.connection_id = connection_id
|
||||
self.account_id = account_id
|
||||
|
||||
def load(self) -> List[Document]:
|
||||
"""Load documents."""
|
||||
|
||||
psychic_docs = self.psychic.get_documents(self.connector_id, self.connection_id)
|
||||
psychic_docs = self.psychic.get_documents(
|
||||
connector_id=self.connector_id, account_id=self.account_id
|
||||
)
|
||||
return [
|
||||
Document(
|
||||
page_content=doc["content"],
|
||||
metadata={"title": doc["title"], "source": doc["uri"]},
|
||||
)
|
||||
for doc in psychic_docs
|
||||
for doc in psychic_docs.documents
|
||||
]
|
||||
|
||||
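With the signature change above, usage looks roughly like the sketch below; account_id replaces the old connection_id argument, and the connector value is illustrative:

from langchain.document_loaders import PsychicLoader

loader = PsychicLoader(
    api_key="<psychic-secret-key>",
    account_id="<account-id>",
    connector_id="notion",  # illustrative; use whichever connector your account exposes
)
docs = loader.load()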
50
langchain/document_loaders/tencent_cos_directory.py
Normal file
@@ -0,0 +1,50 @@
|
||||
"""Loading logic for loading documents from Tencent Cloud COS directory."""
|
||||
from typing import Any, Iterator, List
|
||||
|
||||
from langchain.docstore.document import Document
|
||||
from langchain.document_loaders.base import BaseLoader
|
||||
from langchain.document_loaders.tencent_cos_file import TencentCOSFileLoader
|
||||
|
||||
|
||||
class TencentCOSDirectoryLoader(BaseLoader):
|
||||
"""Loading logic for loading documents from Tencent Cloud COS."""
|
||||
|
||||
def __init__(self, conf: Any, bucket: str, prefix: str = ""):
|
||||
"""Initialize with COS config, bucket and prefix.
|
||||
:param conf(CosConfig): COS config.
|
||||
:param bucket(str): COS bucket.
|
||||
:param prefix(str): prefix.
|
||||
"""
|
||||
self.conf = conf
|
||||
self.bucket = bucket
|
||||
self.prefix = prefix
|
||||
|
||||
def load(self) -> List[Document]:
|
||||
return list(self.lazy_load())
|
||||
|
||||
def lazy_load(self) -> Iterator[Document]:
|
||||
"""Load documents."""
|
||||
try:
|
||||
from qcloud_cos import CosS3Client
|
||||
except ImportError:
|
||||
raise ValueError(
|
||||
"Could not import cos-python-sdk-v5 python package. "
|
||||
"Please install it with `pip install cos-python-sdk-v5`."
|
||||
)
|
||||
client = CosS3Client(self.conf)
|
||||
contents = []
|
||||
marker = ""
|
||||
while True:
|
||||
response = client.list_objects(
|
||||
Bucket=self.bucket, Prefix=self.prefix, Marker=marker, MaxKeys=1000
|
||||
)
|
||||
if "Contents" in response:
|
||||
contents.extend(response["Contents"])
|
||||
if response["IsTruncated"] == "false":
|
||||
break
|
||||
marker = response["NextMarker"]
|
||||
for content in contents:
|
||||
if content["Key"].endswith("/"):
|
||||
continue
|
||||
loader = TencentCOSFileLoader(self.conf, self.bucket, content["Key"])
|
||||
yield loader.load()[0]
|
||||
48
langchain/document_loaders/tencent_cos_file.py
Normal file
@@ -0,0 +1,48 @@
|
||||
"""Loading logic for loading documents from Tencent Cloud COS file."""
|
||||
import os
|
||||
import tempfile
|
||||
from typing import Any, Iterator, List
|
||||
|
||||
from langchain.docstore.document import Document
|
||||
from langchain.document_loaders.base import BaseLoader
|
||||
from langchain.document_loaders.unstructured import UnstructuredFileLoader
|
||||
|
||||
|
||||
class TencentCOSFileLoader(BaseLoader):
|
||||
"""Loading logic for loading documents from Tencent Cloud COS."""
|
||||
|
||||
def __init__(self, conf: Any, bucket: str, key: str):
|
||||
"""Initialize with COS config, bucket and key name.
|
||||
:param conf(CosConfig): COS config.
|
||||
:param bucket(str): COS bucket.
|
||||
:param key(str): COS file key.
|
||||
"""
|
||||
self.conf = conf
|
||||
self.bucket = bucket
|
||||
self.key = key
|
||||
|
||||
def load(self) -> List[Document]:
|
||||
return list(self.lazy_load())
|
||||
|
||||
def lazy_load(self) -> Iterator[Document]:
|
||||
"""Load documents."""
|
||||
try:
|
||||
from qcloud_cos import CosS3Client
|
||||
except ImportError:
|
||||
raise ValueError(
|
||||
"Could not import cos-python-sdk-v5 python package. "
|
||||
"Please install it with `pip install cos-python-sdk-v5`."
|
||||
)
|
||||
|
||||
# Initialise a client
|
||||
client = CosS3Client(self.conf)
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
file_path = f"{temp_dir}/{self.bucket}/{self.key}"
|
||||
os.makedirs(os.path.dirname(file_path), exist_ok=True)
|
||||
# Download the file to a destination
|
||||
client.download_file(
|
||||
Bucket=self.bucket, Key=self.key, DestFilePath=file_path
|
||||
)
|
||||
loader = UnstructuredFileLoader(file_path)
|
||||
# UnstructuredFileLoader not implement lazy_load yet
|
||||
return iter(loader.load())
|
||||
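A hedged usage sketch for the two new COS loaders; the CosConfig arguments follow the cos-python-sdk-v5 convention, and the region, bucket, and credentials are placeholders:

from qcloud_cos import CosConfig
from langchain.document_loaders import TencentCOSDirectoryLoader

conf = CosConfig(
    Region="ap-guangzhou",  # placeholder region
    SecretId="<secret-id>",
    SecretKey="<secret-key>",
)
loader = TencentCOSDirectoryLoader(conf=conf, bucket="<bucket-appid>", prefix="docs/")
docs = loader.load()  # each non-directory key is parsed via UnstructuredFileLoader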
@@ -50,6 +50,9 @@ class WebBaseLoader(BaseLoader):
|
||||
requests_kwargs: Dict[str, Any] = {}
|
||||
"""kwargs for requests"""
|
||||
|
||||
raise_for_status: bool = False
|
||||
"""Raise an exception if http status code denotes an error."""
|
||||
|
||||
bs_get_text_kwargs: Dict[str, Any] = {}
|
||||
"""kwargs for beatifulsoup4 get_text"""
|
||||
|
||||
@@ -58,6 +61,7 @@ class WebBaseLoader(BaseLoader):
|
||||
web_path: Union[str, List[str]],
|
||||
header_template: Optional[dict] = None,
|
||||
verify: Optional[bool] = True,
|
||||
proxies: Optional[dict] = None,
|
||||
):
|
||||
"""Initialize with webpage path."""
|
||||
|
||||
@@ -94,6 +98,9 @@ class WebBaseLoader(BaseLoader):
|
||||
)
|
||||
self.session.headers = dict(headers)
|
||||
|
||||
if proxies:
|
||||
self.session.proxies.update(proxies)
|
||||
|
||||
@property
|
||||
def web_path(self) -> str:
|
||||
if len(self.web_paths) > 1:
|
||||
@@ -189,6 +196,8 @@ class WebBaseLoader(BaseLoader):
|
||||
self._check_parser(parser)
|
||||
|
||||
html_doc = self.session.get(url, verify=self.verify, **self.requests_kwargs)
|
||||
if self.raise_for_status:
|
||||
html_doc.raise_for_status()
|
||||
html_doc.encoding = html_doc.apparent_encoding
|
||||
return BeautifulSoup(html_doc.text, parser)
|
||||
|
||||
|
||||
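A short sketch of the two options added to WebBaseLoader above; the proxy URL is a placeholder:

from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader(
    "https://example.com",
    proxies={"https": "http://127.0.0.1:8888"},  # optional, new constructor argument
)
loader.raise_for_status = True  # new flag: fail loudly on 4xx/5xx instead of parsing an error page
docs = loader.load()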
@@ -49,13 +49,15 @@ class WhatsAppChatLoader(BaseLoader):
|
||||
\s
|
||||
(.+)
|
||||
"""
|
||||
ignore_lines = ["This message was deleted", "<Media omitted>"]
|
||||
for line in lines:
|
||||
result = re.match(
|
||||
message_line_regex, line.strip(), flags=re.VERBOSE | re.IGNORECASE
|
||||
)
|
||||
if result:
|
||||
date, sender, text = result.groups()
|
||||
text_content += concatenate_rows(date, sender, text)
|
||||
if text not in ignore_lines:
|
||||
text_content += concatenate_rows(date, sender, text)
|
||||
|
||||
metadata = {"source": str(p)}
|
||||
|
||||
|
||||
@@ -61,6 +61,29 @@ class GuardrailsOutputParser(BaseOutputParser):
|
||||
kwargs=kwargs,
|
||||
)
|
||||
|
||||
@classmethod
|
||||
def from_pydantic(
|
||||
cls,
|
||||
output_class: Any,
|
||||
num_reasks: int = 1,
|
||||
api: Optional[Callable] = None,
|
||||
*args: Any,
|
||||
**kwargs: Any,
|
||||
) -> GuardrailsOutputParser:
|
||||
try:
|
||||
from guardrails import Guard
|
||||
except ImportError:
|
||||
raise ValueError(
|
||||
"guardrails-ai package not installed. "
|
||||
"Install it by running `pip install guardrails-ai`."
|
||||
)
|
||||
return cls(
|
||||
guard=Guard.from_pydantic(output_class, "", num_reasks=num_reasks),
|
||||
api=api,
|
||||
args=args,
|
||||
kwargs=kwargs,
|
||||
)
|
||||
|
||||
def get_format_instructions(self) -> str:
|
||||
return self.guard.raw_prompt.format_instructions
|
||||
|
||||
|
||||
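A sketch of the new from_pydantic constructor; the Person model is illustrative, and the import path is assumed to be the parser's existing module:

from pydantic import BaseModel
from langchain.output_parsers.rail_parser import GuardrailsOutputParser  # assumed module path

class Person(BaseModel):  # illustrative schema
    name: str
    age: int

parser = GuardrailsOutputParser.from_pydantic(Person, num_reasks=1)
print(parser.get_format_instructions())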
@@ -14,6 +14,7 @@ from langchain.retrievers.llama_index import (
|
||||
from langchain.retrievers.merger_retriever import MergerRetriever
|
||||
from langchain.retrievers.metal import MetalRetriever
|
||||
from langchain.retrievers.milvus import MilvusRetriever
|
||||
from langchain.retrievers.multi_query import MultiQueryRetriever
|
||||
from langchain.retrievers.pinecone_hybrid_search import PineconeHybridSearchRetriever
|
||||
from langchain.retrievers.pupmed import PubMedRetriever
|
||||
from langchain.retrievers.remote_retriever import RemoteLangChainRetriever
|
||||
@@ -43,6 +44,7 @@ __all__ = [
|
||||
"MergerRetriever",
|
||||
"MetalRetriever",
|
||||
"MilvusRetriever",
|
||||
"MultiQueryRetriever",
|
||||
"PineconeHybridSearchRetriever",
|
||||
"PubMedRetriever",
|
||||
"RemoteLangChainRetriever",
|
||||
|
||||
@@ -32,16 +32,17 @@ class TextWithHighLights(BaseModel, extra=Extra.allow):
|
||||
Highlights: Optional[Any]
|
||||
|
||||
|
||||
class AdditionalResultAttributeValue(BaseModel, extra=Extra.allow):
|
||||
TextWithHighlightsValue: TextWithHighLights
|
||||
|
||||
|
||||
class AdditionalResultAttribute(BaseModel, extra=Extra.allow):
|
||||
Key: str
|
||||
ValueType: Literal["TEXT_WITH_HIGHLIGHTS_VALUE"]
|
||||
Value: Optional[TextWithHighLights]
|
||||
Value: AdditionalResultAttributeValue
|
||||
|
||||
def get_value_text(self) -> str:
|
||||
if not self.Value:
|
||||
return ""
|
||||
else:
|
||||
return self.Value.Text
|
||||
return self.Value.TextWithHighlightsValue.Text
|
||||
|
||||
|
||||
class QueryResultItem(BaseModel, extra=Extra.allow):
|
||||
|
||||
158
langchain/retrievers/multi_query.py
Normal file
@@ -0,0 +1,158 @@
|
||||
import logging
|
||||
from typing import List
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from langchain.chains.llm import LLMChain
|
||||
from langchain.llms.base import BaseLLM
|
||||
from langchain.output_parsers.pydantic import PydanticOutputParser
|
||||
from langchain.prompts.prompt import PromptTemplate
|
||||
from langchain.schema import BaseRetriever, Document
|
||||
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
|
||||
|
||||
class LineList(BaseModel):
|
||||
lines: List[str] = Field(description="Lines of text")
|
||||
|
||||
|
||||
class LineListOutputParser(PydanticOutputParser):
|
||||
def __init__(self) -> None:
|
||||
super().__init__(pydantic_object=LineList)
|
||||
|
||||
def parse(self, text: str) -> LineList:
|
||||
lines = text.strip().split("\n")
|
||||
return LineList(lines=lines)
|
||||
|
||||
|
||||
# Default prompt
|
||||
DEFAULT_QUERY_PROMPT = PromptTemplate(
|
||||
input_variables=["question"],
|
||||
template="""You are an AI language model assistant. Your task is
|
||||
to generate 3 different versions of the given user
|
||||
question to retrieve relevant documents from a vector database.
|
||||
By generating multiple perspectives on the user question,
|
||||
your goal is to help the user overcome some of the limitations
|
||||
of distance-based similarity search. Provide these alternative
|
||||
questions separated by newlines. Original question: {question}""",
|
||||
)
|
||||
|
||||
|
||||
class MultiQueryRetriever(BaseRetriever):
|
||||
|
||||
"""Given a user query, use an LLM to write a set of queries.
|
||||
Retrieve docs for each query. Take the unique union of all retrieved docs."""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
retriever: BaseRetriever,
|
||||
llm_chain: LLMChain,
|
||||
verbose: bool = True,
|
||||
parser_key: str = "lines",
|
||||
) -> None:
|
||||
"""Initialize MultiQueryRetriever.
|
||||
|
||||
Args:
|
||||
retriever: retriever to query documents from
|
||||
llm_chain: llm_chain for query generation
|
||||
verbose: show the queries that we generated to the user
|
||||
parser_key: attribute name for the parsed output
|
||||
|
||||
Returns:
|
||||
MultiQueryRetriever
|
||||
"""
|
||||
self.retriever = retriever
|
||||
self.llm_chain = llm_chain
|
||||
self.verbose = verbose
|
||||
self.parser_key = parser_key
|
||||
|
||||
@classmethod
|
||||
def from_llm(
|
||||
cls,
|
||||
retriever: BaseRetriever,
|
||||
llm: BaseLLM,
|
||||
prompt: PromptTemplate = DEFAULT_QUERY_PROMPT,
|
||||
parser_key: str = "lines",
|
||||
) -> "MultiQueryRetriever":
|
||||
"""Initialize from llm using default template.
|
||||
|
||||
Args:
|
||||
retriever: retriever to query documents from
|
||||
llm: llm for query generation using DEFAULT_QUERY_PROMPT
|
||||
|
||||
Returns:
|
||||
MultiQueryRetriever
|
||||
"""
|
||||
output_parser = LineListOutputParser()
|
||||
llm_chain = LLMChain(llm=llm, prompt=prompt, output_parser=output_parser)
|
||||
return cls(
|
||||
retriever=retriever,
|
||||
llm_chain=llm_chain,
|
||||
parser_key=parser_key,
|
||||
)
|
||||
|
||||
def get_relevant_documents(self, question: str) -> List[Document]:
|
||||
"""Get relevated documents given a user query.
|
||||
|
||||
Args:
|
||||
question: user query
|
||||
|
||||
Returns:
|
||||
Unique union of relevant documents from all generated queries
|
||||
"""
|
||||
queries = self.generate_queries(question)
|
||||
documents = self.retrieve_documents(queries)
|
||||
unique_documents = self.unique_union(documents)
|
||||
return unique_documents
|
||||
|
||||
async def aget_relevant_documents(self, query: str) -> List[Document]:
|
||||
raise NotImplementedError
|
||||
|
||||
def generate_queries(self, question: str) -> List[str]:
|
||||
"""Generate queries based upon user input.
|
||||
|
||||
Args:
|
||||
question: user query
|
||||
|
||||
Returns:
|
||||
List of LLM generated queries that are similar to the user input
|
||||
"""
|
||||
response = self.llm_chain({"question": question})
|
||||
lines = getattr(response["text"], self.parser_key, [])
|
||||
if self.verbose:
|
||||
logging.info(f"Generated queries: {lines}")
|
||||
return lines
|
||||
|
||||
def retrieve_documents(self, queries: List[str]) -> List[Document]:
|
||||
"""Run all LLM generated queries.
|
||||
|
||||
Args:
|
||||
queries: query list
|
||||
|
||||
Returns:
|
||||
List of retrieved Documents
|
||||
"""
|
||||
documents = []
|
||||
for query in queries:
|
||||
docs = self.retriever.get_relevant_documents(query)
|
||||
documents.extend(docs)
|
||||
return documents
|
||||
|
||||
def unique_union(self, documents: List[Document]) -> List[Document]:
|
||||
"""Get uniqe Documents.
|
||||
|
||||
Args:
|
||||
documents: List of retrieved Documents
|
||||
|
||||
Returns:
|
||||
List of unique retrieved Documents
|
||||
"""
|
||||
# Create a dictionary with page_content as keys to remove duplicates
|
||||
# TODO: Add Document ID property (e.g., UUID)
|
||||
unique_documents_dict = {
|
||||
(doc.page_content, tuple(sorted(doc.metadata.items()))): doc
|
||||
for doc in documents
|
||||
}
|
||||
|
||||
unique_documents = list(unique_documents_dict.values())
|
||||
return unique_documents
|
||||
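A usage sketch for the new retriever; `vectorstore` is assumed to be any existing VectorStore instance:

from langchain.llms import OpenAI
from langchain.retrievers.multi_query import MultiQueryRetriever

retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),  # `vectorstore` assumed to exist (Chroma, FAISS, ...)
    llm=OpenAI(temperature=0),
)
docs = retriever.get_relevant_documents("How do agents decide which tool to call?")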
@@ -1,6 +1,6 @@
|
||||
"""## Zapier Natural Language Actions API
|
||||
\
|
||||
Full docs here: https://nla.zapier.com/api/v1/docs
|
||||
Full docs here: https://nla.zapier.com/start/
|
||||
|
||||
**Zapier Natural Language Actions** gives you access to the 5k+ apps, 20k+ actions
|
||||
on Zapier's platform through a natural language API interface.
|
||||
@@ -24,8 +24,8 @@ NLA offers both API Key and OAuth for signing NLA API requests.
|
||||
connected accounts on Zapier.com
|
||||
|
||||
This quick start will focus on the server-side use case for brevity.
|
||||
Review [full docs](https://nla.zapier.com/api/v1/docs) or reach out to
|
||||
nla@zapier.com for user-facing oauth developer support.
|
||||
Review [full docs](https://nla.zapier.com/start/) for user-facing oauth developer
|
||||
support.
|
||||
|
||||
Typically, you'd use SequentialChain, here's a basic example:
|
||||
|
||||
@@ -42,8 +42,7 @@ import os
|
||||
# get from https://platform.openai.com/
|
||||
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY", "")
|
||||
|
||||
# get from https://nla.zapier.com/demo/provider/debug
|
||||
# (under User Information, after logging in):
|
||||
# get from https://nla.zapier.com/docs/authentication/
|
||||
os.environ["ZAPIER_NLA_API_KEY"] = os.environ.get("ZAPIER_NLA_API_KEY", "")
|
||||
|
||||
from langchain.llms import OpenAI
|
||||
@@ -61,8 +60,9 @@ from langchain.utilities.zapier import ZapierNLAWrapper
|
||||
|
||||
llm = OpenAI(temperature=0)
|
||||
zapier = ZapierNLAWrapper()
|
||||
## To leverage a nla_oauth_access_token you may pass the value to the ZapierNLAWrapper
|
||||
## If you do this there is no need to initialize the ZAPIER_NLA_API_KEY env variable
|
||||
## To leverage OAuth you may pass the value `nla_oauth_access_token` to
|
||||
## the ZapierNLAWrapper. If you do this there is no need to initialize
|
||||
## the ZAPIER_NLA_API_KEY env variable
|
||||
# zapier = ZapierNLAWrapper(zapier_nla_oauth_access_token="TOKEN_HERE")
|
||||
toolkit = ZapierToolkit.from_zapier_nla_wrapper(zapier)
|
||||
agent = initialize_agent(
|
||||
@@ -99,7 +99,7 @@ class ZapierNLARunAction(BaseTool):
|
||||
(eg. "get the latest email from Mike Knoop" for "Gmail: find email" action)
|
||||
params: a dict, optional. Any params provided will *override* AI guesses
|
||||
from `instructions` (see "understanding the AI guessing flow" here:
|
||||
https://nla.zapier.com/api/v1/docs)
|
||||
https://nla.zapier.com/docs/using-the-api#ai-guessing)
|
||||
|
||||
"""
|
||||
|
||||
@@ -142,11 +142,15 @@ class ZapierNLARunAction(BaseTool):
|
||||
|
||||
async def _arun(
|
||||
self,
|
||||
_: str,
|
||||
instructions: str,
|
||||
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
|
||||
) -> str:
|
||||
"""Use the Zapier NLA tool to return a list of all exposed user actions."""
|
||||
raise NotImplementedError("ZapierNLAListActions does not support async")
|
||||
return await self.api_wrapper.arun_as_str(
|
||||
self.action_id,
|
||||
instructions,
|
||||
self.params,
|
||||
)
|
||||
|
||||
|
||||
ZapierNLARunAction.__doc__ = (
|
||||
@@ -184,7 +188,7 @@ class ZapierNLAListActions(BaseTool):
|
||||
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
|
||||
) -> str:
|
||||
"""Use the Zapier NLA tool to return a list of all exposed user actions."""
|
||||
raise NotImplementedError("ZapierNLAListActions does not support async")
|
||||
return await self.api_wrapper.alist_as_str()
|
||||
|
||||
|
||||
ZapierNLAListActions.__doc__ = (
|
||||
|
||||
@@ -322,7 +322,7 @@ class SearxSearchWrapper(BaseModel):
|
||||
str: The result of the query.
|
||||
|
||||
Raises:
|
||||
ValueError: If an error occured with the query.
|
||||
ValueError: If an error occurred with the query.
|
||||
|
||||
|
||||
Example:
|
||||
|
||||
@@ -36,7 +36,7 @@ class SerpAPIWrapper(BaseModel):
|
||||
Example:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain import SerpAPIWrapper
|
||||
from langchain.utilities import SerpAPIWrapper
|
||||
serpapi = SerpAPIWrapper()
|
||||
"""
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
"""Util that can interact with Zapier NLA.
|
||||
|
||||
Full docs here: https://nla.zapier.com/api/v1/docs
|
||||
Full docs here: https://nla.zapier.com/start/
|
||||
|
||||
Note: this wrapper currently only implemented the `api_key` auth method for testing
|
||||
and server-side production use cases (using the developer's connected accounts on
|
||||
@@ -12,8 +12,9 @@ to use oauth. Review the full docs above and reach out to nla@zapier.com for
|
||||
developer support.
|
||||
"""
|
||||
import json
|
||||
from typing import Dict, List, Optional
|
||||
from typing import Any, Dict, List, Optional
|
||||
|
||||
import aiohttp
|
||||
import requests
|
||||
from pydantic import BaseModel, Extra, root_validator
|
||||
from requests import Request, Session
|
||||
@@ -24,16 +25,20 @@ from langchain.utils import get_from_dict_or_env
|
||||
class ZapierNLAWrapper(BaseModel):
|
||||
"""Wrapper for Zapier NLA.
|
||||
|
||||
Full docs here: https://nla.zapier.com/api/v1/docs
|
||||
Full docs here: https://nla.zapier.com/start/
|
||||
|
||||
Note: this wrapper currently only implemented the `api_key` auth method for
|
||||
testingand server-side production use cases (using the developer's connected
|
||||
accounts on Zapier.com)
|
||||
This wrapper supports both API Key and OAuth Credential auth methods. API Key
|
||||
is the fastest way to get started using this wrapper.
|
||||
|
||||
Call this wrapper with either `zapier_nla_api_key` or
|
||||
`zapier_nla_oauth_access_token` arguments, or set the `ZAPIER_NLA_API_KEY`
|
||||
environment variable. If both arguments are set, the Access Token will take
|
||||
precedence.
|
||||
|
||||
For use-cases where LangChain + Zapier NLA is powering a user-facing application,
|
||||
and LangChain needs access to the end-user's connected accounts on Zapier.com,
|
||||
you'll need to use oauth. Review the full docs above and reach out to
|
||||
nla@zapier.com for developer support.
|
||||
you'll need to use OAuth. Review the full docs above to learn how to create
|
||||
your own provider and generate credentials.
|
||||
"""
|
||||
|
||||
zapier_nla_api_key: str
|
||||
@@ -45,36 +50,63 @@ class ZapierNLAWrapper(BaseModel):
|
||||
|
||||
extra = Extra.forbid
|
||||
|
||||
def _get_session(self) -> Session:
|
||||
session = requests.Session()
|
||||
session.headers.update(
|
||||
{
|
||||
"Accept": "application/json",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
)
|
||||
def _format_headers(self) -> Dict[str, str]:
|
||||
"""Format headers for requests."""
|
||||
headers = {
|
||||
"Accept": "application/json",
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
|
||||
if self.zapier_nla_oauth_access_token:
|
||||
session.headers.update(
|
||||
headers.update(
|
||||
{"Authorization": f"Bearer {self.zapier_nla_oauth_access_token}"}
|
||||
)
|
||||
else:
|
||||
session.params = {"api_key": self.zapier_nla_api_key}
|
||||
headers.update({"X-API-Key": self.zapier_nla_api_key})
|
||||
|
||||
return headers
|
||||
|
||||
def _get_session(self) -> Session:
|
||||
session = requests.Session()
|
||||
session.headers.update(self._format_headers())
|
||||
return session
|
||||
|
||||
def _get_action_request(
|
||||
self, action_id: str, instructions: str, params: Optional[Dict] = None
|
||||
) -> Request:
|
||||
async def _arequest(self, method: str, url: str, **kwargs: Any) -> Dict[str, Any]:
|
||||
"""Make an async request."""
|
||||
async with aiohttp.ClientSession(headers=self._format_headers()) as session:
|
||||
async with session.request(method, url, **kwargs) as response:
|
||||
response.raise_for_status()
|
||||
return await response.json()
|
||||
|
||||
def _create_action_payload( # type: ignore[no-untyped-def]
|
||||
self, instructions: str, params: Optional[Dict] = None, preview_only=False
|
||||
) -> Dict:
|
||||
"""Create a payload for an action."""
|
||||
data = params if params else {}
|
||||
data.update(
|
||||
{
|
||||
"instructions": instructions,
|
||||
}
|
||||
)
|
||||
if preview_only:
|
||||
data.update({"preview_only": True})
|
||||
return data
|
||||
|
||||
def _create_action_url(self, action_id: str) -> str:
|
||||
"""Create a url for an action."""
|
||||
return self.zapier_nla_api_base + f"exposed/{action_id}/execute/"
|
||||
|
||||
def _create_action_request( # type: ignore[no-untyped-def]
|
||||
self,
|
||||
action_id: str,
|
||||
instructions: str,
|
||||
params: Optional[Dict] = None,
|
||||
preview_only=False,
|
||||
) -> Request:
|
||||
data = self._create_action_payload(instructions, params, preview_only)
|
||||
return Request(
|
||||
"POST",
|
||||
self.zapier_nla_api_base + f"exposed/{action_id}/execute/",
|
||||
self._create_action_url(action_id),
|
||||
json=data,
|
||||
)
|
||||
|
||||
@@ -103,7 +135,7 @@ class ZapierNLAWrapper(BaseModel):
|
||||
|
||||
return values
|
||||
|
||||
def list(self) -> List[Dict]:
|
||||
async def alist(self) -> List[Dict]:
|
||||
"""Returns a list of all exposed (enabled) actions associated with
|
||||
current user (associated with the set api_key). Change your exposed
|
||||
actions here: https://nla.zapier.com/demo/start/
|
||||
@@ -122,9 +154,45 @@ class ZapierNLAWrapper(BaseModel):
|
||||
(see "understanding the AI guessing flow" here:
|
||||
https://nla.zapier.com/docs/using-the-api#ai-guessing)
|
||||
"""
|
||||
response = await self._arequest("GET", self.zapier_nla_api_base + "exposed/")
|
||||
return response["results"]
|
||||
|
||||
def list(self) -> List[Dict]:
|
||||
"""Returns a list of all exposed (enabled) actions associated with
|
||||
current user (associated with the set api_key). Change your exposed
|
||||
actions here: https://nla.zapier.com/demo/start/
|
||||
|
||||
The return list can be empty if no actions exposed. Else will contain
|
||||
a list of action objects:
|
||||
|
||||
[{
|
||||
"id": str,
|
||||
"description": str,
|
||||
"params": Dict[str, str]
|
||||
}]
|
||||
|
||||
`params` will always contain an `instructions` key, the only required
|
||||
param. All others optional and if provided will override any AI guesses
|
||||
(see "understanding the AI guessing flow" here:
|
||||
https://nla.zapier.com/docs/using-the-api#ai-guessing)
|
||||
"""
|
||||
session = self._get_session()
|
||||
response = session.get(self.zapier_nla_api_base + "exposed/")
|
||||
response.raise_for_status()
|
||||
try:
|
||||
response = session.get(self.zapier_nla_api_base + "exposed/")
|
||||
response.raise_for_status()
|
||||
except requests.HTTPError as http_err:
|
||||
if response.status_code == 401:
|
||||
if self.zapier_nla_oauth_access_token:
|
||||
raise requests.HTTPError(
|
||||
f"An unauthorized response occurred. Check that your "
|
||||
f"access token is correct and doesn't need to be "
|
||||
f"refreshed. Err: {http_err}"
|
||||
)
|
||||
raise requests.HTTPError(
|
||||
f"An unauthorized response occurred. Check that your api "
|
||||
f"key is correct. Err: {http_err}"
|
||||
)
|
||||
raise http_err
|
||||
return response.json()["results"]
|
||||
|
||||
def run(
|
||||
@@ -139,11 +207,29 @@ class ZapierNLAWrapper(BaseModel):
|
||||
call.
|
||||
"""
|
||||
session = self._get_session()
|
||||
request = self._get_action_request(action_id, instructions, params)
|
||||
request = self._create_action_request(action_id, instructions, params)
|
||||
response = session.send(session.prepare_request(request))
|
||||
response.raise_for_status()
|
||||
return response.json()["result"]
|
||||
|
||||
async def arun(
|
||||
self, action_id: str, instructions: str, params: Optional[Dict] = None
|
||||
) -> Dict:
|
||||
"""Executes an action that is identified by action_id, must be exposed
|
||||
(enabled) by the current user (associated with the set api_key). Change
|
||||
your exposed actions here: https://nla.zapier.com/demo/start/
|
||||
|
||||
The return JSON is guaranteed to be less than ~500 words (350
|
||||
tokens) making it safe to inject into the prompt of another LLM
|
||||
call.
|
||||
"""
|
||||
response = await self._arequest(
|
||||
"POST",
|
||||
self._create_action_url(action_id),
|
||||
json=self._create_action_payload(instructions, params),
|
||||
)
|
||||
return response["result"]
|
||||
|
||||
def preview(
|
||||
self, action_id: str, instructions: str, params: Optional[Dict] = None
|
||||
) -> Dict:
|
||||
@@ -153,25 +239,58 @@ class ZapierNLAWrapper(BaseModel):
|
||||
session = self._get_session()
|
||||
params = params if params else {}
|
||||
params.update({"preview_only": True})
|
||||
request = self._get_action_request(action_id, instructions, params)
|
||||
request = self._create_action_request(action_id, instructions, params, True)
|
||||
response = session.send(session.prepare_request(request))
|
||||
response.raise_for_status()
|
||||
return response.json()["input_params"]
|
||||
|
||||
async def apreview(
|
||||
self, action_id: str, instructions: str, params: Optional[Dict] = None
|
||||
) -> Dict:
|
||||
"""Same as run, but instead of actually executing the action, will
|
||||
instead return a preview of params that have been guessed by the AI in
|
||||
case you need to explicitly review before executing."""
|
||||
response = await self._arequest(
|
||||
"POST",
|
||||
self._create_action_url(action_id),
|
||||
json=self._create_action_payload(instructions, params, preview_only=True),
|
||||
)
|
||||
return response["result"]
|
||||
|
||||
def run_as_str(self, *args, **kwargs) -> str: # type: ignore[no-untyped-def]
|
||||
"""Same as run, but returns a stringified version of the JSON for
|
||||
inserting back into an LLM."""
|
||||
data = self.run(*args, **kwargs)
|
||||
return json.dumps(data)
|
||||
|
||||
async def arun_as_str(self, *args, **kwargs) -> str: # type: ignore[no-untyped-def]
|
||||
"""Same as run, but returns a stringified version of the JSON for
|
||||
inserting back into an LLM."""
|
||||
data = await self.arun(*args, **kwargs)
|
||||
return json.dumps(data)
|
||||
|
||||
def preview_as_str(self, *args, **kwargs) -> str: # type: ignore[no-untyped-def]
|
||||
"""Same as preview, but returns a stringified version of the JSON for
|
||||
inserting back into an LLM."""
|
||||
data = self.preview(*args, **kwargs)
|
||||
return json.dumps(data)
|
||||
|
||||
async def apreview_as_str( # type: ignore[no-untyped-def]
|
||||
self, *args, **kwargs
|
||||
) -> str:
|
||||
"""Same as preview, but returns a stringified version of the JSON for
|
||||
inserting back into an LLM."""
|
||||
data = await self.apreview(*args, **kwargs)
|
||||
return json.dumps(data)
|
||||
|
||||
def list_as_str(self) -> str: # type: ignore[no-untyped-def]
|
||||
"""Same as list, but returns a stringified version of the JSON for
|
||||
inserting back into an LLM."""
|
||||
actions = self.list()
|
||||
return json.dumps(actions)
|
||||
|
||||
async def alist_as_str(self) -> str: # type: ignore[no-untyped-def]
|
||||
"""Same as list, but returns a stringified version of the JSON for
|
||||
inserting back into an LLM."""
|
||||
actions = await self.alist()
|
||||
return json.dumps(actions)
|
||||
|
||||
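The async variants added above can be used as in this minimal sketch; ZAPIER_NLA_API_KEY is read from the environment:

import asyncio
from langchain.utilities.zapier import ZapierNLAWrapper

async def main() -> None:
    zapier = ZapierNLAWrapper()      # picks up ZAPIER_NLA_API_KEY from the environment
    actions = await zapier.alist()   # async counterpart of .list()
    if actions:
        result = await zapier.arun_as_str(actions[0]["id"], "say hello")
        print(result)

asyncio.run(main())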
@@ -354,15 +354,16 @@ class Pinecone(VectorStore):
|
||||
pinecone.Index(index_name), embedding.embed_query, text_key, namespace
|
||||
)
|
||||
|
||||
def delete(self, ids: List[str]) -> None:
|
||||
def delete(self, ids: List[str], namespace: Optional[str] = None) -> None:
|
||||
"""Delete by vector IDs.
|
||||
|
||||
Args:
|
||||
ids: List of ids to delete.
|
||||
"""
|
||||
|
||||
# This is the maximum number of IDs that can be deleted
|
||||
if namespace is None:
|
||||
namespace = self._namespace
|
||||
chunk_size = 1000
|
||||
for i in range(0, len(ids), chunk_size):
|
||||
chunk = ids[i : i + chunk_size]
|
||||
self._index.delete(ids=chunk)
|
||||
self._index.delete(ids=chunk, namespace=namespace)
|
||||
|
||||
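With the change above, namespace-scoped deletes look like the sketch below; the index name, ids, and namespaces are placeholders, and pinecone.init is assumed to have been called:

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

docsearch = Pinecone.from_existing_index("my-index", OpenAIEmbeddings(), namespace="ns-a")
docsearch.delete(ids=["vec-1", "vec-2"])           # deletes within the store's own namespace ("ns-a")
docsearch.delete(ids=["vec-3"], namespace="ns-b")  # new: override the namespace per call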
566
poetry.lock
generated
File diff suppressed because it is too large
@@ -1,6 +1,6 @@
|
||||
[tool.poetry]
|
||||
name = "langchain"
|
||||
version = "0.0.216"
|
||||
version = "0.0.218"
|
||||
description = "Building applications with LLMs through composability"
|
||||
authors = []
|
||||
license = "MIT"
|
||||
@@ -88,7 +88,6 @@ gql = {version = "^3.4.1", optional = true}
|
||||
pandas = {version = "^2.0.1", optional = true}
|
||||
telethon = {version = "^1.28.5", optional = true}
|
||||
neo4j = {version = "^5.8.1", optional = true}
|
||||
psychicapi = {version = "^0.5", optional = true}
|
||||
zep-python = {version=">=0.31", optional=true}
|
||||
langkit = {version = ">=0.0.1.dev3, <0.1.0", optional = true}
|
||||
chardet = {version="^5.1.0", optional=true}
|
||||
@@ -109,8 +108,10 @@ nebula3-python = {version = "^3.4.0", optional = true}
|
||||
langchainplus-sdk = ">=0.0.17"
|
||||
awadb = {version = "^0.3.3", optional = true}
|
||||
azure-search-documents = {version = "11.4.0a20230509004", source = "azure-sdk-dev", optional = true}
|
||||
esprima = {version = "^4.0.1", optional = true}
|
||||
openllm = {version = ">=0.1.6", optional = true}
|
||||
streamlit = {version = "^1.18.0", optional = true, python = ">=3.8.1,<3.9.7 || >3.9.7,<4.0"}
|
||||
psychicapi = {version = "^0.8.0", optional = true}
|
||||
|
||||
[tool.poetry.group.docs.dependencies]
|
||||
autodoc_pydantic = "^1.8.0"
|
||||
@@ -222,6 +223,7 @@ clarifai = ["clarifai"]
|
||||
cohere = ["cohere"]
|
||||
docarray = ["docarray"]
|
||||
embeddings = ["sentence-transformers"]
|
||||
javascript = ["esprima"]
|
||||
azure = [
|
||||
"azure-identity",
|
||||
"azure-cosmos",
|
||||
@@ -303,6 +305,7 @@ all = [
|
||||
"tigrisdb",
|
||||
"nebula3-python",
|
||||
"awadb",
|
||||
"esprima",
|
||||
]
|
||||
|
||||
# An extra used to be able to add extended testing.
|
||||
@@ -312,6 +315,7 @@ extended_testing = [
|
||||
"beautifulsoup4",
|
||||
"bibtexparser",
|
||||
"chardet",
|
||||
"esprima",
|
||||
"jq",
|
||||
"pdfminer.six",
|
||||
"pgvector",
|
||||
@@ -354,7 +358,7 @@ exclude = [
|
||||
[tool.mypy]
|
||||
ignore_missing_imports = "True"
|
||||
disallow_untyped_defs = "True"
|
||||
exclude = ["notebooks"]
|
||||
exclude = ["notebooks", "examples", "example_data"]
|
||||
|
||||
[tool.coverage.run]
|
||||
omit = [
|
||||
|
||||
@@ -0,0 +1,25 @@
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
from langchain.chains.openai_functions.openapi import get_openapi_chain
|
||||
|
||||
|
||||
def test_openai_opeanapi() -> None:
|
||||
chain = get_openapi_chain(
|
||||
"https://www.klarna.com/us/shopping/public/openai/v0/api-docs/"
|
||||
)
|
||||
output = chain.run("What are some options for a men's large blue button down shirt")
|
||||
|
||||
assert isinstance(output, dict)
|
||||
|
||||
|
||||
def test_openai_opeanapi_headers() -> None:
|
||||
BRANDFETCH_API_KEY = os.environ.get("BRANDFETCH_API_KEY")
|
||||
headers = {"Authorization": f"Bearer {BRANDFETCH_API_KEY}"}
|
||||
file_path = str(
|
||||
Path(__file__).parents[2] / "examples/brandfetch-brandfetch-2.0.0-resolved.json"
|
||||
)
|
||||
chain = get_openapi_chain(file_path, headers=headers)
|
||||
output = chain.run("I want to know about nike.comgg")
|
||||
|
||||
assert isinstance(output, str)
|
||||
@@ -0,0 +1,133 @@
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
from langchain.document_loaders.generic import GenericLoader
|
||||
from langchain.document_loaders.parsers import LanguageParser
|
||||
from langchain.text_splitter import Language
|
||||
|
||||
|
||||
def test_language_loader_for_python() -> None:
|
||||
"""Test Python loader with parser enabled."""
|
||||
file_path = Path(__file__).parent.parent.parent / "examples"
|
||||
loader = GenericLoader.from_filesystem(
|
||||
file_path, glob="hello_world.py", parser=LanguageParser(parser_threshold=5)
|
||||
)
|
||||
docs = loader.load()
|
||||
|
||||
assert len(docs) == 2
|
||||
|
||||
metadata = docs[0].metadata
|
||||
assert metadata["source"] == str(file_path / "hello_world.py")
|
||||
assert metadata["content_type"] == "functions_classes"
|
||||
assert metadata["language"] == "python"
|
||||
metadata = docs[1].metadata
|
||||
assert metadata["source"] == str(file_path / "hello_world.py")
|
||||
assert metadata["content_type"] == "simplified_code"
|
||||
assert metadata["language"] == "python"
|
||||
|
||||
assert (
|
||||
docs[0].page_content
|
||||
== """def main():
|
||||
print("Hello World!")
|
||||
|
||||
return 0"""
|
||||
)
|
||||
assert (
|
||||
docs[1].page_content
|
||||
== """#!/usr/bin/env python3
|
||||
|
||||
import sys
|
||||
|
||||
|
||||
# Code for: def main():
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())"""
|
||||
)
|
||||
|
||||
|
||||
def test_language_loader_for_python_with_parser_threshold() -> None:
|
||||
"""Test Python loader with parser enabled and below threshold."""
|
||||
file_path = Path(__file__).parent.parent.parent / "examples"
|
||||
loader = GenericLoader.from_filesystem(
|
||||
file_path,
|
||||
glob="hello_world.py",
|
||||
parser=LanguageParser(language=Language.PYTHON, parser_threshold=1000),
|
||||
)
|
||||
docs = loader.load()
|
||||
|
||||
assert len(docs) == 1
|
||||
|
||||
|
||||
def esprima_installed() -> bool:
|
||||
try:
|
||||
import esprima # noqa: F401
|
||||
|
||||
return True
|
||||
except Exception as e:
|
||||
print(f"esprima not installed, skipping test {e}")
|
||||
return False
|
||||
|
||||
|
||||
@pytest.mark.skipif(not esprima_installed(), reason="requires esprima package")
|
||||
def test_language_loader_for_javascript() -> None:
|
||||
"""Test JavaScript loader with parser enabled."""
|
||||
file_path = Path(__file__).parent.parent.parent / "examples"
|
||||
loader = GenericLoader.from_filesystem(
|
||||
file_path, glob="hello_world.js", parser=LanguageParser(parser_threshold=5)
|
||||
)
|
||||
docs = loader.load()
|
||||
|
||||
assert len(docs) == 3
|
||||
|
||||
metadata = docs[0].metadata
|
||||
assert metadata["source"] == str(file_path / "hello_world.js")
|
||||
assert metadata["content_type"] == "functions_classes"
|
||||
assert metadata["language"] == "js"
|
||||
metadata = docs[1].metadata
|
||||
assert metadata["source"] == str(file_path / "hello_world.js")
|
||||
assert metadata["content_type"] == "functions_classes"
|
||||
assert metadata["language"] == "js"
|
||||
metadata = docs[2].metadata
|
||||
assert metadata["source"] == str(file_path / "hello_world.js")
|
||||
assert metadata["content_type"] == "simplified_code"
|
||||
assert metadata["language"] == "js"
|
||||
|
||||
assert (
|
||||
docs[0].page_content
|
||||
== """class HelloWorld {
|
||||
sayHello() {
|
||||
console.log("Hello World!");
|
||||
}
|
||||
}"""
|
||||
)
|
||||
assert (
|
||||
docs[1].page_content
|
||||
== """function main() {
|
||||
const hello = new HelloWorld();
|
||||
hello.sayHello();
|
||||
}"""
|
||||
)
|
||||
assert (
|
||||
docs[2].page_content
|
||||
== """// Code for: class HelloWorld {
|
||||
|
||||
// Code for: function main() {
|
||||
|
||||
main();"""
|
||||
)
|
||||
|
||||
|
||||
def test_language_loader_for_javascript_with_parser_threshold() -> None:
|
||||
"""Test JavaScript loader with parser enabled and below threshold."""
|
||||
file_path = Path(__file__).parent.parent.parent / "examples"
|
||||
loader = GenericLoader.from_filesystem(
|
||||
file_path,
|
||||
glob="hello_world.js",
|
||||
parser=LanguageParser(language=Language.JS, parser_threshold=1000),
|
||||
)
|
||||
docs = loader.load()
|
||||
|
||||
assert len(docs) == 1
|
||||
14
tests/integration_tests/document_loaders/test_larksuite.py
Normal file
@@ -0,0 +1,14 @@
|
||||
from langchain.document_loaders.larksuite import LarkSuiteDocLoader
|
||||
|
||||
DOMAIN = ""
|
||||
ACCESS_TOKEN = ""
|
||||
DOCUMENT_ID = ""
|
||||
|
||||
|
||||
def test_larksuite_doc_loader() -> None:
|
||||
"""Test LarkSuite (FeiShu) document loader."""
|
||||
loader = LarkSuiteDocLoader(DOMAIN, ACCESS_TOKEN, DOCUMENT_ID)
|
||||
docs = loader.load()
|
||||
|
||||
assert len(docs) == 1
|
||||
assert docs[0].page_content is not None
|
||||
15
tests/integration_tests/document_loaders/test_org_mode.py
Normal file
@@ -0,0 +1,15 @@
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
from langchain.document_loaders import UnstructuredOrgModeLoader
|
||||
|
||||
EXAMPLE_DIRECTORY = file_path = Path(__file__).parent.parent / "examples"
|
||||
|
||||
|
||||
def test_unstructured_org_mode_loader() -> None:
|
||||
"""Test unstructured loader."""
|
||||
file_path = os.path.join(EXAMPLE_DIRECTORY, "README.org")
|
||||
loader = UnstructuredOrgModeLoader(str(file_path))
|
||||
docs = loader.load()
|
||||
|
||||
assert len(docs) == 1
|
||||
27
tests/integration_tests/examples/README.org
Normal file
@@ -0,0 +1,27 @@
|
||||
* Example Docs
|
||||
|
||||
The sample docs directory contains the following files:
|
||||
|
||||
- ~example-10k.html~ - A 10-K SEC filing in HTML format
|
||||
- ~layout-parser-paper.pdf~ - A PDF copy of the layout parser paper
|
||||
- ~factbook.xml~ / ~factbook.xsl~ - Example XML/XSL files that you
|
||||
can use to test stylesheets
|
||||
|
||||
These documents can be used to test out the parsers in the library. In
|
||||
addition, here are instructions for pulling in some sample docs that are
|
||||
too big to store in the repo.
|
||||
|
||||
** XBRL 10-K
|
||||
|
||||
You can get an example 10-K in inline XBRL format using the following
|
||||
~curl~. Note, you need to have the user agent set in the header or the
|
||||
SEC site will reject your request.
|
||||
|
||||
#+BEGIN_SRC bash
|
||||
|
||||
curl -O \
|
||||
-A '${organization} ${email}'
|
||||
https://www.sec.gov/Archives/edgar/data/311094/000117184321001344/0001171843-21-001344.txt
|
||||
#+END_SRC
|
||||
|
||||
You can parse this document using the HTML parser.
|
||||
@@ -0,0 +1,282 @@
|
||||
{
|
||||
"openapi": "3.0.1",
|
||||
"info": {
|
||||
"title": "Brandfetch API",
|
||||
"description": "Brandfetch API (v2) for retrieving brand information.\n\nSee our [documentation](https://docs.brandfetch.com/) for further details. ",
|
||||
"termsOfService": "https://brandfetch.com/terms",
|
||||
"contact": {
|
||||
"url": "https://brandfetch.com/developers"
|
||||
},
|
||||
"version": "2.0.0"
|
||||
},
|
||||
"externalDocs": {
|
||||
"description": "Documentation",
|
||||
"url": "https://docs.brandfetch.com/"
|
||||
},
|
||||
"servers": [
|
||||
{
|
||||
"url": "https://api.brandfetch.io/v2"
|
||||
}
|
||||
],
|
||||
"paths": {
|
||||
"/brands/{domainOrId}": {
|
||||
"get": {
|
||||
"summary": "Retrieve a brand",
|
||||
"description": "Fetch brand information by domain or ID\n\nFurther details here: https://docs.brandfetch.com/reference/retrieve-brand\n",
|
||||
"parameters": [
|
||||
{
|
||||
"name": "domainOrId",
|
||||
"in": "path",
|
||||
"description": "Domain or ID of the brand",
|
||||
"required": true,
|
||||
"style": "simple",
|
||||
"explode": false,
|
||||
"schema": {
|
||||
"type": "string"
|
||||
}
|
||||
}
|
||||
],
|
||||
"responses": {
|
||||
"200": {
|
||||
"description": "Brand data",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/Brand"
|
||||
},
|
||||
"examples": {
|
||||
"brandfetch.com": {
|
||||
"value": "{\"name\":\"Brandfetch\",\"domain\":\"brandfetch.com\",\"claimed\":true,\"description\":\"All brands. In one place\",\"links\":[{\"name\":\"twitter\",\"url\":\"https://twitter.com/brandfetch\"},{\"name\":\"linkedin\",\"url\":\"https://linkedin.com/company/brandfetch\"}],\"logos\":[{\"type\":\"logo\",\"theme\":\"light\",\"formats\":[{\"src\":\"https://asset.brandfetch.io/idL0iThUh6/id9WE9j86h.svg\",\"background\":\"transparent\",\"format\":\"svg\",\"size\":15555}]},{\"type\":\"logo\",\"theme\":\"dark\",\"formats\":[{\"src\":\"https://asset.brandfetch.io/idL0iThUh6/idWbsK1VCy.png\",\"background\":\"transparent\",\"format\":\"png\",\"height\":215,\"width\":800,\"size\":33937},{\"src\":\"https://asset.brandfetch.io/idL0iThUh6/idtCMfbWO0.svg\",\"background\":\"transparent\",\"format\":\"svg\",\"height\":null,\"width\":null,\"size\":15567}]},{\"type\":\"symbol\",\"theme\":\"light\",\"formats\":[{\"src\":\"https://asset.brandfetch.io/idL0iThUh6/idXGq6SIu2.svg\",\"background\":\"transparent\",\"format\":\"svg\",\"size\":2215}]},{\"type\":\"symbol\",\"theme\":\"dark\",\"formats\":[{\"src\":\"https://asset.brandfetch.io/idL0iThUh6/iddCQ52AR5.svg\",\"background\":\"transparent\",\"format\":\"svg\",\"size\":2215}]},{\"type\":\"icon\",\"theme\":\"dark\",\"formats\":[{\"src\":\"https://asset.brandfetch.io/idL0iThUh6/idls3LaPPQ.png\",\"background\":null,\"format\":\"png\",\"height\":400,\"width\":400,\"size\":2565}]}],\"colors\":[{\"hex\":\"#0084ff\",\"type\":\"accent\",\"brightness\":113},{\"hex\":\"#00193E\",\"type\":\"brand\",\"brightness\":22},{\"hex\":\"#F03063\",\"type\":\"brand\",\"brightness\":93},{\"hex\":\"#7B0095\",\"type\":\"brand\",\"brightness\":37},{\"hex\":\"#76CC4B\",\"type\":\"brand\",\"brightness\":176},{\"hex\":\"#FFDA00\",\"type\":\"brand\",\"brightness\":210},{\"hex\":\"#000000\",\"type\":\"dark\",\"brightness\":0},{\"hex\":\"#ffffff\",\"type\":\"light\",\"brightness\":255}],\"fonts\":[{\"name\":\"Poppins\",\"type\":\"title\",\"origin\":\"google\",\"originId\":\"Poppins\",\"weights\":[]},{\"name\":\"Inter\",\"type\":\"body\",\"origin\":\"google\",\"originId\":\"Inter\",\"weights\":[]}],\"images\":[{\"type\":\"banner\",\"formats\":[{\"src\":\"https://asset.brandfetch.io/idL0iThUh6/idUuia5imo.png\",\"background\":\"transparent\",\"format\":\"png\",\"height\":500,\"width\":1500,\"size\":5539}]}]}"
                  }
                }
              }
            }
          },
          "400": {
            "description": "Invalid domain or ID supplied"
          },
          "404": {
            "description": "The brand does not exist or the domain can't be resolved."
          }
        },
        "security": [
          { "bearerAuth": [] }
        ]
      }
    }
  },
  "components": {
    "schemas": {
      "Brand": {
        "required": ["claimed", "colors", "description", "domain", "fonts", "images", "links", "logos", "name"],
        "type": "object",
        "properties": {
          "images": {
            "type": "array",
            "items": { "$ref": "#/components/schemas/ImageAsset" }
          },
          "fonts": {
            "type": "array",
            "items": { "$ref": "#/components/schemas/FontAsset" }
          },
          "domain": { "type": "string" },
          "claimed": { "type": "boolean" },
          "name": { "type": "string" },
          "description": { "type": "string" },
          "links": {
            "type": "array",
            "items": { "$ref": "#/components/schemas/Brand_links" }
          },
          "logos": {
            "type": "array",
            "items": { "$ref": "#/components/schemas/ImageAsset" }
          },
          "colors": {
            "type": "array",
            "items": { "$ref": "#/components/schemas/ColorAsset" }
          }
        },
        "description": "Object representing a brand"
      },
      "ColorAsset": {
        "required": ["brightness", "hex", "type"],
        "type": "object",
        "properties": {
          "brightness": { "type": "integer" },
          "hex": { "type": "string" },
          "type": {
            "type": "string",
            "enum": ["accent", "brand", "customizable", "dark", "light", "vibrant"]
          }
        },
        "description": "Brand color asset"
      },
      "FontAsset": {
        "type": "object",
        "properties": {
          "originId": { "type": "string" },
          "origin": {
            "type": "string",
            "enum": ["adobe", "custom", "google", "system"]
          },
          "name": { "type": "string" },
          "type": { "type": "string" },
          "weights": {
            "type": "array",
            "items": { "type": "number" }
          },
          "items": { "type": "string" }
        },
        "description": "Brand font asset"
      },
      "ImageAsset": {
        "required": ["formats", "theme", "type"],
        "type": "object",
        "properties": {
          "formats": {
            "type": "array",
            "items": { "$ref": "#/components/schemas/ImageFormat" }
          },
          "theme": {
            "type": "string",
            "enum": ["light", "dark"]
          },
          "type": {
            "type": "string",
            "enum": ["logo", "icon", "symbol", "banner"]
          }
        },
        "description": "Brand image asset"
      },
      "ImageFormat": {
        "required": ["background", "format", "size", "src"],
        "type": "object",
        "properties": {
          "size": { "type": "integer" },
          "src": { "type": "string" },
          "background": {
            "type": "string",
            "enum": ["transparent"]
          },
          "format": { "type": "string" },
          "width": { "type": "integer" },
          "height": { "type": "integer" }
        },
        "description": "Brand image asset image format"
      },
      "Brand_links": {
        "required": ["name", "url"],
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "url": { "type": "string" }
        }
      }
    },
    "securitySchemes": {
      "bearerAuth": {
        "type": "http",
        "scheme": "bearer",
        "bearerFormat": "API Key"
      }
    }
  }
}
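The spec above is small enough to exercise by hand. Below is a minimal sketch of the request it describes: the server URL, the `/brands/{domainOrId}` path, and the bearer scheme come straight from the spec, while the API key and the target domain are placeholders you would substitute yourself.

```python
import os

import requests

# Placeholders: supply your own Brandfetch API key and a domain to look up.
API_KEY = os.environ["BRANDFETCH_API_KEY"]
DOMAIN = "brandfetch.com"

# GET /brands/{domainOrId} on the v2 server, authenticated via the
# bearerAuth security scheme declared in the spec.
response = requests.get(
    f"https://api.brandfetch.io/v2/brands/{DOMAIN}",
    headers={"Authorization": f"Bearer {API_KEY}", "Accept": "application/json"},
    timeout=10,
)
response.raise_for_status()

brand = response.json()
# The Brand schema requires name, domain, logos, colors, fonts, links, and images.
print(brand["name"], brand["domain"], len(brand["logos"]))
```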
tests/integration_tests/examples/hello_world.js (new file, 12 lines)
@@ -0,0 +1,12 @@
class HelloWorld {
  sayHello() {
    console.log("Hello World!");
  }
}

function main() {
  const hello = new HelloWorld();
  hello.sayHello();
}

main();
tests/integration_tests/examples/hello_world.py (new file, 13 lines)
@@ -0,0 +1,13 @@
#!/usr/bin/env python3

import sys


def main():
    print("Hello World!")

    return 0


if __name__ == "__main__":
    sys.exit(main())
@@ -6,3 +6,5 @@
[2023/5/4, 16:13:23] ~ User 2: See you!
7/19/22, 11:32 PM - User 1: Hello
7/20/22, 11:32 am - User 2: Goodbye
4/20/23, 9:42 am - User 3: <Media omitted>
6/29/23, 12:16 am - User 4: This message was deleted
@@ -0,0 +1,46 @@
import unittest

import pytest

from langchain.document_loaders.parsers.language.javascript import JavaScriptSegmenter


@pytest.mark.requires("esprima")
class TestJavaScriptSegmenter(unittest.TestCase):
    def setUp(self) -> None:
        self.example_code = """const os = require('os');

function hello(text) {
  console.log(text);
}

class Simple {
  constructor() {
    this.a = 1;
  }
}

hello("Hello!");"""

        self.expected_simplified_code = """const os = require('os');

// Code for: function hello(text) {

// Code for: class Simple {

hello("Hello!");"""

        self.expected_extracted_code = [
            "function hello(text) {\n  console.log(text);\n}",
            "class Simple {\n  constructor() {\n    this.a = 1;\n  }\n}",
        ]

    def test_extract_functions_classes(self) -> None:
        segmenter = JavaScriptSegmenter(self.example_code)
        extracted_code = segmenter.extract_functions_classes()
        self.assertEqual(extracted_code, self.expected_extracted_code)

    def test_simplify_code(self) -> None:
        segmenter = JavaScriptSegmenter(self.example_code)
        simplified_code = segmenter.simplify_code()
        self.assertEqual(simplified_code, self.expected_simplified_code)
@@ -0,0 +1,40 @@
import unittest

from langchain.document_loaders.parsers.language.python import PythonSegmenter


class TestPythonSegmenter(unittest.TestCase):
    def setUp(self) -> None:
        self.example_code = """import os

def hello(text):
    print(text)

class Simple:
    def __init__(self):
        self.a = 1

hello("Hello!")"""

        self.expected_simplified_code = """import os

# Code for: def hello(text):

# Code for: class Simple:

hello("Hello!")"""

        self.expected_extracted_code = [
            "def hello(text):\n" "    print(text)",
            "class Simple:\n" "    def __init__(self):\n" "        self.a = 1",
        ]

    def test_extract_functions_classes(self) -> None:
        segmenter = PythonSegmenter(self.example_code)
        extracted_code = segmenter.extract_functions_classes()
        self.assertEqual(extracted_code, self.expected_extracted_code)

    def test_simplify_code(self) -> None:
        segmenter = PythonSegmenter(self.example_code)
        simplified_code = segmenter.simplify_code()
        self.assertEqual(simplified_code, self.expected_simplified_code)
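Taken together, the two segmenter test files above pin down the interface these parsers rely on: the constructor takes a source string, `extract_functions_classes()` returns each top-level definition as its own snippet, and `simplify_code()` collapses those definitions into `# Code for:` stubs. A minimal usage sketch (the input file name is hypothetical):

```python
from langchain.document_loaders.parsers.language.python import PythonSegmenter

# Hypothetical source file; any string of Python code works here.
with open("my_module.py") as f:
    source = f.read()

segmenter = PythonSegmenter(source)

# Each top-level function or class as a standalone snippet.
for snippet in segmenter.extract_functions_classes():
    print(snippet)
    print("---")

# The module with those definitions replaced by "# Code for: ..." comments.
print(segmenter.simplify_code())
```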
@@ -5,6 +5,7 @@ def test_parsers_public_api_correct() -> None:
    """Test public API of parsers for breaking changes."""
    assert set(__all__) == {
        "BS4HTMLParser",
        "LanguageParser",
        "OpenAIWhisperParser",
        "PyPDFParser",
        "PDFMinerParser",
@@ -23,7 +23,7 @@ def mock_connector_id(): # type: ignore
class TestPsychicLoader:
    MOCK_API_KEY = "api_key"
    MOCK_CONNECTOR_ID = "notion"
    MOCK_CONNECTION_ID = "connection_id"
    MOCK_ACCOUNT_ID = "account_id"

    def test_psychic_loader_initialization(
        self, mock_psychic: MagicMock, mock_connector_id: MagicMock
@@ -31,17 +31,21 @@ class TestPsychicLoader:
        PsychicLoader(
            api_key=self.MOCK_API_KEY,
            connector_id=self.MOCK_CONNECTOR_ID,
            connection_id=self.MOCK_CONNECTION_ID,
            account_id=self.MOCK_ACCOUNT_ID,
        )

        mock_psychic.assert_called_once_with(secret_key=self.MOCK_API_KEY)
        mock_connector_id.assert_called_once_with(self.MOCK_CONNECTOR_ID)

    def test_psychic_loader_load_data(self, mock_psychic: MagicMock) -> None:
        mock_psychic.get_documents.return_value = [
        mock_get_documents_response = MagicMock()
        mock_get_documents_response.documents = [
            self._get_mock_document("123"),
            self._get_mock_document("456"),
        ]
        mock_get_documents_response.next_page_cursor = None

        mock_psychic.get_documents.return_value = mock_get_documents_response

        psychic_loader = self._get_mock_psychic_loader(mock_psychic)
@@ -57,7 +61,7 @@ class TestPsychicLoader:
        psychic_loader = PsychicLoader(
            api_key=self.MOCK_API_KEY,
            connector_id=self.MOCK_CONNECTOR_ID,
            connection_id=self.MOCK_CONNECTION_ID,
            account_id=self.MOCK_ACCOUNT_ID,
        )
        psychic_loader.psychic = mock_psychic
        return psychic_loader
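For reference, the constructor arguments exercised above map directly onto real usage. A rough sketch with placeholder credentials; the import path and the `load()` call are assumed here (they are the usual LangChain loader conventions and are not shown in this diff):

```python
from langchain.document_loaders.psychic import PsychicLoader

# Placeholder credentials and connector values.
loader = PsychicLoader(
    api_key="my-psychic-secret-key",
    connector_id="notion",
    connection_id="my-connection-id",
    account_id="my-account-id",
)

# Assumed: documents come back via the standard loader interface.
docs = loader.load()
print(len(docs))
```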
@@ -1,5 +1,8 @@
"""Test building the Zapier tool, not running it."""
from unittest.mock import MagicMock, patch

import pytest
import requests

from langchain.tools.zapier.prompt import BASE_ZAPIER_TOOL_PROMPT
from langchain.tools.zapier.tool import ZapierNLARunAction
@@ -50,3 +53,234 @@ def test_custom_base_prompt_fail() -> None:
            base_prompt=base_prompt,
            api_wrapper=ZapierNLAWrapper(zapier_nla_api_key="test"),
        )


def test_format_headers_api_key() -> None:
    """Test that the action headers is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(zapier_nla_api_key="test"),
    )
    headers = tool.api_wrapper._format_headers()
    assert headers["Content-Type"] == "application/json"
    assert headers["Accept"] == "application/json"
    assert headers["X-API-Key"] == "test"


def test_format_headers_access_token() -> None:
    """Test that the action headers is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(zapier_nla_oauth_access_token="test"),
    )
    headers = tool.api_wrapper._format_headers()
    assert headers["Content-Type"] == "application/json"
    assert headers["Accept"] == "application/json"
    assert headers["Authorization"] == "Bearer test"


def test_create_action_payload() -> None:
    """Test that the action payload is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(zapier_nla_api_key="test"),
    )

    payload = tool.api_wrapper._create_action_payload("some instructions")
    assert payload["instructions"] == "some instructions"
    assert payload.get("preview_only") is None


def test_create_action_payload_preview() -> None:
    """Test that the action payload with preview is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(zapier_nla_api_key="test"),
    )

    payload = tool.api_wrapper._create_action_payload(
        "some instructions",
        preview_only=True,
    )
    assert payload["instructions"] == "some instructions"
    assert payload["preview_only"] is True


def test_create_action_payload_with_params() -> None:
    """Test that the action payload with params is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(zapier_nla_api_key="test"),
    )

    payload = tool.api_wrapper._create_action_payload(
        "some instructions",
        {"test": "test"},
        preview_only=True,
    )
    assert payload["instructions"] == "some instructions"
    assert payload["preview_only"] is True
    assert payload["test"] == "test"


@pytest.mark.asyncio
async def test_apreview(mocker) -> None:  # type: ignore[no-untyped-def]
    """Test that the action payload with params is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(
            zapier_nla_api_key="test",
            zapier_nla_api_base="http://localhost:8080/v1/",
        ),
    )
    mockObj = mocker.patch.object(ZapierNLAWrapper, "_arequest")
    await tool.api_wrapper.apreview(
        "random_action_id",
        "some instructions",
        {"test": "test"},
    )
    mockObj.assert_called_once_with(
        "POST",
        "http://localhost:8080/v1/exposed/random_action_id/execute/",
        json={
            "instructions": "some instructions",
            "preview_only": True,
            "test": "test",
        },
    )


@pytest.mark.asyncio
async def test_arun(mocker) -> None:  # type: ignore[no-untyped-def]
    """Test that the action payload with params is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(
            zapier_nla_api_key="test",
            zapier_nla_api_base="http://localhost:8080/v1/",
        ),
    )
    mockObj = mocker.patch.object(ZapierNLAWrapper, "_arequest")
    await tool.api_wrapper.arun(
        "random_action_id",
        "some instructions",
        {"test": "test"},
    )
    mockObj.assert_called_once_with(
        "POST",
        "http://localhost:8080/v1/exposed/random_action_id/execute/",
        json={"instructions": "some instructions", "test": "test"},
    )


@pytest.mark.asyncio
async def test_alist(mocker) -> None:  # type: ignore[no-untyped-def]
    """Test that the action payload with params is being created correctly."""
    tool = ZapierNLARunAction(
        action_id="test",
        zapier_description="test",
        params_schema={"test": "test"},
        api_wrapper=ZapierNLAWrapper(
            zapier_nla_api_key="test",
            zapier_nla_api_base="http://localhost:8080/v1/",
        ),
    )
    mockObj = mocker.patch.object(ZapierNLAWrapper, "_arequest")
    await tool.api_wrapper.alist()
    mockObj.assert_called_once_with(
        "GET",
        "http://localhost:8080/v1/exposed/",
    )


def test_wrapper_fails_no_api_key_or_access_token_initialization() -> None:
    """Test Wrapper requires either an API Key or OAuth Access Token."""
    with pytest.raises(ValueError):
        ZapierNLAWrapper()


def test_wrapper_api_key_initialization() -> None:
    """Test Wrapper initializes with an API Key."""
    ZapierNLAWrapper(zapier_nla_api_key="test")


def test_wrapper_access_token_initialization() -> None:
    """Test Wrapper initializes with an API Key."""
    ZapierNLAWrapper(zapier_nla_oauth_access_token="test")


def test_list_raises_401_invalid_api_key() -> None:
    """Test that a valid error is raised when the API Key is invalid."""
    mock_response = MagicMock()
    mock_response.status_code = 401
    mock_response.raise_for_status.side_effect = requests.HTTPError(
        "401 Client Error: Unauthorized for url: https://nla.zapier.com/api/v1/exposed/"
    )
    mock_session = MagicMock()
    mock_session.get.return_value = mock_response

    with patch("requests.Session", return_value=mock_session):
        wrapper = ZapierNLAWrapper(zapier_nla_api_key="test")

        with pytest.raises(requests.HTTPError) as err:
            wrapper.list()

        assert str(err.value).startswith(
            "An unauthorized response occurred. Check that your api key is correct. "
            "Err:"
        )


def test_list_raises_401_invalid_access_token() -> None:
    """Test that a valid error is raised when the API Key is invalid."""
    mock_response = MagicMock()
    mock_response.status_code = 401
    mock_response.raise_for_status.side_effect = requests.HTTPError(
        "401 Client Error: Unauthorized for url: https://nla.zapier.com/api/v1/exposed/"
    )
    mock_session = MagicMock()
    mock_session.get.return_value = mock_response

    with patch("requests.Session", return_value=mock_session):
        wrapper = ZapierNLAWrapper(zapier_nla_oauth_access_token="test")

        with pytest.raises(requests.HTTPError) as err:
            wrapper.list()

        assert str(err.value).startswith(
            "An unauthorized response occurred. Check that your access token is "
            "correct and doesn't need to be refreshed. Err:"
        )


def test_list_raises_other_error() -> None:
    """Test that a valid error is raised when an unknown HTTP Error occurs."""
    mock_response = MagicMock()
    mock_response.status_code = 404
    mock_response.raise_for_status.side_effect = requests.HTTPError(
        "404 Client Error: Not found for url"
    )
    mock_session = MagicMock()
    mock_session.get.return_value = mock_response

    with patch("requests.Session", return_value=mock_session):
        wrapper = ZapierNLAWrapper(zapier_nla_oauth_access_token="test")

        with pytest.raises(requests.HTTPError) as err:
            wrapper.list()

        assert str(err.value) == "404 Client Error: Not found for url"
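The new tests above exercise `ZapierNLAWrapper` directly, so the same surface area in ordinary use looks roughly like the sketch below. The method names and signatures mirror the tests; the import path is the usual `langchain.utilities.zapier` location, and the API key, action id, instructions, and params are placeholders.

```python
import asyncio

from langchain.utilities.zapier import ZapierNLAWrapper

# Placeholder key; use your own Zapier NLA API key (or an OAuth access token).
wrapper = ZapierNLAWrapper(zapier_nla_api_key="my-nla-api-key")

# List the actions exposed for this credential (the call the 401 tests mock out).
for action in wrapper.list():
    print(action)


# Async preview of an action, mirroring test_apreview: same arguments, preview only.
async def preview_example() -> None:
    result = await wrapper.apreview(
        "my-action-id",           # placeholder action id
        "some instructions",      # natural-language instructions
        {"test": "test"},         # placeholder params, merged into the payload
    )
    print(result)


asyncio.run(preview_example())
```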