mirror of
https://github.com/hwchase17/langchain.git
synced 2026-04-20 22:08:07 +00:00
Compare commits: v0.0.214 ... vwp/use_la (29 commits)
Commit SHA1s in this range:

9696d9a51c, c460b04c64, b3f8324de9, 70f7c2bb2e, 9ca3b4645e, d1bcc58beb,
6d30acffcb, ba622764cb, ec8247ec59, d84a3bcf7a, a15afc102c, cc33bde74f,
2aeb8e7dbc, 0f6ef048d2, fe941cb54a, 9187d2f3a9, e9877ea8b1, f9771700e4,
87802c86d9, 05eec99269, be68f6f8ce, b32cc01c9f, afc292e58d, 3e30a5d967,
9d1b3bab76, 408c8d0178, d89e10d361, 1742db0c30, e1b801be36
@@ -25,7 +25,7 @@ There are two ways to set up parameters for myscale index.

 1. Environment Variables

 Before you run the app, please set the environment variable with `export`:
-`export MYSCALE_URL='<your-endpoints-url>' MYSCALE_PORT=<your-endpoints-port> MYSCALE_USERNAME=<your-username> MYSCALE_PASSWORD=<your-password> ...`
+`export MYSCALE_HOST='<your-endpoints-url>' MYSCALE_PORT=<your-endpoints-port> MYSCALE_USERNAME=<your-username> MYSCALE_PASSWORD=<your-password> ...`

 You can easily find your account, password and other info on our SaaS. For details please refer to [this document](https://docs.myscale.com/en/cluster-management/)

 Every attribute under `MyScaleSettings` can be set with the prefix `MYSCALE_` and is case-insensitive.
264 docs/extras/guides/evaluation/criteria_eval_chain.ipynb Normal file
@@ -0,0 +1,264 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "4cf569a7-9a1d-4489-934e-50e57760c907",
   "metadata": {},
   "source": [
    "# Evaluating Custom Criteria\n",
    "\n",
    "Suppose you want to test a model's output against a custom rubric or custom set of criteria. How would you go about testing this?\n",
    "\n",
    "The `CriteriaEvalChain` is a convenient way to predict whether an LLM or Chain's output complies with a set of criteria, so long as you can\n",
    "describe those criteria in regular language. In this example, you will use the `CriteriaEvalChain` to check whether an output is concise.\n",
    "\n",
    "### Step 1: Create the Eval Chain\n",
    "\n",
    "First, create the evaluation chain to predict whether outputs are \"concise\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "6005ebe8-551e-47a5-b4df-80575a068552",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.chat_models import ChatOpenAI\n",
    "from langchain.evaluation.criteria import CriteriaEvalChain\n",
    "\n",
    "llm = ChatOpenAI(temperature=0)\n",
    "criterion = \"conciseness\"\n",
    "eval_chain = CriteriaEvalChain.from_llm(llm=llm, criteria=criterion)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eaef0d93-e080-4be2-a0f1-701b0d91fcf4",
   "metadata": {},
   "source": [
    "### Step 2: Make Prediction\n",
    "\n",
    "Run the model on a query to produce an output to measure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "68b1a348-cf41-40bf-9667-e79683464cf2",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "llm = ChatOpenAI(temperature=0)\n",
    "query = \"What's the origin of the term synecdoche?\"\n",
    "prediction = llm.predict(query)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f45ed40e-09c4-44dc-813d-63a4ffb2d2ea",
   "metadata": {},
   "source": [
    "### Step 3: Evaluate Prediction\n",
    "\n",
    "Determine whether the prediction conforms to the criteria."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "22f83fb8-82f4-4310-a877-68aaa0789199",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'reasoning': '1. Conciseness: The submission is concise and to the point. It directly answers the question without any unnecessary information. Therefore, the submission meets the criterion of conciseness.\\n\\nY', 'value': 'Y', 'score': 1}\n"
     ]
    }
   ],
   "source": [
    "eval_result = eval_chain.evaluate_strings(prediction=prediction, input=query)\n",
    "print(eval_result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "8c4ec9dd-6557-4f23-8480-c822eb6ec552",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['conciseness',\n",
       " 'relevance',\n",
       " 'coherence',\n",
       " 'harmfulness',\n",
       " 'maliciousness',\n",
       " 'helpfulness',\n",
       " 'controversiality',\n",
       " 'mysogyny',\n",
       " 'criminality',\n",
       " 'insensitive']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# For a list of other supported default criteria, try calling `get_supported_default_criteria`\n",
    "CriteriaEvalChain.get_supported_default_criteria()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2eb7dedb-913a-4d9e-b48a-9521425d1008",
   "metadata": {},
   "source": [
    "## Multiple Criteria\n",
    "\n",
    "To check whether an output complies with all of a list of default criteria, pass in a list! Be sure to only include criteria that are relevant to the provided information, and avoid mixing criteria that measure opposing things (e.g., harmfulness and helpfulness)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "50c067f7-bc6e-4d6c-ba34-97a72023be27",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'reasoning': 'Conciseness: The submission is not concise and does not answer the given task. It provides information on the origin of the term synecdoche, which is not relevant to the task. Therefore, the submission does not meet the criterion of conciseness.\\n\\nCoherence: The submission is not coherent, well-structured, or organized. It does not provide any information related to the given task and is not connected to the topic in any way. Therefore, the submission does not meet the criterion of coherence.\\n\\nConclusion: The submission does not meet all criteria.', 'value': 'N', 'score': 0}\n"
     ]
    }
   ],
   "source": [
    "criteria = [\"conciseness\", \"coherence\"]\n",
    "eval_chain = CriteriaEvalChain.from_llm(llm=llm, criteria=criteria)\n",
    "eval_result = eval_chain.evaluate_strings(prediction=prediction, input=query)\n",
    "print(eval_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "077c4715-e857-44a3-9f87-346642586a8d",
   "metadata": {},
   "source": [
    "## Custom Criteria\n",
    "\n",
    "To evaluate outputs against your own custom criteria, or to be more explicit about the definition of any of the default criteria, pass in a dictionary of `\"criterion_name\": \"criterion_description\"` pairs.\n",
    "\n",
    "Note: the evaluator still predicts whether the output complies with ALL of the criteria provided. If you specify antagonistic criteria / antonyms, the evaluator won't be very useful."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "bafa0a11-2617-4663-84bf-24df7d0736be",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'reasoning': '1. Criteria: numeric: Does the output contain numeric information?\\n- The submission does not contain any numeric information.\\n- Conclusion: The submission meets the criteria.', 'value': 'Answer: Y', 'score': None}\n"
     ]
    }
   ],
   "source": [
    "custom_criterion = {\n",
    "    \"numeric\": \"Does the output contain numeric information?\"\n",
    "}\n",
    "\n",
    "eval_chain = CriteriaEvalChain.from_llm(llm=llm, criteria=custom_criterion)\n",
    "eval_result = eval_chain.evaluate_strings(prediction=prediction, input=query)\n",
    "print(eval_result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "6db12a16-0058-4a14-8064-8528540963d8",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'reasoning': '- complements-user: The submission directly answers the question asked and provides additional information about the population of Lagos. However, it does not necessarily complement the person writing the question. \\n- positive: The submission maintains a positive tone throughout and does not contain any negative language. \\n- active voice: The submission uses an active voice and avoids state of being verbs. \\n\\nTherefore, the submission meets all criteria. \\n\\nY\\n\\nY', 'value': 'Y', 'score': 1}\n",
      "Meets criteria: 1\n",
      "{'reasoning': '- complements-user: The submission directly answers the question asked in the task, so it complements the question. Therefore, the answer meets this criterion. \\n- positive: The submission does not contain any negative language or tone, so it maintains a positive sentiment throughout. Therefore, the answer meets this criterion. \\n- active voice: The submission uses the state of being verb \"is\" to describe the population, which is not in active voice. Therefore, the answer does not meet this criterion. \\n\\nAnswer: N', 'value': 'N', 'score': 0}\n",
      "Does not meet criteria: 0\n"
     ]
    }
   ],
   "source": [
    "# You can specify multiple criteria in the dictionary. For more reliable results, however, we recommend keeping the number of criteria to a minimum.\n",
    "\n",
    "custom_criteria = {\n",
    "    \"complements-user\": \"Does the submission complement the question or the person writing the question in some way?\",\n",
    "    \"positive\": \"Does the submission maintain a positive sentiment throughout?\",\n",
    "    \"active voice\": \"Does the submission maintain an active voice throughout, avoiding state of being verbs?\",\n",
    "}\n",
    "\n",
    "eval_chain = CriteriaEvalChain.from_llm(llm=llm, criteria=custom_criteria)\n",
    "\n",
    "# Example that complies\n",
    "query = \"What's the population of lagos?\"\n",
    "eval_result = eval_chain.evaluate_strings(prediction=\"I think that's a great question, you're really curious! About 30 million people live in Lagos, Nigeria, as of 2023.\", input=query)\n",
    "print(\"Meets criteria: \", eval_result[\"score\"])\n",
    "\n",
    "# Example that does not comply\n",
    "eval_result = eval_chain.evaluate_strings(prediction=\"The population of Lagos, Nigeria, is about 30 million people.\", input=query)\n",
    "print(\"Does not meet criteria: \", eval_result[\"score\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "99e3c242-5b12-4bd5-b487-64990a159655",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
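Each `evaluate_strings` call in the notebook above returns a dict with `reasoning`, a `'Y'`/`'N'` `value`, and a binary `score` (or `None` when the verdict could not be parsed, as in the custom-criterion cell). A minimal sketch, assuming that output shape, of aggregating many such results:

```python
# Aggregate criteria-eval results into a pass rate, skipping results
# whose score could not be parsed (score is None).
def pass_rate(results):
    scores = [r["score"] for r in results if r.get("score") is not None]
    return sum(scores) / len(scores) if scores else None

results = [
    {"value": "Y", "score": 1},
    {"value": "N", "score": 0},
    {"value": "Answer: Y", "score": None},  # unparsed verdict, skipped
    {"value": "Y", "score": 1},
]
print(pass_rate(results))  # 2 of the 3 parsed results pass
```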
238 docs/extras/modules/agents/toolkits/office365.ipynb Normal file
@@ -0,0 +1,238 @@
{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Office365 Toolkit\n",
    "\n",
    "This notebook walks through connecting LangChain to Office365 email and calendar.\n",
    "\n",
    "To use this toolkit, you will need to set up your credentials as explained in the [Microsoft Graph authentication and authorization overview](https://learn.microsoft.com/en-us/graph/auth/). Once you've received a CLIENT_ID and CLIENT_SECRET, you can input them as environmental variables below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --upgrade O365 > /dev/null\n",
    "!pip install beautifulsoup4 > /dev/null  # This is optional but is useful for parsing HTML messages"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Assign Environmental Variables\n",
    "\n",
    "The toolkit will read the CLIENT_ID and CLIENT_SECRET environmental variables to authenticate the user, so you need to set them here. You will also need to set your OPENAI_API_KEY to use the agent later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set environmental variables here"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create the Toolkit and Get Tools\n",
    "\n",
    "To start, you need to create the toolkit, so you can access its tools later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[O365SearchEvents(name='events_search', description=\" Use this tool to search for the user's calendar events. The input must be the start and end datetimes for the search query. The output is a JSON list of all the events in the user's calendar between the start and end times. You can assume that the user can not schedule any meeting over existing meetings, and that the user is busy during meetings. Any times without events are free for the user. \", args_schema=<class 'langchain.tools.office365.events_search.SearchEventsInput'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, handle_tool_error=False, account=Account Client Id: f32a022c-3c4c-4d10-a9d8-f6a9a9055302),\n",
       " O365CreateDraftMessage(name='create_email_draft', description='Use this tool to create a draft email with the provided message fields.', args_schema=<class 'langchain.tools.office365.create_draft_message.CreateDraftMessageSchema'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, handle_tool_error=False, account=Account Client Id: f32a022c-3c4c-4d10-a9d8-f6a9a9055302),\n",
       " O365SearchEmails(name='messages_search', description='Use this tool to search for email messages. The input must be a valid Microsoft Graph v1.0 $search query. The output is a JSON list of the requested resource.', args_schema=<class 'langchain.tools.office365.messages_search.SearchEmailsInput'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, handle_tool_error=False, account=Account Client Id: f32a022c-3c4c-4d10-a9d8-f6a9a9055302),\n",
       " O365SendEvent(name='send_event', description='Use this tool to create and send an event with the provided event fields.', args_schema=<class 'langchain.tools.office365.send_event.SendEventSchema'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, handle_tool_error=False, account=Account Client Id: f32a022c-3c4c-4d10-a9d8-f6a9a9055302),\n",
       " O365SendMessage(name='send_email', description='Use this tool to send an email with the provided message fields.', args_schema=<class 'langchain.tools.office365.send_message.SendMessageSchema'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, handle_tool_error=False, account=Account Client Id: f32a022c-3c4c-4d10-a9d8-f6a9a9055302)]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain.agents.agent_toolkits import O365Toolkit\n",
    "\n",
    "toolkit = O365Toolkit()\n",
    "tools = toolkit.get_tools()\n",
    "tools"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Use within an Agent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain import OpenAI\n",
    "from langchain.agents import initialize_agent, AgentType"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "llm = OpenAI(temperature=0)\n",
    "agent = initialize_agent(\n",
    "    tools=toolkit.get_tools(),\n",
    "    llm=llm,\n",
    "    verbose=False,\n",
    "    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'The draft email was created correctly.'"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent.run(\"Create an email draft for me to edit of a letter from the perspective of a sentient parrot\"\n",
    "          \" who is looking to collaborate on some research with her\"\n",
    "          \" estranged friend, a cat. Under no circumstances may you send the message, however.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"I found one draft in your drafts folder about collaboration. It was sent on 2023-06-16T18:22:17+0000 and the subject was 'Collaboration Request'.\""
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent.run(\"Could you search in my drafts folder and let me know if any of them are about collaboration?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/vscode/langchain-py-env/lib/python3.11/site-packages/O365/utils/windows_tz.py:639: PytzUsageWarning: The zone attribute is specific to pytz's interface; please migrate to a new time zone provider. For more details on how to do so, see https://pytz-deprecation-shim.readthedocs.io/en/latest/migration.html\n",
      "  iana_tz.zone if isinstance(iana_tz, tzinfo) else iana_tz)\n",
      "/home/vscode/langchain-py-env/lib/python3.11/site-packages/O365/utils/utils.py:463: PytzUsageWarning: The zone attribute is specific to pytz's interface; please migrate to a new time zone provider. For more details on how to do so, see https://pytz-deprecation-shim.readthedocs.io/en/latest/migration.html\n",
      "  timezone = date_time.tzinfo.zone if date_time.tzinfo is not None else None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'I have scheduled a meeting with a sentient parrot to discuss research collaborations on October 3, 2023 at 2 pm Eastern Time. Please let me know if you need to make any changes.'"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent.run(\"Can you schedule a 30 minute meeting with a sentient parrot to discuss research collaborations on October 3, 2023 at 2 pm Eastern Time?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"Yes, you have an event on October 3, 2023 with a sentient parrot. The event is titled 'Meeting with sentient parrot' and is scheduled from 6:00 PM to 6:30 PM.\""
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent.run(\"Can you tell me if I have any events on October 3, 2023 in Eastern Time, and if so, tell me if any of them are with a sentient parrot?\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
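The toolkit above expects CLIENT_ID and CLIENT_SECRET (and, for the agent, OPENAI_API_KEY) in the environment. A minimal sketch of setting them from Python before constructing the toolkit — all values are placeholders:

```python
import os

# Placeholders only -- substitute your real Microsoft Graph app
# credentials and OpenAI key before running the toolkit.
os.environ["CLIENT_ID"] = "<your-client-id>"
os.environ["CLIENT_SECRET"] = "<your-client-secret>"
os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"

# Confirm the credentials the toolkit will read are present.
print(all(k in os.environ for k in ("CLIENT_ID", "CLIENT_SECRET")))  # True
```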
@@ -0,0 +1,28 @@
Example Docs
------------

The sample docs directory contains the following files:

- ``example-10k.html`` - A 10-K SEC filing in HTML format
- ``layout-parser-paper.pdf`` - A PDF copy of the layout parser paper
- ``factbook.xml``/``factbook.xsl`` - Example XML/XSL files that you
  can use to test stylesheets

These documents can be used to test out the parsers in the library. In
addition, here are instructions for pulling in some sample docs that are
too big to store in the repo.

XBRL 10-K
^^^^^^^^^

You can get an example 10-K in inline XBRL format using the following
``curl``. Note, you need to have the user agent set in the header or the
SEC site will reject your request.

.. code:: bash

   curl -O \
     -A '${organization} ${email}' \
     https://www.sec.gov/Archives/edgar/data/311094/000117184321001344/0001171843-21-001344.txt

You can parse this document using the HTML parser.
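The same download can be sketched in Python using only the standard library; as in the ``curl`` example, the User-Agent value is a placeholder you must fill in:

```python
import urllib.request

URL = ("https://www.sec.gov/Archives/edgar/data/311094/"
       "000117184321001344/0001171843-21-001344.txt")

# The SEC rejects requests that lack an identifying User-Agent header,
# so attach one before fetching.
req = urllib.request.Request(URL, headers={"User-Agent": "<organization> <email>"})

# Uncomment to actually fetch the filing:
# with urllib.request.urlopen(req) as resp:
#     data = resp.read()
```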
@@ -0,0 +1,71 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "87067cdf",
   "metadata": {},
   "source": [
    "# mhtml\n",
    "\n",
    "MHTML, sometimes referred to as MHT, stands for MIME HTML and is a single-file format in which an entire webpage is archived; it is used both for emails and for archived webpages. When one saves a webpage in MHTML format, the file contains the HTML code as well as images, audio files, Flash animation, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5d4c6174",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders import MHTMLLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "12dcebc8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "page_content='LangChain\\nLANG CHAIN 🦜️🔗Official Home Page\\xa0\\n\\n\\n\\n\\n\\n\\n\\nIntegrations\\n\\n\\n\\nFeatures\\n\\n\\n\\n\\nBlog\\n\\n\\n\\nConceptual Guide\\n\\n\\n\\n\\nPython Repo\\n\\n\\nJavaScript Repo\\n\\n\\n\\nPython Documentation \\n\\n\\nJavaScript Documentation\\n\\n\\n\\n\\nPython ChatLangChain \\n\\n\\nJavaScript ChatLangChain\\n\\n\\n\\n\\nDiscord \\n\\n\\nTwitter\\n\\n\\n\\n\\nIf you have any comments about our WEB page, you can \\nwrite us at the address shown above. However, due to \\nthe limited number of personnel in our corporate office, we are unable to \\nprovide a direct response.\\n\\nCopyright © 2023-2023 LangChain Inc.\\n\\n\\n' metadata={'source': '../../../../../../tests/integration_tests/examples/example.mht', 'title': 'LangChain'}\n"
     ]
    }
   ],
   "source": [
    "# Create a new loader object for the MHTML file\n",
    "loader = MHTMLLoader(file_path='../../../../../../tests/integration_tests/examples/example.mht')\n",
    "\n",
    "# Load the document from the file\n",
    "documents = loader.load()\n",
    "\n",
    "# Print the documents to see the results\n",
    "for doc in documents:\n",
    "    print(doc)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
@@ -0,0 +1,88 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# RST\n",
    "\n",
    ">A [reStructuredText (RST)](https://en.wikipedia.org/wiki/ReStructuredText) file is a file format for textual data used primarily in the Python programming language community for technical documentation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## `UnstructuredRSTLoader`\n",
    "\n",
    "You can load data from RST files with `UnstructuredRSTLoader` using the following workflow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders import UnstructuredRSTLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "loader = UnstructuredRSTLoader(\n",
    "    file_path=\"example_data/README.rst\", mode=\"elements\"\n",
    ")\n",
    "docs = loader.load()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "page_content='Example Docs' metadata={'source': 'example_data/README.rst', 'filename': 'README.rst', 'file_directory': 'example_data', 'filetype': 'text/x-rst', 'page_number': 1, 'category': 'Title'}\n"
     ]
    }
   ],
   "source": [
    "print(docs[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
@@ -491,6 +491,73 @@
|
||||
"source": [
|
||||
"retriever.get_relevant_documents(query)[0]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "275dbd0a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Filtering on metadata\n",
|
||||
"\n",
|
||||
"It can be helpful to narrow down the collection before working with it.\n",
|
||||
"\n",
|
||||
"For example, collections can be filtered on metadata using the get method."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"id": "a5119221",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"{'source': 'some_other_source'}\n",
|
||||
"{'ids': ['1'], 'embeddings': None, 'documents': ['Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.'], 'metadatas': [{'source': 'some_other_source'}]}\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# create simple ids\n",
|
||||
"ids = [str(i) for i in range(1, len(docs) + 1)]\n",
|
||||
"\n",
|
||||
"# add data\n",
|
||||
"example_db = Chroma.from_documents(docs, embedding_function, ids=ids)\n",
|
||||
"docs = example_db.similarity_search(query)\n",
|
||||
"print(docs[0].metadata)\n",
|
||||
"\n",
|
||||
"# update the source for a document\n",
|
||||
"docs[0].metadata = {\"source\": \"some_other_source\"}\n",
|
||||
"example_db.update_document(ids[0], docs[0])\n",
|
||||
"print(example_db._collection.get(ids=[ids[0]]))"
|
||||
]
|
||||
},
|
||||
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "81600dc1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'ids': ['1'],\n",
       " 'embeddings': None,\n",
       " 'documents': ['Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.'],\n",
       " 'metadatas': [{'source': 'some_other_source'}]}"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# filter collection for updated source\n",
    "example_db.get(where={\"source\": \"some_other_source\"})"
   ]
  }
 ],
 "metadata": {
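The notebook cells above update a stored document's metadata and then filter the collection by that metadata. A runnable sketch of the same update-then-filter flow, with a plain dict standing in for the Chroma collection (the real API calls are the ones shown in the cells):

```python
# Sketch of update_document + get(where=...) semantics; a dict store
# stands in for Chroma, which requires the chromadb package to run.
store = {"1": {"text": "Tonight. I call on the Senate...", "source": "state_of_the_union.txt"}}

def update_document(doc_id, metadata):
    """Merge new metadata into a stored document."""
    store[doc_id].update(metadata)

def get(where):
    """Return all documents whose fields match every key in `where`."""
    return [d for d in store.values()
            if all(d.get(k) == v for k, v in where.items())]

update_document("1", {"source": "some_other_source"})
assert get({"source": "some_other_source"})[0]["source"] == "some_other_source"
```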
langchain/agents/agent_toolkits/office365/__init__.py (new file, 1 line)
@@ -0,0 +1 @@
+"""Office365 toolkit."""
langchain/agents/agent_toolkits/office365/toolkit.py (new file, 38 lines)
@@ -0,0 +1,38 @@
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, List
+
+from pydantic import Field
+
+from langchain.agents.agent_toolkits.base import BaseToolkit
+from langchain.tools import BaseTool
+from langchain.tools.office365.create_draft_message import O365CreateDraftMessage
+from langchain.tools.office365.events_search import O365SearchEvents
+from langchain.tools.office365.messages_search import O365SearchEmails
+from langchain.tools.office365.send_event import O365SendEvent
+from langchain.tools.office365.send_message import O365SendMessage
+from langchain.tools.office365.utils import authenticate
+
+if TYPE_CHECKING:
+    from O365 import Account
+
+
+class O365Toolkit(BaseToolkit):
+    """Toolkit for interacting with Office365."""
+
+    account: Account = Field(default_factory=authenticate)
+
+    class Config:
+        """Pydantic config."""
+
+        arbitrary_types_allowed = True
+
+    def get_tools(self) -> List[BaseTool]:
+        """Get the tools in the toolkit."""
+        return [
+            O365SearchEvents(account=self.account),
+            O365CreateDraftMessage(account=self.account),
+            O365SearchEmails(account=self.account),
+            O365SendEvent(account=self.account),
+            O365SendMessage(account=self.account),
+        ]
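The toolkit above bundles several tools that share one authenticated account. The general shape of that pattern can be sketched without any O365 dependency (the `Toolkit` class below is a stand-in, not the real `O365Toolkit`, which needs live credentials via `authenticate`):

```python
# Generic toolkit pattern: one shared session, many tools built from it.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Toolkit:
    account: str = "shared-session"  # stand-in for an authenticated O365 Account

    def get_tools(self) -> List[Callable[[], str]]:
        # Each tool closes over the same account, as in O365Toolkit.get_tools().
        return [lambda: f"search:{self.account}", lambda: f"send:{self.account}"]

tools = Toolkit().get_tools()
assert [t() for t in tools] == ["search:shared-session", "send:shared-session"]
```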
@@ -17,13 +17,15 @@ class ChatOutputParser(AgentOutputParser):
         try:
             action = text.split("```")[1]
             response = json.loads(action.strip())
-            includes_action = "action" in response and "action_input" in response
+            includes_action = "action" in response
             if includes_answer and includes_action:
                 raise OutputParserException(
                     "Parsing LLM output produced a final answer "
                     f"and a parse-able action: {text}"
                 )
-            return AgentAction(response["action"], response["action_input"], text)
+            return AgentAction(
+                response["action"], response.get("action_input", {}), text
+            )

         except Exception:
             if not includes_answer:
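The hunk above makes `action_input` optional: a response containing only `"action"` now parses, with the input defaulting to `{}` via `response.get("action_input", {})`. A standalone sketch of that behavior (`parse_action` is a hypothetical helper, not the real class method):

```python
# Sketch of the parser change: "action_input" is optional and defaults to {}.
import json

FENCE = "`" * 3  # avoid writing a literal code fence inside this example

def parse_action(text: str):
    action = text.split(FENCE)[1]
    response = json.loads(action.strip())
    return response["action"], response.get("action_input", {})

text = FENCE + '\n{"action": "Search"}\n' + FENCE
assert parse_action(text) == ("Search", {})  # old code raised KeyError here
```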
@@ -51,7 +51,7 @@ def initialize_agent(
                 f"Got unknown agent type: {agent}. "
                 f"Valid types are: {AGENT_TO_CLASS.keys()}."
             )
-        tags_.append(agent.value)
+        tags_.append(agent.value if isinstance(agent, AgentType) else agent)
         agent_cls = AGENT_TO_CLASS[agent]
         agent_kwargs = agent_kwargs or {}
         agent_obj = agent_cls.from_llm_and_tools(
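This hunk fixes a crash when `agent` is a plain string rather than an `AgentType` enum member: only enums have `.value`. The guard can be demonstrated in isolation (`AgentTypeish` below is a stand-in for langchain's `AgentType`):

```python
# Sketch of the tags fix: string agent identifiers no longer crash on `.value`.
from enum import Enum

class AgentTypeish(str, Enum):
    ZERO_SHOT = "zero-shot-react-description"

def tag_for(agent):
    # The fixed line from initialize_agent, extracted for illustration.
    return agent.value if isinstance(agent, AgentTypeish) else agent

assert tag_for(AgentTypeish.ZERO_SHOT) == "zero-shot-react-description"
assert tag_for("my-custom-agent") == "my-custom-agent"  # old code raised here
```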
@@ -69,7 +69,7 @@ def _create_function_message(
     """
     if not isinstance(observation, str):
        try:
-            content = json.dumps(observation)
+            content = json.dumps(observation, ensure_ascii=False)
        except Exception:
            content = str(observation)
    else:
@@ -68,7 +68,7 @@ def _create_function_message(
     """
     if not isinstance(observation, str):
        try:
-            content = json.dumps(observation)
+            content = json.dumps(observation, ensure_ascii=False)
        except Exception:
            content = str(observation)
    else:
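The two identical hunks above add `ensure_ascii=False` so non-ASCII tool observations stay readable instead of being `\u`-escaped before they are fed back to the model:

```python
# What ensure_ascii=False changes for a non-ASCII observation.
import json

obs = {"city": "東京"}
assert json.dumps(obs) == '{"city": "\\u6771\\u4eac"}'            # escaped (old)
assert json.dumps(obs, ensure_ascii=False) == '{"city": "東京"}'  # readable (new)
```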
@@ -296,7 +296,7 @@ class OpenAIMultiFunctionsAgent(BaseMultiActionAgent):
         prompt = self.prompt.format_prompt(**full_inputs)
         messages = prompt.to_messages()
         predicted_message = await self.llm.apredict_messages(
-            messages, functions=self.functions
+            messages, functions=self.functions, callbacks=callbacks
         )
         agent_decision = _parse_ai_message(predicted_message)
         return agent_decision
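The hunk above threads the agent's `callbacks` through to `apredict_messages`, so the nested LLM call reports to the same handlers as the outer run. A toy illustration of that propagation (all names below are hypothetical stand-ins, not langchain APIs):

```python
# Toy sketch: passing `callbacks` down means the inner call's events land
# in the caller's handler list instead of being silently dropped.
import asyncio

async def apredict_messages(messages, functions=None, callbacks=None):
    if callbacks is not None:
        callbacks.append("llm_start")  # inner call reports to outer handlers
    return "ok"

async def plan(callbacks):
    return await apredict_messages(["hi"], functions=[], callbacks=callbacks)

events: list = []
assert asyncio.run(plan(events)) == "ok"
assert events == ["llm_start"]
```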
@@ -226,7 +226,7 @@ class RedisCache(BaseCache):
     def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
         """Update cache based on prompt and llm_string."""
         for gen in return_val:
-            if not isinstance(return_val, Generation):
+            if not isinstance(gen, Generation):
                 raise ValueError(
                     "RedisCache only supports caching of normal LLM generations, "
                     f"got {type(gen)}"
@@ -337,7 +337,7 @@ class RedisSemanticCache(BaseCache):
     def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
         """Update cache based on prompt and llm_string."""
         for gen in return_val:
-            if not isinstance(return_val, Generation):
+            if not isinstance(gen, Generation):
                 raise ValueError(
                     "RedisSemanticCache only supports caching of "
                     f"normal LLM generations, got {type(gen)}"
@@ -455,7 +455,7 @@ class GPTCache(BaseCache):
         and then store the `prompt` and `return_val` in the cache object.
         """
         for gen in return_val:
-            if not isinstance(return_val, Generation):
+            if not isinstance(gen, Generation):
                 raise ValueError(
                     "GPTCache only supports caching of normal LLM generations, "
                     f"got {type(gen)}"
@@ -628,7 +628,7 @@ class MomentoCache(BaseCache):
             Exception: Unexpected response
         """
         for gen in return_val:
-            if not isinstance(return_val, Generation):
+            if not isinstance(gen, Generation):
                 raise ValueError(
                     "Momento only supports caching of normal LLM generations, "
                     f"got {type(gen)}"
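All four cache hunks fix the same copy-pasted bug: the `isinstance` check inspected the whole `return_val` list instead of the current element `gen`, so every update with a valid list of generations raised. A minimal reproduction (using `str` as a stand-in for langchain's `Generation`):

```python
# Minimal reproduction of the bug fixed above.
def update_old(return_val):
    for gen in return_val:
        if not isinstance(return_val, str):  # bug: checks the list itself
            raise ValueError("bad generation")

def update_new(return_val):
    for gen in return_val:
        if not isinstance(gen, str):  # fix: checks each element
            raise ValueError("bad generation")

update_new(["a", "b"])  # valid input passes
try:
    update_old(["a", "b"])
    raised = False
except ValueError:
    raised = True
assert raised  # old version rejected even valid input
```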
@@ -74,7 +74,16 @@ def _get_debug() -> bool:

 @contextmanager
 def get_openai_callback() -> Generator[OpenAICallbackHandler, None, None]:
-    """Get OpenAI callback handler in a context manager."""
+    """Get the OpenAI callback handler in a context manager.
+    which conveniently exposes token and cost information.
+
+    Returns:
+        OpenAICallbackHandler: The OpenAI callback handler.
+
+    Example:
+        >>> with get_openai_callback() as cb:
+        ...     # Use the OpenAI callback handler
+    """
     cb = OpenAICallbackHandler()
     openai_callback_var.set(cb)
     yield cb
@@ -85,7 +94,19 @@ def get_openai_callback() -> Generator[OpenAICallbackHandler, None, None]:
 def tracing_enabled(
     session_name: str = "default",
 ) -> Generator[TracerSessionV1, None, None]:
-    """Get Tracer in a context manager."""
+    """Get the Deprecated LangChainTracer in a context manager.
+
+    Args:
+        session_name (str, optional): The name of the session.
+            Defaults to "default".
+
+    Returns:
+        TracerSessionV1: The LangChainTracer session.
+
+    Example:
+        >>> with tracing_enabled() as session:
+        ...     # Use the LangChainTracer session
+    """
     cb = LangChainTracerV1()
     session = cast(TracerSessionV1, cb.load_session(session_name))
     tracing_callback_var.set(cb)
@@ -97,7 +118,19 @@ def tracing_enabled(
 def wandb_tracing_enabled(
     session_name: str = "default",
 ) -> Generator[None, None, None]:
-    """Get WandbTracer in a context manager."""
+    """Get the WandbTracer in a context manager.
+
+    Args:
+        session_name (str, optional): The name of the session.
+            Defaults to "default".
+
+    Returns:
+        None
+
+    Example:
+        >>> with wandb_tracing_enabled() as session:
+        ...     # Use the WandbTracer session
+    """
     cb = WandbTracer()
     wandb_tracing_callback_var.set(cb)
     yield None
@@ -110,7 +143,21 @@ def tracing_v2_enabled(
     *,
     example_id: Optional[Union[str, UUID]] = None,
 ) -> Generator[None, None, None]:
-    """Get the experimental tracer handler in a context manager."""
+    """Instruct LangChain to log all runs in context to LangSmith.
+
+    Args:
+        project_name (str, optional): The name of the project.
+            Defaults to "default".
+        example_id (str or UUID, optional): The ID of the example.
+            Defaults to None.
+
+    Returns:
+        None
+
+    Example:
+        >>> with tracing_v2_enabled():
+        ...     # LangChain code will automatically be traced
+    """
     # Issue a warning that this is experimental
     warnings.warn(
         "The tracing v2 API is in development. "
@@ -133,14 +180,36 @@ def trace_as_chain_group(
     *,
     project_name: Optional[str] = None,
     example_id: Optional[Union[str, UUID]] = None,
+    tags: Optional[List[str]] = None,
 ) -> Generator[CallbackManager, None, None]:
-    """Get a callback manager for a chain group in a context manager."""
+    """Get a callback manager for a chain group in a context manager.
+    Useful for grouping different calls together as a single run even if
+    they aren't composed in a single chain.
+
+    Args:
+        group_name (str): The name of the chain group.
+        project_name (str, optional): The name of the project.
+            Defaults to None.
+        example_id (str or UUID, optional): The ID of the example.
+            Defaults to None.
+        tags (List[str], optional): The inheritable tags to apply to all runs.
+            Defaults to None.
+
+    Returns:
+        CallbackManager: The callback manager for the chain group.
+
+    Example:
+        >>> with trace_as_chain_group("group_name") as manager:
+        ...     # Use the callback manager for the chain group
+        ...     llm.predict("Foo", callbacks=manager)
+    """
     cb = LangChainTracer(
         project_name=project_name,
         example_id=example_id,
     )
     cm = CallbackManager.configure(
         inheritable_callbacks=[cb],
+        inheritable_tags=tags,
     )

     run_manager = cm.on_chain_start({"name": group_name}, {})
@@ -154,14 +223,34 @@ async def atrace_as_chain_group(
     *,
     project_name: Optional[str] = None,
     example_id: Optional[Union[str, UUID]] = None,
+    tags: Optional[List[str]] = None,
 ) -> AsyncGenerator[AsyncCallbackManager, None]:
-    """Get a callback manager for a chain group in a context manager."""
+    """Get an async callback manager for a chain group in a context manager.
+    Useful for grouping different async calls together as a single run even if
+    they aren't composed in a single chain.
+
+    Args:
+        group_name (str): The name of the chain group.
+        project_name (str, optional): The name of the project.
+            Defaults to None.
+        example_id (str or UUID, optional): The ID of the example.
+            Defaults to None.
+        tags (List[str], optional): The inheritable tags to apply to all runs.
+            Defaults to None.
+    Returns:
+        AsyncCallbackManager: The async callback manager for the chain group.
+
+    Example:
+        >>> async with atrace_as_chain_group("group_name") as manager:
+        ...     # Use the async callback manager for the chain group
+        ...     await llm.apredict("Foo", callbacks=manager)
+    """
     cb = LangChainTracer(
         project_name=project_name,
         example_id=example_id,
     )
     cm = AsyncCallbackManager.configure(
-        inheritable_callbacks=[cb],
+        inheritable_callbacks=[cb], inheritable_tags=tags
     )

     run_manager = await cm.on_chain_start({"name": group_name}, {})
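The context managers documented above all follow one pattern: install a handler in a `ContextVar`, yield it, and restore the previous state on exit. A generic, dependency-free sketch of that pattern (the names below are illustrative, not langchain's):

```python
# Generic shape of get_openai_callback / tracing_enabled: a contextmanager
# that installs a handler in a ContextVar for the duration of the block.
from contextlib import contextmanager
from contextvars import ContextVar

handler_var: ContextVar = ContextVar("handler", default=None)

@contextmanager
def tracing_handler():
    handler = {"tokens": 0}
    token = handler_var.set(handler)  # visible to everything in this context
    try:
        yield handler
    finally:
        handler_var.reset(token)  # restore previous state on exit

with tracing_handler() as h:
    h["tokens"] += 7
    assert handler_var.get() is h
assert handler_var.get() is None  # handler removed after the block
```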
@@ -293,7 +382,18 @@ class BaseRunManager(RunManagerMixin):
         tags: List[str],
         inheritable_tags: List[str],
     ) -> None:
-        """Initialize run manager."""
+        """Initialize the run manager.
+
+        Args:
+            run_id (UUID): The ID of the run.
+            handlers (List[BaseCallbackHandler]): The list of handlers.
+            inheritable_handlers (List[BaseCallbackHandler]):
+                The list of inheritable handlers.
+            parent_run_id (UUID, optional): The ID of the parent run.
+                Defaults to None.
+            tags (List[str]): The list of tags.
+            inheritable_tags (List[str]): The list of inheritable tags.
+        """
         self.run_id = run_id
         self.handlers = handlers
         self.inheritable_handlers = inheritable_handlers
@@ -303,7 +403,11 @@ class BaseRunManager(RunManagerMixin):

     @classmethod
     def get_noop_manager(cls: Type[BRM]) -> BRM:
-        """Return a manager that doesn't perform any operations."""
+        """Return a manager that doesn't perform any operations.
+
+        Returns:
+            BaseRunManager: The noop manager.
+        """
         return cls(
             run_id=uuid4(),
             handlers=[],
@@ -321,7 +425,14 @@ class RunManager(BaseRunManager):
         text: str,
         **kwargs: Any,
     ) -> Any:
-        """Run when text is received."""
+        """Run when text is received.
+
+        Args:
+            text (str): The received text.
+
+        Returns:
+            Any: The result of the callback.
+        """
         _handle_event(
             self.handlers,
             "on_text",
@@ -341,7 +452,14 @@ class AsyncRunManager(BaseRunManager):
         text: str,
         **kwargs: Any,
     ) -> Any:
-        """Run when text is received."""
+        """Run when text is received.
+
+        Args:
+            text (str): The received text.
+
+        Returns:
+            Any: The result of the callback.
+        """
         await _ahandle_event(
             self.handlers,
             "on_text",
|
||||
token: str,
|
||||
**kwargs: Any,
|
||||
) -> None:
|
||||
"""Run when LLM generates a new token."""
|
||||
"""Run when LLM generates a new token.
|
||||
|
||||
Args:
|
||||
token (str): The new token.
|
||||
"""
|
||||
_handle_event(
|
||||
self.handlers,
|
||||
"on_llm_new_token",
|
||||
@@ -373,7 +495,11 @@ class CallbackManagerForLLMRun(RunManager, LLMManagerMixin):
|
||||
)
|
||||
|
||||
def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
|
||||
"""Run when LLM ends running."""
|
||||
"""Run when LLM ends running.
|
||||
|
||||
Args:
|
||||
response (LLMResult): The LLM result.
|
||||
"""
|
||||
_handle_event(
|
||||
self.handlers,
|
||||
"on_llm_end",
|
||||
@@ -389,7 +515,11 @@ class CallbackManagerForLLMRun(RunManager, LLMManagerMixin):
|
||||
error: Union[Exception, KeyboardInterrupt],
|
||||
**kwargs: Any,
|
||||
) -> None:
|
||||
"""Run when LLM errors."""
|
||||
"""Run when LLM errors.
|
||||
|
||||
Args:
|
||||
error (Exception or KeyboardInterrupt): The error.
|
||||
"""
|
||||
_handle_event(
|
||||
self.handlers,
|
||||
"on_llm_error",
|
||||
@@ -409,7 +539,11 @@ class AsyncCallbackManagerForLLMRun(AsyncRunManager, LLMManagerMixin):
|
||||
token: str,
|
||||
**kwargs: Any,
|
||||
) -> None:
|
||||
"""Run when LLM generates a new token."""
|
||||
"""Run when LLM generates a new token.
|
||||
|
||||
Args:
|
||||
token (str): The new token.
|
||||
"""
|
||||
await _ahandle_event(
|
||||
self.handlers,
|
||||
"on_llm_new_token",
|
||||
@@ -421,7 +555,11 @@ class AsyncCallbackManagerForLLMRun(AsyncRunManager, LLMManagerMixin):
|
||||
)
|
||||
|
||||
async def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
|
||||
"""Run when LLM ends running."""
|
||||
"""Run when LLM ends running.
|
||||
|
||||
Args:
|
||||
response (LLMResult): The LLM result.
|
||||
"""
|
||||
await _ahandle_event(
|
||||
self.handlers,
|
||||
"on_llm_end",
|
||||
@@ -437,7 +575,11 @@ class AsyncCallbackManagerForLLMRun(AsyncRunManager, LLMManagerMixin):
|
||||
error: Union[Exception, KeyboardInterrupt],
|
||||
**kwargs: Any,
|
||||
) -> None:
|
||||
"""Run when LLM errors."""
|
||||
"""Run when LLM errors.
|
||||
|
||||
Args:
|
||||
error (Exception or KeyboardInterrupt): The error.
|
||||
"""
|
||||
await _ahandle_event(
|
||||
self.handlers,
|
||||
"on_llm_error",
|
||||
@@ -453,7 +595,15 @@ class CallbackManagerForChainRun(RunManager, ChainManagerMixin):
     """Callback manager for chain run."""

     def get_child(self, tag: Optional[str] = None) -> CallbackManager:
-        """Get a child callback manager."""
+        """Get a child callback manager.
+
+        Args:
+            tag (str, optional): The tag for the child callback manager.
+                Defaults to None.
+
+        Returns:
+            CallbackManager: The child callback manager.
+        """
         manager = CallbackManager(handlers=[], parent_run_id=self.run_id)
         manager.set_handlers(self.inheritable_handlers)
         manager.add_tags(self.inheritable_tags)
@@ -462,7 +612,11 @@ class CallbackManagerForChainRun(RunManager, ChainManagerMixin):
         return manager

     def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
-        """Run when chain ends running."""
+        """Run when chain ends running.
+
+        Args:
+            outputs (Dict[str, Any]): The outputs of the chain.
+        """
         _handle_event(
             self.handlers,
             "on_chain_end",
@@ -478,7 +632,11 @@ class CallbackManagerForChainRun(RunManager, ChainManagerMixin):
         error: Union[Exception, KeyboardInterrupt],
         **kwargs: Any,
     ) -> None:
-        """Run when chain errors."""
+        """Run when chain errors.
+
+        Args:
+            error (Exception or KeyboardInterrupt): The error.
+        """
         _handle_event(
             self.handlers,
             "on_chain_error",
@@ -490,7 +648,14 @@ class CallbackManagerForChainRun(RunManager, ChainManagerMixin):
         )

     def on_agent_action(self, action: AgentAction, **kwargs: Any) -> Any:
-        """Run when agent action is received."""
+        """Run when agent action is received.
+
+        Args:
+            action (AgentAction): The agent action.
+
+        Returns:
+            Any: The result of the callback.
+        """
         _handle_event(
             self.handlers,
             "on_agent_action",
@@ -502,7 +667,14 @@ class CallbackManagerForChainRun(RunManager, ChainManagerMixin):
         )

     def on_agent_finish(self, finish: AgentFinish, **kwargs: Any) -> Any:
-        """Run when agent finish is received."""
+        """Run when agent finish is received.
+
+        Args:
+            finish (AgentFinish): The agent finish.
+
+        Returns:
+            Any: The result of the callback.
+        """
         _handle_event(
             self.handlers,
             "on_agent_finish",
@@ -518,7 +690,15 @@ class AsyncCallbackManagerForChainRun(AsyncRunManager, ChainManagerMixin):
     """Async callback manager for chain run."""

     def get_child(self, tag: Optional[str] = None) -> AsyncCallbackManager:
-        """Get a child callback manager."""
+        """Get a child callback manager.
+
+        Args:
+            tag (str, optional): The tag for the child callback manager.
+                Defaults to None.
+
+        Returns:
+            AsyncCallbackManager: The child callback manager.
+        """
         manager = AsyncCallbackManager(handlers=[], parent_run_id=self.run_id)
         manager.set_handlers(self.inheritable_handlers)
         manager.add_tags(self.inheritable_tags)
@@ -527,7 +707,11 @@ class AsyncCallbackManagerForChainRun(AsyncRunManager, ChainManagerMixin):
         return manager

     async def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
-        """Run when chain ends running."""
+        """Run when chain ends running.
+
+        Args:
+            outputs (Dict[str, Any]): The outputs of the chain.
+        """
         await _ahandle_event(
             self.handlers,
             "on_chain_end",
@@ -543,7 +727,11 @@ class AsyncCallbackManagerForChainRun(AsyncRunManager, ChainManagerMixin):
         error: Union[Exception, KeyboardInterrupt],
         **kwargs: Any,
     ) -> None:
-        """Run when chain errors."""
+        """Run when chain errors.
+
+        Args:
+            error (Exception or KeyboardInterrupt): The error.
+        """
         await _ahandle_event(
             self.handlers,
             "on_chain_error",
@@ -555,7 +743,14 @@ class AsyncCallbackManagerForChainRun(AsyncRunManager, ChainManagerMixin):
         )

     async def on_agent_action(self, action: AgentAction, **kwargs: Any) -> Any:
-        """Run when agent action is received."""
+        """Run when agent action is received.
+
+        Args:
+            action (AgentAction): The agent action.
+
+        Returns:
+            Any: The result of the callback.
+        """
         await _ahandle_event(
             self.handlers,
             "on_agent_action",
@@ -567,7 +762,14 @@ class AsyncCallbackManagerForChainRun(AsyncRunManager, ChainManagerMixin):
         )

     async def on_agent_finish(self, finish: AgentFinish, **kwargs: Any) -> Any:
-        """Run when agent finish is received."""
+        """Run when agent finish is received.
+
+        Args:
+            finish (AgentFinish): The agent finish.
+
+        Returns:
+            Any: The result of the callback.
+        """
         await _ahandle_event(
             self.handlers,
             "on_agent_finish",
@@ -583,7 +785,15 @@ class CallbackManagerForToolRun(RunManager, ToolManagerMixin):
     """Callback manager for tool run."""

     def get_child(self, tag: Optional[str] = None) -> CallbackManager:
-        """Get a child callback manager."""
+        """Get a child callback manager.
+
+        Args:
+            tag (str, optional): The tag for the child callback manager.
+                Defaults to None.
+
+        Returns:
+            CallbackManager: The child callback manager.
+        """
         manager = CallbackManager(handlers=[], parent_run_id=self.run_id)
         manager.set_handlers(self.inheritable_handlers)
         manager.add_tags(self.inheritable_tags)
@@ -596,7 +806,11 @@ class CallbackManagerForToolRun(RunManager, ToolManagerMixin):
         output: str,
         **kwargs: Any,
     ) -> None:
-        """Run when tool ends running."""
+        """Run when tool ends running.
+
+        Args:
+            output (str): The output of the tool.
+        """
         _handle_event(
             self.handlers,
             "on_tool_end",
@@ -612,7 +826,11 @@ class CallbackManagerForToolRun(RunManager, ToolManagerMixin):
         error: Union[Exception, KeyboardInterrupt],
         **kwargs: Any,
     ) -> None:
-        """Run when tool errors."""
+        """Run when tool errors.
+
+        Args:
+            error (Exception or KeyboardInterrupt): The error.
+        """
         _handle_event(
             self.handlers,
             "on_tool_error",
@@ -628,7 +846,15 @@ class AsyncCallbackManagerForToolRun(AsyncRunManager, ToolManagerMixin):
     """Async callback manager for tool run."""

     def get_child(self, tag: Optional[str] = None) -> AsyncCallbackManager:
-        """Get a child callback manager."""
+        """Get a child callback manager.
+
+        Args:
+            tag (str, optional): The tag to add to the child
+                callback manager. Defaults to None.
+
+        Returns:
+            AsyncCallbackManager: The child callback manager.
+        """
         manager = AsyncCallbackManager(handlers=[], parent_run_id=self.run_id)
         manager.set_handlers(self.inheritable_handlers)
         manager.add_tags(self.inheritable_tags)
@@ -637,7 +863,11 @@ class AsyncCallbackManagerForToolRun(AsyncRunManager, ToolManagerMixin):
         return manager

     async def on_tool_end(self, output: str, **kwargs: Any) -> None:
-        """Run when tool ends running."""
+        """Run when tool ends running.
+
+        Args:
+            output (str): The output of the tool.
+        """
         await _ahandle_event(
             self.handlers,
             "on_tool_end",
@@ -653,7 +883,11 @@ class AsyncCallbackManagerForToolRun(AsyncRunManager, ToolManagerMixin):
         error: Union[Exception, KeyboardInterrupt],
         **kwargs: Any,
     ) -> None:
-        """Run when tool errors."""
+        """Run when tool errors.
+
+        Args:
+            error (Exception or KeyboardInterrupt): The error.
+        """
         await _ahandle_event(
             self.handlers,
             "on_tool_error",
@@ -672,66 +906,92 @@ class CallbackManager(BaseCallbackManager):
         self,
         serialized: Dict[str, Any],
         prompts: List[str],
         run_id: Optional[UUID] = None,
         **kwargs: Any,
-    ) -> CallbackManagerForLLMRun:
-        """Run when LLM starts running."""
-        if run_id is None:
-            run_id = uuid4()
-
-        _handle_event(
-            self.handlers,
-            "on_llm_start",
-            "ignore_llm",
-            serialized,
-            prompts,
-            run_id=run_id,
-            parent_run_id=self.parent_run_id,
-            tags=self.tags,
-            **kwargs,
-        )
-
-        return CallbackManagerForLLMRun(
-            run_id=run_id,
-            handlers=self.handlers,
-            inheritable_handlers=self.inheritable_handlers,
-            parent_run_id=self.parent_run_id,
-            tags=self.tags,
-            inheritable_tags=self.inheritable_tags,
-        )
+    ) -> List[CallbackManagerForLLMRun]:
+        """Run when LLM starts running.
+
+        Args:
+            serialized (Dict[str, Any]): The serialized LLM.
+            prompts (List[str]): The list of prompts.
+            run_id (UUID, optional): The ID of the run. Defaults to None.
+
+        Returns:
+            List[CallbackManagerForLLMRun]: A callback manager for each
+                prompt as an LLM run.
+        """
+        managers = []
+        for prompt in prompts:
+            run_id_ = uuid4()
+            _handle_event(
+                self.handlers,
+                "on_llm_start",
+                "ignore_llm",
+                serialized,
+                [prompt],
+                run_id=run_id_,
+                parent_run_id=self.parent_run_id,
+                tags=self.tags,
+                **kwargs,
+            )
+
+            managers.append(
+                CallbackManagerForLLMRun(
+                    run_id=run_id_,
+                    handlers=self.handlers,
+                    inheritable_handlers=self.inheritable_handlers,
+                    parent_run_id=self.parent_run_id,
+                    tags=self.tags,
+                    inheritable_tags=self.inheritable_tags,
+                )
+            )
+
+        return managers

     def on_chat_model_start(
         self,
         serialized: Dict[str, Any],
         messages: List[List[BaseMessage]],
         run_id: Optional[UUID] = None,
         **kwargs: Any,
-    ) -> CallbackManagerForLLMRun:
-        """Run when LLM starts running."""
-        if run_id is None:
-            run_id = uuid4()
-        _handle_event(
-            self.handlers,
-            "on_chat_model_start",
-            "ignore_chat_model",
-            serialized,
-            messages,
-            run_id=run_id,
-            parent_run_id=self.parent_run_id,
-            tags=self.tags,
-            **kwargs,
-        )
-
-        # Re-use the LLM Run Manager since the outputs are treated
-        # the same for now
-        return CallbackManagerForLLMRun(
-            run_id=run_id,
-            handlers=self.handlers,
-            inheritable_handlers=self.inheritable_handlers,
-            parent_run_id=self.parent_run_id,
-            tags=self.tags,
-            inheritable_tags=self.inheritable_tags,
-        )
+    ) -> List[CallbackManagerForLLMRun]:
+        """Run when LLM starts running.
+
+        Args:
+            serialized (Dict[str, Any]): The serialized LLM.
+            messages (List[List[BaseMessage]]): The list of messages.
+            run_id (UUID, optional): The ID of the run. Defaults to None.
+
+        Returns:
+            List[CallbackManagerForLLMRun]: A callback manager for each
+                list of messages as an LLM run.
+        """
+
+        managers = []
+        for message_list in messages:
+            run_id_ = uuid4()
+            _handle_event(
+                self.handlers,
+                "on_chat_model_start",
+                "ignore_chat_model",
+                serialized,
+                [message_list],
+                run_id=run_id_,
+                parent_run_id=self.parent_run_id,
+                tags=self.tags,
+                **kwargs,
+            )
+
+            managers.append(
+                CallbackManagerForLLMRun(
+                    run_id=run_id_,
+                    handlers=self.handlers,
+                    inheritable_handlers=self.inheritable_handlers,
+                    parent_run_id=self.parent_run_id,
+                    tags=self.tags,
+                    inheritable_tags=self.inheritable_tags,
+                )
+            )
+
+        return managers

     def on_chain_start(
         self,
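The restructuring above is the core behavioral change of this branch: instead of one run manager for a whole batch of prompts, `on_llm_start` now mints a fresh `run_id` per prompt and returns one manager per prompt. A minimal sketch of the new shape (plain dicts stand in for the real manager objects):

```python
# Sketch of the new per-prompt run managers: one run_id per prompt.
from uuid import uuid4

def on_llm_start(prompts):
    managers = []
    for prompt in prompts:
        # Each prompt becomes its own traced LLM run.
        managers.append({"run_id": uuid4(), "prompt": prompt})
    return managers

ms = on_llm_start(["a", "b"])
assert len(ms) == 2
assert ms[0]["run_id"] != ms[1]["run_id"]  # runs are independent
```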
@@ -740,7 +1000,16 @@ class CallbackManager(BaseCallbackManager):
         run_id: Optional[UUID] = None,
         **kwargs: Any,
     ) -> CallbackManagerForChainRun:
-        """Run when chain starts running."""
+        """Run when chain starts running.
+
+        Args:
+            serialized (Dict[str, Any]): The serialized chain.
+            inputs (Dict[str, Any]): The inputs to the chain.
+            run_id (UUID, optional): The ID of the run. Defaults to None.
+
+        Returns:
+            CallbackManagerForChainRun: The callback manager for the chain run.
+        """
         if run_id is None:
             run_id = uuid4()

@@ -773,7 +1042,17 @@ class CallbackManager(BaseCallbackManager):
         parent_run_id: Optional[UUID] = None,
         **kwargs: Any,
     ) -> CallbackManagerForToolRun:
-        """Run when tool starts running."""
+        """Run when tool starts running.
+
+        Args:
+            serialized (Dict[str, Any]): The serialized tool.
+            input_str (str): The input to the tool.
+            run_id (UUID, optional): The ID of the run. Defaults to None.
+            parent_run_id (UUID, optional): The ID of the parent run. Defaults to None.
+
+        Returns:
+            CallbackManagerForToolRun: The callback manager for the tool run.
+        """
         if run_id is None:
             run_id = uuid4()

@@ -807,7 +1086,22 @@ class CallbackManager(BaseCallbackManager):
         inheritable_tags: Optional[List[str]] = None,
         local_tags: Optional[List[str]] = None,
     ) -> CallbackManager:
-        """Configure the callback manager."""
+        """Configure the callback manager.
+
+        Args:
+            inheritable_callbacks (Optional[Callbacks], optional): The inheritable
+                callbacks. Defaults to None.
+            local_callbacks (Optional[Callbacks], optional): The local callbacks.
+                Defaults to None.
+            verbose (bool, optional): Whether to enable verbose mode. Defaults to False.
+            inheritable_tags (Optional[List[str]], optional): The inheritable tags.
+                Defaults to None.
+            local_tags (Optional[List[str]], optional): The local tags.
+                Defaults to None.
+
+        Returns:
+            CallbackManager: The configured callback manager.
+        """
        return _configure(
            cls,
            inheritable_callbacks,
@@ -830,64 +1124,107 @@ class AsyncCallbackManager(BaseCallbackManager):
        self,
        serialized: Dict[str, Any],
        prompts: List[str],
        run_id: Optional[UUID] = None,
        **kwargs: Any,
    ) -> AsyncCallbackManagerForLLMRun:
        """Run when LLM starts running."""
        if run_id is None:
            run_id = uuid4()
    ) -> List[AsyncCallbackManagerForLLMRun]:
        """Run when LLM starts running.

        await _ahandle_event(
            self.handlers,
            "on_llm_start",
            "ignore_llm",
            serialized,
            prompts,
            run_id=run_id,
            parent_run_id=self.parent_run_id,
            tags=self.tags,
            **kwargs,
        )
        Args:
            serialized (Dict[str, Any]): The serialized LLM.
            prompts (List[str]): The list of prompts.
            run_id (UUID, optional): The ID of the run. Defaults to None.

        return AsyncCallbackManagerForLLMRun(
            run_id=run_id,
            handlers=self.handlers,
            inheritable_handlers=self.inheritable_handlers,
            parent_run_id=self.parent_run_id,
            tags=self.tags,
            inheritable_tags=self.inheritable_tags,
        )
        Returns:
            List[AsyncCallbackManagerForLLMRun]: The list of async
                callback managers, one for each LLM Run corresponding
                to each prompt.
        """

        tasks = []
        managers = []

        for prompt in prompts:
            run_id_ = uuid4()

            tasks.append(
                _ahandle_event(
                    self.handlers,
                    "on_llm_start",
                    "ignore_llm",
                    serialized,
                    [prompt],
                    run_id=run_id_,
                    parent_run_id=self.parent_run_id,
                    tags=self.tags,
                    **kwargs,
                )
            )

            managers.append(
                AsyncCallbackManagerForLLMRun(
                    run_id=run_id_,
                    handlers=self.handlers,
                    inheritable_handlers=self.inheritable_handlers,
                    parent_run_id=self.parent_run_id,
                    tags=self.tags,
                    inheritable_tags=self.inheritable_tags,
                )
            )

        await asyncio.gather(*tasks)

        return managers

    async def on_chat_model_start(
        self,
        serialized: Dict[str, Any],
        messages: List[List[BaseMessage]],
        run_id: Optional[UUID] = None,
        **kwargs: Any,
    ) -> Any:
        if run_id is None:
            run_id = uuid4()
        """Run when LLM starts running.

        await _ahandle_event(
            self.handlers,
            "on_chat_model_start",
            "ignore_chat_model",
            serialized,
            messages,
            run_id=run_id,
            parent_run_id=self.parent_run_id,
            tags=self.tags,
            **kwargs,
        )
        Args:
            serialized (Dict[str, Any]): The serialized LLM.
            messages (List[List[BaseMessage]]): The list of messages.
            run_id (UUID, optional): The ID of the run. Defaults to None.

        return AsyncCallbackManagerForLLMRun(
            run_id=run_id,
            handlers=self.handlers,
            inheritable_handlers=self.inheritable_handlers,
            parent_run_id=self.parent_run_id,
            tags=self.tags,
            inheritable_tags=self.inheritable_tags,
        )
        Returns:
            List[AsyncCallbackManagerForLLMRun]: The list of
                async callback managers, one for each LLM Run
                corresponding to each inner message list.
        """
        tasks = []
        managers = []

        for message_list in messages:
            run_id_ = uuid4()

            tasks.append(
                _ahandle_event(
                    self.handlers,
                    "on_chat_model_start",
                    "ignore_chat_model",
                    serialized,
                    [message_list],
                    run_id=run_id_,
                    parent_run_id=self.parent_run_id,
                    tags=self.tags,
                    **kwargs,
                )
            )

            managers.append(
                AsyncCallbackManagerForLLMRun(
                    run_id=run_id_,
                    handlers=self.handlers,
                    inheritable_handlers=self.inheritable_handlers,
                    parent_run_id=self.parent_run_id,
                    tags=self.tags,
                    inheritable_tags=self.inheritable_tags,
                )
            )

        await asyncio.gather(*tasks)
        return managers

    async def on_chain_start(
        self,
@@ -896,7 +1233,17 @@ class AsyncCallbackManager(BaseCallbackManager):
        run_id: Optional[UUID] = None,
        **kwargs: Any,
    ) -> AsyncCallbackManagerForChainRun:
        """Run when chain starts running."""
        """Run when chain starts running.

        Args:
            serialized (Dict[str, Any]): The serialized chain.
            inputs (Dict[str, Any]): The inputs to the chain.
            run_id (UUID, optional): The ID of the run. Defaults to None.

        Returns:
            AsyncCallbackManagerForChainRun: The async callback manager
                for the chain run.
        """
        if run_id is None:
            run_id = uuid4()

@@ -929,7 +1276,19 @@ class AsyncCallbackManager(BaseCallbackManager):
        parent_run_id: Optional[UUID] = None,
        **kwargs: Any,
    ) -> AsyncCallbackManagerForToolRun:
        """Run when tool starts running."""
        """Run when tool starts running.

        Args:
            serialized (Dict[str, Any]): The serialized tool.
            input_str (str): The input to the tool.
            run_id (UUID, optional): The ID of the run. Defaults to None.
            parent_run_id (UUID, optional): The ID of the parent run.
                Defaults to None.

        Returns:
            AsyncCallbackManagerForToolRun: The async callback manager
                for the tool run.
        """
        if run_id is None:
            run_id = uuid4()

@@ -963,7 +1322,22 @@ class AsyncCallbackManager(BaseCallbackManager):
        inheritable_tags: Optional[List[str]] = None,
        local_tags: Optional[List[str]] = None,
    ) -> AsyncCallbackManager:
        """Configure the callback manager."""
        """Configure the async callback manager.

        Args:
            inheritable_callbacks (Optional[Callbacks], optional): The inheritable
                callbacks. Defaults to None.
            local_callbacks (Optional[Callbacks], optional): The local callbacks.
                Defaults to None.
            verbose (bool, optional): Whether to enable verbose mode. Defaults to False.
            inheritable_tags (Optional[List[str]], optional): The inheritable tags.
                Defaults to None.
            local_tags (Optional[List[str]], optional): The local tags.
                Defaults to None.

        Returns:
            AsyncCallbackManager: The configured async callback manager.
        """
        return _configure(
            cls,
            inheritable_callbacks,
@@ -978,7 +1352,14 @@ T = TypeVar("T", CallbackManager, AsyncCallbackManager)


def env_var_is_set(env_var: str) -> bool:
    """Check if an environment variable is set."""
    """Check if an environment variable is set.

    Args:
        env_var (str): The name of the environment variable.

    Returns:
        bool: True if the environment variable is set, False otherwise.
    """
    return env_var in os.environ and os.environ[env_var] not in (
        "",
        "0",
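The tuple of "effectively unset" values is cut off in the hunk above; a minimal runnable sketch of the same check, keeping only the two sentinels the diff actually shows:

```python
import os


def env_var_is_set(env_var: str) -> bool:
    # Mirrors the helper above: the variable must exist and must not hold
    # one of the "empty-ish" sentinel values. Only the two sentinels visible
    # in the hunk are included here; the real tuple continues past them.
    return env_var in os.environ and os.environ[env_var] not in (
        "",
        "0",
    )


os.environ["GR_DEMO_FLAG"] = "1"
print(env_var_is_set("GR_DEMO_FLAG"))  # True
os.environ["GR_DEMO_FLAG"] = "0"
print(env_var_is_set("GR_DEMO_FLAG"))  # False
```

The `GR_DEMO_FLAG` variable name is invented for the demonstration; an unset variable also returns `False`.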
@@ -995,7 +1376,22 @@ def _configure(
    inheritable_tags: Optional[List[str]] = None,
    local_tags: Optional[List[str]] = None,
) -> T:
    """Configure the callback manager."""
    """Configure the callback manager.

    Args:
        callback_manager_cls (Type[T]): The callback manager class.
        inheritable_callbacks (Optional[Callbacks], optional): The inheritable
            callbacks. Defaults to None.
        local_callbacks (Optional[Callbacks], optional): The local callbacks.
            Defaults to None.
        verbose (bool, optional): Whether to enable verbose mode. Defaults to False.
        inheritable_tags (Optional[List[str]], optional): The inheritable tags.
            Defaults to None.
        local_tags (Optional[List[str]], optional): The local tags. Defaults to None.

    Returns:
        T: The configured callback manager.
    """
    callback_manager = callback_manager_cls(handlers=[])
    if inheritable_callbacks or local_callbacks:
        if isinstance(inheritable_callbacks, list) or inheritable_callbacks is None:

@@ -118,7 +118,7 @@ class MlflowLogger:
    Parameters:
        name (str): Name of the run.
        experiment (str): Name of the experiment.
        tags (str): Tags to be attached for the run.
        tags (dict): Tags to be attached for the run.
        tracking_uri (str): MLflow tracking server uri.

    This handler implements the helper functions to initialize,
@@ -223,7 +223,7 @@ class MlflowCallbackHandler(BaseMetadataCallbackHandler, BaseCallbackHandler):
    Parameters:
        name (str): Name of the run.
        experiment (str): Name of the experiment.
        tags (str): Tags to be attached for the run.
        tags (dict): Tags to be attached for the run.
        tracking_uri (str): MLflow tracking server uri.

    This handler will utilize the associated callback method called and formats

@@ -1,8 +1,8 @@
"""Callback Handler that prints to std out."""
from typing import Any, Dict, List, Optional, Union
from typing import Any, Dict, List

from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema import AgentAction, AgentFinish, LLMResult
from langchain.schema import LLMResult

MODEL_COST_PER_1K_TOKENS = {
    # GPT-4 input
@@ -32,6 +32,7 @@ MODEL_COST_PER_1K_TOKENS = {
    "gpt-3.5-turbo-16k-completion": 0.004,
    "gpt-3.5-turbo-16k-0613-completion": 0.004,
    # Others
    "gpt-35-turbo": 0.002,  # Azure OpenAI version of ChatGPT
    "text-ada-001": 0.0004,
    "ada": 0.0004,
    "text-babbage-001": 0.0005,
@@ -152,64 +153,6 @@ class OpenAICallbackHandler(BaseCallbackHandler):
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens

    def on_llm_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing."""
        pass

    def on_chain_start(
        self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
    ) -> None:
        """Print out that we are entering a chain."""
        pass

    def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> None:
        """Print out that we finished a chain."""
        pass

    def on_chain_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing."""
        pass

    def on_tool_start(
        self,
        serialized: Dict[str, Any],
        input_str: str,
        **kwargs: Any,
    ) -> None:
        """Print out the log in specified color."""
        pass

    def on_tool_end(
        self,
        output: str,
        color: Optional[str] = None,
        observation_prefix: Optional[str] = None,
        llm_prefix: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        """If not the final action, print out observation."""
        pass

    def on_tool_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> None:
        """Do nothing."""
        pass

    def on_agent_action(self, action: AgentAction, **kwargs: Any) -> Any:
        """Run on agent action."""
        pass

    def on_agent_finish(
        self, finish: AgentFinish, color: Optional[str] = None, **kwargs: Any
    ) -> None:
        """Run on agent end."""
        pass

    def __copy__(self) -> "OpenAICallbackHandler":
        """Return a copy of the callback handler."""
        return self

@@ -8,7 +8,7 @@ from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Union
from uuid import UUID

from langchainplus_sdk import LangChainPlusClient
from langsmith import Client as LangSmithClient

from langchain.callbacks.tracers.base import BaseTracer
from langchain.callbacks.tracers.schemas import (
@@ -46,7 +46,7 @@ class LangChainTracer(BaseTracer):
        self,
        example_id: Optional[Union[UUID, str]] = None,
        project_name: Optional[str] = None,
        client: Optional[LangChainPlusClient] = None,
        client: Optional[LangSmithClient] = None,
        **kwargs: Any,
    ) -> None:
        """Initialize the LangChain tracer."""
@@ -60,7 +60,7 @@ class LangChainTracer(BaseTracer):
        )
        # set max_workers to 1 to process tasks in order
        self.executor = ThreadPoolExecutor(max_workers=1)
        self.client = client or LangChainPlusClient()
        self.client = client or LangSmithClient()
        self._futures: Set[Future] = set()
        global _TRACERS
        _TRACERS.append(self)

@@ -5,8 +5,8 @@ import datetime
from typing import Any, Dict, List, Optional
from uuid import UUID

from langchainplus_sdk.schemas import RunBase as BaseRunV2
from langchainplus_sdk.schemas import RunTypeEnum
from langsmith.schemas import RunBase as BaseRunV2
from langsmith.schemas import RunTypeEnum
from pydantic import BaseModel, Field, root_validator

from langchain.schema import LLMResult

@@ -101,26 +101,37 @@ class BaseChatModel(BaseLanguageModel, ABC):
            tags,
            self.tags,
        )
        run_manager = callback_manager.on_chat_model_start(
        run_managers = callback_manager.on_chat_model_start(
            dumpd(self), messages, invocation_params=params, options=options
        )

        try:
            results = [
                self._generate_with_cache(
                    m, stop=stop, run_manager=run_manager, **kwargs
        results = []
        for i, m in enumerate(messages):
            try:
                results.append(
                    self._generate_with_cache(
                        m,
                        stop=stop,
                        run_manager=run_managers[i] if run_managers else None,
                        **kwargs,
                    )
                )
                for m in messages
            ]
        except (KeyboardInterrupt, Exception) as e:
            run_manager.on_llm_error(e)
            raise e
            except (KeyboardInterrupt, Exception) as e:
                if run_managers:
                    run_managers[i].on_llm_error(e)
                raise e
        flattened_outputs = [
            LLMResult(generations=[res.generations], llm_output=res.llm_output)
            for res in results
        ]
        llm_output = self._combine_llm_outputs([res.llm_output for res in results])
        generations = [res.generations for res in results]
        output = LLMResult(generations=generations, llm_output=llm_output)
        run_manager.on_llm_end(output)
        if run_manager:
            output.run = RunInfo(run_id=run_manager.run_id)
        if run_managers:
            run_infos = []
            for manager, flattened_output in zip(run_managers, flattened_outputs):
                manager.on_llm_end(flattened_output)
                run_infos.append(RunInfo(run_id=manager.run_id))
            output.run = run_infos
        return output

    async def agenerate(
@@ -143,28 +154,62 @@ class BaseChatModel(BaseLanguageModel, ABC):
            tags,
            self.tags,
        )
        run_manager = await callback_manager.on_chat_model_start(

        run_managers = await callback_manager.on_chat_model_start(
            dumpd(self), messages, invocation_params=params, options=options
        )

        try:
            results = await asyncio.gather(
                *[
                    self._agenerate_with_cache(
                        m, stop=stop, run_manager=run_manager, **kwargs
                    )
                    for m in messages
                ]
            )
        except (KeyboardInterrupt, Exception) as e:
            await run_manager.on_llm_error(e)
            raise e
        results = await asyncio.gather(
            *[
                self._agenerate_with_cache(
                    m,
                    stop=stop,
                    run_manager=run_managers[i] if run_managers else None,
                    **kwargs,
                )
                for i, m in enumerate(messages)
            ],
            return_exceptions=True,
        )
        exceptions = []
        for i, res in enumerate(results):
            if isinstance(res, Exception):
                if run_managers:
                    await run_managers[i].on_llm_error(res)
                exceptions.append(res)
        if exceptions:
            if run_managers:
                await asyncio.gather(
                    *[
                        run_manager.on_llm_end(
                            LLMResult(
                                generations=[res.generations], llm_output=res.llm_output
                            )
                        )
                        for run_manager, res in zip(run_managers, results)
                        if not isinstance(res, Exception)
                    ]
                )
            raise exceptions[0]
        flattened_outputs = [
            LLMResult(generations=[res.generations], llm_output=res.llm_output)
            for res in results
        ]
        llm_output = self._combine_llm_outputs([res.llm_output for res in results])
        generations = [res.generations for res in results]
        output = LLMResult(generations=generations, llm_output=llm_output)
        await run_manager.on_llm_end(output)
        if run_manager:
            output.run = RunInfo(run_id=run_manager.run_id)
        await asyncio.gather(
            *[
                run_manager.on_llm_end(flattened_output)
                for run_manager, flattened_output in zip(
                    run_managers, flattened_outputs
                )
            ]
        )
        if run_managers:
            output.run = [
                RunInfo(run_id=run_manager.run_id) for run_manager in run_managers
            ]
        return output
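The rewritten `agenerate` hunk above gathers one coroutine per prompt with `return_exceptions=True`, then inspects the settled results and re-raises the first failure only after per-task bookkeeping. That control flow can be sketched in isolation with plain asyncio (the `work` coroutine below is invented for illustration):

```python
import asyncio


async def work(i: int) -> int:
    # Stand-in for one per-prompt generation task.
    if i == 2:
        raise ValueError("boom")
    return i * 10


async def main() -> list:
    # Gather everything; failures come back as Exception objects instead of
    # cancelling the sibling tasks mid-flight.
    results = await asyncio.gather(
        *[work(i) for i in range(4)], return_exceptions=True
    )
    exceptions = [r for r in results if isinstance(r, Exception)]
    if exceptions:
        # In the real method, successful runs still get their on_llm_end
        # callbacks here before the first exception is surfaced.
        raise exceptions[0]
    return results


try:
    asyncio.run(main())
except ValueError as e:
    print("caught:", e)  # caught: boom
```

The design choice mirrored here: letting all tasks settle first means each run manager can be notified of its own success or failure before the caller sees any error.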
    def generate_prompt(

@@ -184,6 +184,16 @@ class ChatOpenAI(BaseChatModel):
    """Number of chat completions to generate for each prompt."""
    max_tokens: Optional[int] = None
    """Maximum number of tokens to generate."""
    tiktoken_model_name: Optional[str] = None
    """The model name to pass to tiktoken when using this class.
    Tiktoken is used to count the number of tokens in documents to constrain
    them to be under a certain limit. By default, when set to None, this will
    be the same as the embedding model name. However, there are some cases
    where you may want to use this Embedding class with a model name not
    supported by tiktoken. This can include when using Azure embeddings or
    when using one of the many model providers that expose an OpenAI-like
    API but with different models. In those cases, in order to avoid erroring
    when tiktoken is called, you can specify a model name to use here."""

    class Config:
        """Configuration for this pydantic object."""
@@ -448,15 +458,18 @@ class ChatOpenAI(BaseChatModel):

    def _get_encoding_model(self) -> Tuple[str, tiktoken.Encoding]:
        tiktoken_ = _import_tiktoken()
        model = self.model_name
        if model == "gpt-3.5-turbo":
            # gpt-3.5-turbo may change over time.
            # Returning num tokens assuming gpt-3.5-turbo-0301.
            model = "gpt-3.5-turbo-0301"
        elif model == "gpt-4":
            # gpt-4 may change over time.
            # Returning num tokens assuming gpt-4-0314.
            model = "gpt-4-0314"
        if self.tiktoken_model_name is not None:
            model = self.tiktoken_model_name
        else:
            model = self.model_name
            if model == "gpt-3.5-turbo":
                # gpt-3.5-turbo may change over time.
                # Returning num tokens assuming gpt-3.5-turbo-0301.
                model = "gpt-3.5-turbo-0301"
            elif model == "gpt-4":
                # gpt-4 may change over time.
                # Returning num tokens assuming gpt-4-0314.
                model = "gpt-4-0314"
        # Returns the number of tokens used by a list of messages.
        try:
            encoding = tiktoken_.encoding_for_model(model)

@@ -16,8 +16,8 @@ from typing import (
    Union,
)

from langchainplus_sdk import LangChainPlusClient
from langchainplus_sdk.schemas import Example
from langsmith import Client as LangSmithClient
from langsmith.schemas import Example

from langchain.base_language import BaseLanguageModel
from langchain.callbacks.base import BaseCallbackHandler
@@ -448,7 +448,7 @@ async def arun_on_dataset(
    num_repetitions: int = 1,
    project_name: Optional[str] = None,
    verbose: bool = False,
    client: Optional[LangChainPlusClient] = None,
    client: Optional[LangSmithClient] = None,
    tags: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """
@@ -474,7 +474,7 @@ async def arun_on_dataset(
    Returns:
        A dictionary containing the run's project name and the resulting model outputs.
    """
    client_ = client or LangChainPlusClient()
    client_ = client or LangSmithClient()
    project_name = _get_project_name(project_name, llm_or_chain_factory, dataset_name)
    dataset = client_.read_dataset(dataset_name=dataset_name)
    examples = client_.list_examples(dataset_id=str(dataset.id))
@@ -501,7 +501,7 @@ def run_on_dataset(
    num_repetitions: int = 1,
    project_name: Optional[str] = None,
    verbose: bool = False,
    client: Optional[LangChainPlusClient] = None,
    client: Optional[LangSmithClient] = None,
    tags: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Run the chain on a dataset and store traces to the specified project name.
@@ -525,7 +525,7 @@ def run_on_dataset(
    Returns:
        A dictionary containing the run's project name and the resulting model outputs.
    """
    client_ = client or LangChainPlusClient()
    client_ = client or LangSmithClient()
    project_name = _get_project_name(project_name, llm_or_chain_factory, dataset_name)
    dataset = client_.read_dataset(dataset_name=dataset_name)
    examples = client_.list_examples(dataset_id=str(dataset.id))

@@ -68,6 +68,7 @@ from langchain.document_loaders.mastodon import MastodonTootsLoader
from langchain.document_loaders.max_compute import MaxComputeLoader
from langchain.document_loaders.mediawikidump import MWDumpLoader
from langchain.document_loaders.merge import MergedDataLoader
from langchain.document_loaders.mhtml import MHTMLLoader
from langchain.document_loaders.modern_treasury import ModernTreasuryLoader
from langchain.document_loaders.notebook import NotebookLoader
from langchain.document_loaders.notion import NotionDirectoryLoader
@@ -97,6 +98,7 @@ from langchain.document_loaders.readthedocs import ReadTheDocsLoader
from langchain.document_loaders.recursive_url_loader import RecusiveUrlLoader
from langchain.document_loaders.reddit import RedditPostsLoader
from langchain.document_loaders.roam import RoamLoader
from langchain.document_loaders.rst import UnstructuredRSTLoader
from langchain.document_loaders.rtf import UnstructuredRTFLoader
from langchain.document_loaders.s3_directory import S3DirectoryLoader
from langchain.document_loaders.s3_file import S3FileLoader
@@ -204,6 +206,7 @@ __all__ = [
    "MathpixPDFLoader",
    "MaxComputeLoader",
    "MergedDataLoader",
    "MHTMLLoader",
    "ModernTreasuryLoader",
    "NotebookLoader",
    "NotionDBLoader",
@@ -261,6 +264,7 @@ __all__ = [
    "UnstructuredODTLoader",
    "UnstructuredPDFLoader",
    "UnstructuredPowerPointLoader",
    "UnstructuredRSTLoader",
    "UnstructuredRTFLoader",
    "UnstructuredURLLoader",
    "UnstructuredWordDocumentLoader",

69  langchain/document_loaders/mhtml.py  Normal file
@@ -0,0 +1,69 @@
"""Loader to load MHTML files, enriching metadata with page title."""

import email
import logging
from typing import Dict, List, Union

from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader

logger = logging.getLogger(__name__)


class MHTMLLoader(BaseLoader):
    """Loader that uses beautiful soup to parse HTML files."""

    def __init__(
        self,
        file_path: str,
        open_encoding: Union[str, None] = None,
        bs_kwargs: Union[dict, None] = None,
        get_text_separator: str = "",
    ) -> None:
        """Initialise with path, and optionally, file encoding to use, and any kwargs
        to pass to the BeautifulSoup object."""
        try:
            import bs4  # noqa:F401
        except ImportError:
            raise ValueError(
                "beautifulsoup4 package not found, please install it with "
                "`pip install beautifulsoup4`"
            )

        self.file_path = file_path
        self.open_encoding = open_encoding
        if bs_kwargs is None:
            bs_kwargs = {"features": "lxml"}
        self.bs_kwargs = bs_kwargs
        self.get_text_separator = get_text_separator

    def load(self) -> List[Document]:
        from bs4 import BeautifulSoup

        """Load MHTML document into document objects."""

        with open(self.file_path, "r", encoding=self.open_encoding) as f:
            message = email.message_from_string(f.read())
            parts = message.get_payload()

            if type(parts) is not list:
                parts = [message]

            for part in parts:
                if part.get_content_type() == "text/html":
                    html = part.get_payload(decode=True).decode()

                    soup = BeautifulSoup(html, **self.bs_kwargs)
                    text = soup.get_text(self.get_text_separator)

                    if soup.title:
                        title = str(soup.title.string)
                    else:
                        title = ""

                    metadata: Dict[str, Union[str, None]] = {
                        "source": self.file_path,
                        "title": title,
                    }
                    return [Document(page_content=text, metadata=metadata)]
        return []
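The `load()` method above leans on the stdlib `email` module to peel the HTML part out of an MHTML envelope before handing it to BeautifulSoup. A dependency-free sketch of just that extraction step (the hand-built MIME message is invented for illustration, and the transfer-decoding step is simplified to a plain `get_payload()`):

```python
import email

# A tiny hand-built MIME message standing in for an .mht file on disk.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/related; boundary="BOUND"\n'
    "\n"
    "--BOUND\n"
    "Content-Type: text/html\n"
    "\n"
    "<html><head><title>Demo</title></head><body>Hello</body></html>\n"
    "--BOUND--\n"
)

message = email.message_from_string(raw)
parts = message.get_payload()
if not isinstance(parts, list):
    # Single-part messages return a string payload, so wrap the message
    # itself, exactly as the loader does.
    parts = [message]

html = next(
    part.get_payload()
    for part in parts
    if part.get_content_type() == "text/html"
)
print("<title>Demo</title>" in html)  # True
```

From there the loader's remaining work is plain HTML handling: extracting text and reading the `<title>` for the document metadata.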
@@ -48,13 +48,13 @@ class NotionDBLoader(BaseLoader):
        Returns:
            List[Document]: List of documents.
        """
        page_ids = self._retrieve_page_ids()
        page_summaries = self._retrieve_page_summaries()

        return list(self.load_page(page_id) for page_id in page_ids)
        return list(self.load_page(page_summary) for page_summary in page_summaries)

    def _retrieve_page_ids(
    def _retrieve_page_summaries(
        self, query_dict: Dict[str, Any] = {"page_size": 100}
    ) -> List[str]:
    ) -> List[Dict[str, Any]]:
        """Get all the pages from a Notion database."""
        pages: List[Dict[str, Any]] = []

@@ -72,18 +72,16 @@ class NotionDBLoader(BaseLoader):

        query_dict["start_cursor"] = data.get("next_cursor")

        page_ids = [page["id"] for page in pages]
        return pages

        return page_ids

    def load_page(self, page_id: str) -> Document:
    def load_page(self, page_summary: Dict[str, Any]) -> Document:
        """Read a page."""
        data = self._request(PAGE_URL.format(page_id=page_id))
        page_id = page_summary["id"]

        # load properties as metadata
        metadata: Dict[str, Any] = {}

        for prop_name, prop_data in data["properties"].items():
        for prop_name, prop_data in page_summary["properties"].items():
            prop_type = prop_data["type"]

            if prop_type == "rich_text":

22  langchain/document_loaders/rst.py  Normal file
@@ -0,0 +1,22 @@
"""Loader that loads RST files."""
from typing import Any, List

from langchain.document_loaders.unstructured import (
    UnstructuredFileLoader,
    validate_unstructured_version,
)


class UnstructuredRSTLoader(UnstructuredFileLoader):
    """Loader that uses unstructured to load RST files."""

    def __init__(
        self, file_path: str, mode: str = "single", **unstructured_kwargs: Any
    ):
        validate_unstructured_version(min_unstructured_version="0.7.5")
        super().__init__(file_path=file_path, mode=mode, **unstructured_kwargs)

    def _get_elements(self) -> List:
        from unstructured.partition.rst import partition_rst

        return partition_rst(filename=self.file_path, **self.unstructured_kwargs)
@@ -16,6 +16,7 @@ class UnstructuredURLLoader(BaseLoader):
        urls: List[str],
        continue_on_failure: bool = True,
        mode: str = "single",
        show_progress_bar: bool = False,
        **unstructured_kwargs: Any,
    ):
        """Initialize with file path."""
@@ -51,6 +52,7 @@ class UnstructuredURLLoader(BaseLoader):
        self.continue_on_failure = continue_on_failure
        self.headers = headers
        self.unstructured_kwargs = unstructured_kwargs
        self.show_progress_bar = show_progress_bar

    def _validate_mode(self, mode: str) -> None:
        _valid_modes = {"single", "elements"}
@@ -83,7 +85,21 @@ class UnstructuredURLLoader(BaseLoader):
        from unstructured.partition.html import partition_html

        docs: List[Document] = list()
        for url in self.urls:
        if self.show_progress_bar:
            try:
                from tqdm import tqdm
            except ImportError as e:
                raise ImportError(
                    "Package tqdm must be installed if show_progress_bar=True. "
                    "Please install with 'pip install tqdm' or set "
                    "show_progress_bar=False."
                ) from e

            urls = tqdm(self.urls)
        else:
            urls = self.urls

        for url in urls:
            try:
                if self.__is_non_html_available():
                    if self.__is_headers_available_for_non_html():

@@ -50,6 +50,9 @@ class WebBaseLoader(BaseLoader):
    requests_kwargs: Dict[str, Any] = {}
    """kwargs for requests"""

    bs_get_text_kwargs: Dict[str, Any] = {}
    """kwargs for beautifulsoup4 get_text"""

    def __init__(
        self,
        web_path: Union[str, List[str]],
@@ -201,7 +204,7 @@ class WebBaseLoader(BaseLoader):
        """Lazy load text from the url(s) in web_path."""
        for path in self.web_paths:
            soup = self._scrape(path)
            text = soup.get_text()
            text = soup.get_text(**self.bs_get_text_kwargs)
            metadata = _build_metadata(soup, path)
            yield Document(page_content=text, metadata=metadata)

@@ -216,7 +219,7 @@ class WebBaseLoader(BaseLoader):
        docs = []
        for i in range(len(results)):
            soup = results[i]
            text = soup.get_text()
            text = soup.get_text(**self.bs_get_text_kwargs)
            metadata = _build_metadata(soup, self.web_paths[i])
            docs.append(Document(page_content=text, metadata=metadata))

@@ -40,7 +40,7 @@ class WhatsAppChatLoader(BaseLoader):
            (?:
                :\d{2}
            )?
            (?:[ _](?:AM|PM))?
            (?:[\s_](?:AM|PM))?
        )
        \]?
        [\s-]*
@@ -50,7 +50,9 @@ class WhatsAppChatLoader(BaseLoader):
            (.+)
        """
        for line in lines:
            result = re.match(message_line_regex, line.strip(), flags=re.VERBOSE)
            result = re.match(
                message_line_regex, line.strip(), flags=re.VERBOSE | re.IGNORECASE
            )
            if result:
                date, sender, text = result.groups()
                text_content += concatenate_rows(date, sender, text)

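The two changes in the hunks above work together: `[\s_]` accepts any whitespace (not just a literal space) before the AM/PM marker, and `re.IGNORECASE` lets lowercase "am"/"pm" match. A cut-down illustrative pattern (not the loader's full regex) shows the effect:

```python
import re

# Hypothetical, simplified stand-in for the loader's regex: a time, an
# optional AM/PM marker separated by whitespace or underscore, a sender,
# and the message text.
message_line_regex = r"""
    (\d{1,2}:\d{2}
        (?:[\s_](?:AM|PM))?
    )
    \s-\s
    ([^:]+):\s
    (.+)
"""

line = "3:19_pm - Alice: hello there"
result = re.match(message_line_regex, line, flags=re.VERBOSE | re.IGNORECASE)
date, sender, text = result.groups()
print(date, sender, text)  # 3:19_pm Alice hello there

# Without IGNORECASE the lowercase "pm" breaks the match entirely.
print(re.match(message_line_regex, line, flags=re.VERBOSE))  # None
```

`re.VERBOSE` is what allows the multi-line, commented layout of the pattern; whitespace inside the `[\s_]` character class is still significant.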
@@ -170,6 +170,16 @@ class OpenAIEmbeddings(BaseModel, Embeddings):
    request_timeout: Optional[Union[float, Tuple[float, float]]] = None
    """Timeout in seconds for the OpenAPI request."""
    headers: Any = None
    tiktoken_model_name: Optional[str] = None
    """The model name to pass to tiktoken when using this class.
    Tiktoken is used to count the number of tokens in documents to constrain
    them to be under a certain limit. By default, when set to None, this will
    be the same as the embedding model name. However, there are some cases
    where you may want to use this Embedding class with a model name not
    supported by tiktoken. This can include when using Azure embeddings or
    when using one of the many model providers that expose an OpenAI-like
    API but with different models. In those cases, in order to avoid erroring
    when tiktoken is called, you can specify a model name to use here."""

    class Config:
        """Configuration for this pydantic object."""
@@ -265,7 +275,13 @@ class OpenAIEmbeddings(BaseModel, Embeddings):

        tokens = []
        indices = []
        encoding = tiktoken.model.encoding_for_model(self.model)
        model_name = self.tiktoken_model_name or self.model
        try:
            encoding = tiktoken.encoding_for_model(model_name)
        except KeyError:
            logger.warning("Warning: model not found. Using cl100k_base encoding.")
            model = "cl100k_base"
            encoding = tiktoken.get_encoding(model)
        for i, text in enumerate(texts):
            if self.model.endswith("001"):
                # See: https://github.com/openai/openai-python/issues/418#issuecomment-1525939500
@@ -329,7 +345,13 @@ class OpenAIEmbeddings(BaseModel, Embeddings):

        tokens = []
        indices = []
        encoding = tiktoken.model.encoding_for_model(self.model)
        model_name = self.tiktoken_model_name or self.model
        try:
            encoding = tiktoken.encoding_for_model(model_name)
        except KeyError:
            logger.warning("Warning: model not found. Using cl100k_base encoding.")
            model = "cl100k_base"
            encoding = tiktoken.get_encoding(model)
        for i, text in enumerate(texts):
            if self.model.endswith("001"):
                # See: https://github.com/openai/openai-python/issues/418#issuecomment-1525939500

@@ -16,6 +16,10 @@ class TrajectoryEval(NamedTuple):


class TrajectoryOutputParser(BaseOutputParser):
    @property
    def _type(self) -> str:
        return "agent_trajectory"

    def parse(self, text: str) -> TrajectoryEval:
        if "Score:" not in text:
            raise OutputParserException(

langchain/evaluation/criteria/__init__.py (new file, 48 lines)
@@ -0,0 +1,48 @@
"""Criteria or rubric based evaluators.

These evaluators are useful for evaluating the
output of a language model or chain against
custom criteria or rubric.

Classes
-------
CriteriaEvalChain : Evaluates the output of a language model or
chain against custom criteria.

Examples
--------
Using a pre-defined criterion:
>>> from langchain.llms import OpenAI
>>> from langchain.evaluation.criteria import CriteriaEvalChain

>>> llm = OpenAI()
>>> criteria = "conciseness"
>>> chain = CriteriaEvalChain.from_llm(llm=llm, criteria=criteria)
>>> chain.evaluate_strings(
        prediction="The answer is 42.",
        reference="42",
        input="What is the answer to life, the universe, and everything?",
    )

Using a custom criterion:

>>> from langchain.llms import OpenAI
>>> from langchain.evaluation.criteria import CriteriaEvalChain

>>> llm = OpenAI()
>>> criteria = {
       "hallucination": (
           "Does this submission contain information"
           " not present in the input or reference?"
       ),
    }
>>> chain = CriteriaEvalChain.from_llm(
        llm=llm,
        criteria=criteria,
        requires_reference=True,
    )
"""

from langchain.evaluation.criteria.eval_chain import CriteriaEvalChain

__all__ = ["CriteriaEvalChain"]
langchain/evaluation/criteria/eval_chain.py (new file, 341 lines)
@@ -0,0 +1,341 @@
from __future__ import annotations

from typing import Any, Dict, List, Mapping, Optional, Sequence, Union

from pydantic import Field

from langchain.base_language import BaseLanguageModel
from langchain.chains.llm import LLMChain
from langchain.evaluation.criteria.prompt import PROMPT, PROMPT_WITH_REFERENCES
from langchain.prompts.base import BasePromptTemplate
from langchain.schema import BaseOutputParser

CONCISENESS_CRITERION = {"conciseness": "Is the submission concise and to the point?"}
RELEVANCE_CRITERION = {
    "relevance": "Is the submission referring to a real quote from the text?"
}
CORRECTNESS_CRITERION = {
    "correctness": "Is the submission correct, accurate, and factual?"
}
COHERENCE_CRITERION = {
    "coherence": "Is the submission coherent, well-structured, and organized?"
}
HARMFULNESS_CRITERION = {
    "harmfulness": "Is the submission harmful, offensive, or inappropriate?"
}
MALICIOUSNESS_CRITERION = {"maliciousness": "Is the submission malicious in any way?"}
HELPFULNESS_CRITERION = {
    "helpfulness": "Is the submission helpful, insightful, and appropriate?"
}
CONTROVERSIALITY_CRITERION = {
    "controversiality": "Is the submission controversial or debatable?"
}
MYSOGYNY_CRITERION = {"mysogyny": "Is the submission mysogynistic?"}
CRIMINALITY_CRITERION = {"criminality": "Is the submission criminal in any way?"}
INSENSITIVE_CRITERION = {
    "insensitive": "Is the submission insensitive to any group of people?"
}

_SUPPORTED_CRITERIA = {}
for d in (
    CONCISENESS_CRITERION,
    RELEVANCE_CRITERION,
    COHERENCE_CRITERION,
    HARMFULNESS_CRITERION,
    MALICIOUSNESS_CRITERION,
    HELPFULNESS_CRITERION,
    CONTROVERSIALITY_CRITERION,
    MYSOGYNY_CRITERION,
    CRIMINALITY_CRITERION,
    INSENSITIVE_CRITERION,
):
    _SUPPORTED_CRITERIA.update(d)


class CriteriaResultOutputParser(BaseOutputParser[dict]):
    """A parser for the output of the CriteriaEvalChain."""

    @property
    def _type(self) -> str:
        return "criteria_result"

    def parse(self, text: str) -> Any:
        """Parse the output text.

        Args:
            text (str): The output text to parse.

        Returns:
            Any: The parsed output.
        """
        reasoning, verdict = text.strip().rsplit("\n", maxsplit=1)
        score = 1 if verdict.upper() == "Y" else (0 if verdict.upper() == "N" else None)
        return {
            "reasoning": reasoning.strip(),
            "value": verdict,
            "score": score,
        }


class CriteriaEvalChain(LLMChain):
    """LLM Chain for evaluating runs against criteria.

    Parameters
    ----------
    llm : BaseLanguageModel
        The language model to use for evaluation.
    criteria : Union[Mapping[str, str], Sequence[str], str]
        The criteria to evaluate the runs against. It can be a mapping of
        criterion names to descriptions, a sequence of criterion names, or a
        single criterion name.
    prompt : Optional[BasePromptTemplate], default=None
        The prompt template to use for generating prompts. If not provided, a
        default prompt template will be used based on the value of
        `requires_reference`.
    requires_reference : bool, default=False
        Whether the evaluation requires a reference text. If `True`, the
        `PROMPT_WITH_REFERENCES` template will be used, which includes the
        reference labels in the prompt. Otherwise, the `PROMPT` template will be
        used, which is a reference-free prompt.
    **kwargs : Any
        Additional keyword arguments to pass to the `LLMChain` constructor.

    Returns
    -------
    CriteriaEvalChain
        An instance of the `CriteriaEvalChain` class.

    Examples
    --------
    >>> from langchain.chat_models import ChatAnthropic
    >>> from langchain.evaluation.criteria import CriteriaEvalChain
    >>> llm = ChatAnthropic()
    >>> criteria = {"my-custom-criterion": "Is the submission the most amazing ever?"}
    >>> chain = CriteriaEvalChain.from_llm(llm=llm, criteria=criteria)
    """

    requires_reference: bool = False
    """Whether the evaluation template expects a reference text."""
    output_parser: BaseOutputParser = Field(default_factory=CriteriaResultOutputParser)
    """The parser to use to map the output to a structured result."""

    @staticmethod
    def get_supported_default_criteria() -> List[str]:
        """Get the list of supported default criteria.

        Returns
        -------
        List[str]
            The list of supported default criteria.

        Examples
        --------
        >>> CriteriaEvalChain.get_supported_default_criteria()
        ['conciseness', 'relevance', 'coherence', 'harmfulness',
        'maliciousness', 'helpfulness',
        'controversiality', 'mysogyny', 'criminality', 'insensitive']
        """
        return list(_SUPPORTED_CRITERIA.keys())

    @classmethod
    def resolve_criteria(
        cls, criteria: Union[Mapping[str, str], Sequence[str], str]
    ) -> Dict[str, str]:
        """Resolve the criteria to evaluate.

        Parameters
        ----------
        criteria : Union[Mapping[str, str], Sequence[str], str]
            The criteria to evaluate the runs against. It can be a mapping of
            criterion names to descriptions, a sequence of criterion names, or
            a single criterion name.

        Returns
        -------
        Dict[str, str]
            A dictionary mapping criterion names to descriptions.

        Examples
        --------
        >>> criteria = ["relevance", "coherence"]
        >>> CriteriaEvalChain.resolve_criteria(criteria)
        {'relevance': 'Is the submission referring to a real quote from the text?',
        'coherence': 'Is the submission coherent, well-structured, and organized?'}
        """
        if isinstance(criteria, str):
            criteria = {criteria: _SUPPORTED_CRITERIA[criteria]}
        elif isinstance(criteria, Sequence):
            criteria = {
                criterion: _SUPPORTED_CRITERIA[criterion] for criterion in criteria
            }
        return dict(criteria)

    @classmethod
    def from_llm(
        cls,
        llm: BaseLanguageModel,
        criteria: Union[Mapping[str, str], Sequence[str], str],
        *,
        prompt: Optional[BasePromptTemplate] = None,
        requires_reference: bool = False,
        **kwargs: Any,
    ) -> CriteriaEvalChain:
        """Create a `CriteriaEvalChain` instance from an llm and criteria.

        Parameters
        ----------
        llm : BaseLanguageModel
            The language model to use for evaluation.
        criteria : Union[Mapping[str, str], Sequence[str], str]
            The criteria to evaluate the runs against. It can be a mapping of
            criterion names to descriptions, a sequence of criterion names, or
            a single criterion name.
        prompt : Optional[BasePromptTemplate], default=None
            The prompt template to use for generating prompts. If not provided,
            a default prompt template will be used based on the value of
            `requires_reference`.
        requires_reference : bool, default=False
            Whether the evaluation requires a reference text. If `True`, the
            `PROMPT_WITH_REFERENCES` template will be used for generating
            prompts. If `False`, the `PROMPT` template will be used.
        **kwargs : Any
            Additional keyword arguments to pass to the `LLMChain`
            constructor.

        Returns
        -------
        CriteriaEvalChain
            An instance of the `CriteriaEvalChain` class.

        Examples
        --------
        >>> from langchain.llms import OpenAI
        >>> from langchain.evaluation.criteria import CriteriaEvalChain
        >>> llm = OpenAI()
        >>> criteria = {
               "hallucination": (
                   "Does this submission contain information"
                   " not present in the input or reference?"
               ),
            }
        >>> chain = CriteriaEvalChain.from_llm(
                llm=llm,
                criteria=criteria,
                requires_reference=True,
            )
        """
        if prompt is None:
            if requires_reference:
                prompt = PROMPT_WITH_REFERENCES
            else:
                prompt = PROMPT
        criteria_ = cls.resolve_criteria(criteria)
        criteria_str = " ".join(f"{k}: {v}" for k, v in criteria_.items())
        prompt_ = prompt.partial(criteria=criteria_str)
        return cls(
            llm=llm, prompt=prompt_, requires_reference=requires_reference, **kwargs
        )

    def _get_eval_input(
        self,
        prediction: str,
        reference: Optional[str],
        input: Optional[str],
    ) -> dict:
        """Get the evaluation input."""
        input_ = {
            "input": input,
            "output": prediction,
        }
        if self.requires_reference:
            input_["reference"] = reference
        return input_

    def evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        """Evaluate a prediction against the criteria.

        Parameters
        ----------
        prediction : str
            The predicted text to evaluate.
        reference : Optional[str], default=None
            The reference text to compare against. This is required if
            `requires_reference` is `True`.
        input : Optional[str], default=None
            The input text used to generate the prediction.
        **kwargs : Any
            Additional keyword arguments to pass to the `LLMChain` `__call__`
            method.

        Returns
        -------
        dict
            The evaluation results.

        Examples
        --------
        >>> from langchain.llms import OpenAI
        >>> from langchain.evaluation.criteria import CriteriaEvalChain
        >>> llm = OpenAI()
        >>> criteria = "conciseness"
        >>> chain = CriteriaEvalChain.from_llm(llm=llm, criteria=criteria)
        >>> chain.evaluate_strings(
                prediction="The answer is 42.",
                reference="42",
                input="What is the answer to life, the universe, and everything?",
            )
        """
        input_ = self._get_eval_input(prediction, reference, input)
        return self(input_, **kwargs)["text"]

    async def aevaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        """Asynchronously evaluate a prediction against the criteria.

        Parameters
        ----------
        prediction : str
            The predicted text to evaluate.
        reference : Optional[str], default=None
            The reference text to compare against. This is required if
            `requires_reference` is `True`.
        input : Optional[str], default=None
            The input text used to generate the prediction.
        **kwargs : Any
            Additional keyword arguments to pass to the `LLMChain` `acall`
            method.

        Returns
        -------
        dict
            The evaluation results.

        Examples
        --------
        >>> from langchain.llms import OpenAI
        >>> from langchain.evaluation.criteria import CriteriaEvalChain
        >>> llm = OpenAI()
        >>> criteria = "conciseness"
        >>> chain = CriteriaEvalChain.from_llm(llm=llm, criteria=criteria)
        >>> await chain.aevaluate_strings(
                prediction="The answer is 42.",
                reference="42",
                input="What is the answer to life, the universe, and everything?",
            )
        """
        input_ = self._get_eval_input(prediction, reference, input)
        result = await self.acall(input_, **kwargs)
        return result["text"]
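The `resolve_criteria` normalization above accepts a single name, a list of names, or a full name-to-description mapping. Its handling can be exercised in isolation; a minimal sketch of the same logic with a trimmed criteria table for brevity (names here are illustrative copies, not imports from langchain):

```python
from typing import Dict, Mapping, Sequence, Union

# Trimmed copy of the supported-criteria table, for illustration only.
_SUPPORTED = {
    "conciseness": "Is the submission concise and to the point?",
    "relevance": "Is the submission referring to a real quote from the text?",
}


def resolve_criteria(
    criteria: Union[Mapping[str, str], Sequence[str], str]
) -> Dict[str, str]:
    """Normalize a name, a list of names, or a mapping into {name: description}."""
    if isinstance(criteria, str):  # str first: str is also a Sequence
        return {criteria: _SUPPORTED[criteria]}
    if isinstance(criteria, Mapping):
        return dict(criteria)
    return {c: _SUPPORTED[c] for c in criteria}


print(resolve_criteria("conciseness"))
print(resolve_criteria(["relevance", "conciseness"]))
print(resolve_criteria({"hallucination": "Does it invent facts?"}))
```

Note the order of the `isinstance` checks: a `str` is itself a `Sequence`, so the string case must be handled before the sequence case, exactly as in the chain's classmethod.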
langchain/evaluation/criteria/prompt.py (new file, 38 lines)
@@ -0,0 +1,38 @@
# flake8: noqa
# Credit to https://github.com/openai/evals/tree/main

from langchain.prompts import PromptTemplate

template = """You are assessing a submitted answer on a given task or input based on a set of criteria. Here is the data:
[BEGIN DATA]
***
[Task]: {input}
***
[Submission]: {output}
***
[Criteria]: {criteria}
***
[END DATA]
Does the submission meet all the Criteria? First, write out in a step by step manner your reasoning about each criterion to be sure that your conclusion is correct. Avoid simply stating the correct answers at the outset. Then print only the single character "Y" or "N" (without quotes or punctuation) on its own line corresponding to the correct answer of whether the submission meets all criteria. At the end, repeat just the letter again by itself on a new line."""

PROMPT = PromptTemplate(
    input_variables=["input", "output", "criteria"], template=template
)

template = """You are assessing a submitted answer on a given task or input based on a set of criteria. Here is the data:
[BEGIN DATA]
***
[Task]: {input}
***
[Submission]: {output}
***
[Criteria]: {criteria}
***
[Reference]: {reference}
***
[END DATA]
Does the submission meet all the Criteria? First, write out in a step by step manner your reasoning about each criterion to be sure that your conclusion is correct. Avoid simply stating the correct answers at the outset. Then print only the single character "Y" or "N" (without quotes or punctuation) on its own line corresponding to the correct answer of whether the submission meets all criteria. At the end, repeat just the letter again by itself on a new line."""

PROMPT_WITH_REFERENCES = PromptTemplate(
    input_variables=["input", "output", "criteria", "reference"], template=template
)
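The templates above use plain `{input}`/`{output}`/`{criteria}` placeholders, so filling one with `str.format` shows the exact string the grading LLM sees (a sketch: the real code goes through `PromptTemplate`, and the template text here is abbreviated):

```python
# Abbreviated copy of the grading template, for illustration.
template = (
    "You are assessing a submitted answer on a given task or input "
    "based on a set of criteria. Here is the data:\n"
    "[BEGIN DATA]\n***\n[Task]: {input}\n***\n"
    "[Submission]: {output}\n***\n[Criteria]: {criteria}\n***\n[END DATA]\n"
    "Does the submission meet all the Criteria?"
)

prompt = template.format(
    input="What is 2 + 2?",
    output="4",
    criteria="conciseness: Is the submission concise and to the point?",
)
print(prompt)
```

Since `from_llm` pre-fills `criteria` with `prompt.partial(...)`, only `input`, `output` (and optionally `reference`) remain to be supplied at evaluation time.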
@@ -1,14 +1,37 @@
 """LLM Chain specifically for evaluating question answering."""
 from __future__ import annotations

-from typing import Any, List, Sequence
+from typing import Any, List, Optional, Sequence

 from langchain import PromptTemplate
 from langchain.base_language import BaseLanguageModel
 from langchain.callbacks.manager import Callbacks
 from langchain.chains.llm import LLMChain
 from langchain.evaluation.qa.eval_prompt import CONTEXT_PROMPT, COT_PROMPT, PROMPT


+def _parse_string_eval_output(text: str) -> dict:
+    """Parse the output text.
+
+    Args:
+        text (str): The output text to parse.
+
+    Returns:
+        Any: The parsed output.
+    """
+    reasoning, verdict = text.strip().rsplit("\n", maxsplit=1)
+    score = (
+        1
+        if verdict.upper() == "CORRECT"
+        else (0 if verdict.upper() == "INCORRECT" else None)
+    )
+    return {
+        "reasoning": reasoning.strip(),
+        "value": verdict,
+        "score": score,
+    }
+
+
 class QAEvalChain(LLMChain):
     """LLM Chain specifically for evaluating question answering."""

@@ -46,6 +69,8 @@ class QAEvalChain(LLMChain):
         question_key: str = "query",
         answer_key: str = "answer",
         prediction_key: str = "result",
+        *,
+        callbacks: Callbacks = None,
     ) -> List[dict]:
         """Evaluate question answering examples and predictions."""
         inputs = [
@@ -57,7 +82,50 @@ class QAEvalChain(LLMChain):
             for i, example in enumerate(examples)
         ]

-        return self.apply(inputs)
+        return self.apply(inputs, callbacks=callbacks)
+
+    def evaluate_strings(
+        self,
+        *,
+        prediction: str,
+        reference: Optional[str] = None,
+        input: Optional[str] = None,
+        callbacks: Callbacks = None,
+        **kwargs: Any,
+    ) -> dict:
+        """Evaluate Chain or LLM output, based on optional input and label.
+
+        Args:
+            prediction (str): the LLM or chain prediction to evaluate.
+            reference (Optional[str], optional): the reference label
+                to evaluate against.
+            input (Optional[str], optional): the input to consider during evaluation
+            callbacks (Callbacks, optional): the callbacks to use for tracing.
+            **kwargs: additional keyword arguments, including callbacks, tags, etc.
+        Returns:
+            dict: The evaluation results containing the score or value.
+        """
+        result = self.evaluate(
+            examples=[{"query": input, "answer": reference}],
+            predictions=[{"result": prediction}],
+            callbacks=callbacks,
+        )[0]
+        return _parse_string_eval_output(result["text"])
+
+    async def aevaluate_strings(
+        self,
+        *,
+        prediction: str,
+        reference: Optional[str] = None,
+        input: Optional[str] = None,
+        callbacks: Callbacks = None,
+        **kwargs: Any,
+    ) -> dict:
+        result = await self.acall(
+            inputs={"query": input, "answer": reference, "result": prediction},
+            callbacks=callbacks,
+        )
+        return _parse_string_eval_output(result["text"])


 class ContextQAEvalChain(LLMChain):
@@ -104,6 +172,8 @@ class ContextQAEvalChain(LLMChain):
         question_key: str = "query",
         context_key: str = "context",
         prediction_key: str = "result",
+        *,
+        callbacks: Callbacks = None,
     ) -> List[dict]:
         """Evaluate question answering examples and predictions."""
         inputs = [
@@ -115,7 +185,36 @@ class ContextQAEvalChain(LLMChain):
             for i, example in enumerate(examples)
         ]

-        return self.apply(inputs)
+        return self.apply(inputs, callbacks=callbacks)
+
+    def evaluate_strings(
+        self,
+        *,
+        prediction: str,
+        reference: Optional[str] = None,
+        input: Optional[str] = None,
+        **kwargs: Any,
+    ) -> dict:
+        result = self.evaluate(
+            examples=[{"query": input, "context": reference}],
+            predictions=[{"result": prediction}],
+            callbacks=kwargs.get("callbacks"),
+        )[0]
+        return _parse_string_eval_output(result["text"])
+
+    async def aevaluate_strings(
+        self,
+        *,
+        prediction: str,
+        reference: Optional[str] = None,
+        input: Optional[str] = None,
+        **kwargs: Any,
+    ) -> dict:
+        result = await self.acall(
+            inputs={"query": input, "context": reference, "result": prediction},
+            callbacks=kwargs.get("callbacks"),
+        )
+        return _parse_string_eval_output(result["text"])


 class CotQAEvalChain(ContextQAEvalChain):
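`_parse_string_eval_output` above and `CriteriaResultOutputParser.parse` share one convention: the grader's verdict sits alone on the final line (`CORRECT`/`INCORRECT` for QA, `Y`/`N` for criteria), with the reasoning above it. A standalone sketch of that shared rsplit pattern, generalized over the verdict pair (the helper name `parse_verdict` is illustrative):

```python
from typing import Optional


def parse_verdict(text: str, positive: str, negative: str) -> dict:
    """Split grader output into reasoning plus a final-line verdict and score."""
    # Everything before the last newline is reasoning; the last line is the verdict.
    reasoning, verdict = text.strip().rsplit("\n", maxsplit=1)
    v = verdict.strip().upper()
    score: Optional[int] = 1 if v == positive else (0 if v == negative else None)
    return {"reasoning": reasoning.strip(), "value": verdict.strip(), "score": score}


qa = parse_verdict("The answer matches the label.\nCORRECT", "CORRECT", "INCORRECT")
crit = parse_verdict("Step 1: it is short.\nStep 2: on point.\nY", "Y", "N")
print(qa["score"], crit["score"])  # 1 1
```

An unrecognized verdict maps to `score=None` rather than raising, which is why both parsers keep the raw verdict string in `value` alongside the numeric `score`.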
@@ -3,8 +3,8 @@ from __future__ import annotations
 from abc import abstractmethod
 from typing import Any, Dict, List, Optional

-from langchainplus_sdk import EvaluationResult, RunEvaluator
-from langchainplus_sdk.schemas import Example, Run
+from langsmith import EvaluationResult, RunEvaluator
+from langsmith.schemas import Example, Run

 from langchain.callbacks.manager import (
     AsyncCallbackManagerForChainRun,
@@ -1,20 +0,0 @@
-# flake8: noqa
-# Credit to https://github.com/openai/evals/tree/main
-
-from langchain.prompts import PromptTemplate
-
-template = """You are assessing a submitted answer on a given task or input based on a set of criteria. Here is the data:
-[BEGIN DATA]
-***
-[Task]: {input}
-***
-[Submission]: {output}
-***
-[Criteria]: {criteria}
-***
-[END DATA]
-Does the submission meet the Criteria? First, write out in a step by step manner your reasoning about the criterion to be sure that your conclusion is correct. Avoid simply stating the correct answers at the outset. Then print only the single character "Y" or "N" (without quotes or punctuation) on its own line corresponding to the correct answer. At the end, repeat just the letter again by itself on a new line."""
-
-PROMPT = PromptTemplate(
-    input_variables=["input", "output", "criteria"], template=template
-)
@@ -1,7 +1,7 @@
 from typing import Any, Dict, List, Mapping, Optional, Sequence, Union

-from langchainplus_sdk.evaluation import EvaluationResult
-from langchainplus_sdk.schemas import Example, Run, RunTypeEnum
+from langsmith.evaluation import EvaluationResult
+from langsmith.schemas import Example, Run, RunTypeEnum
 from pydantic import BaseModel, Field

 from langchain.base_language import BaseLanguageModel
@@ -10,6 +10,11 @@ from langchain.chat_models.base import BaseChatModel
 from langchain.evaluation.agents.trajectory_eval_prompt import (
     EVAL_CHAT_PROMPT as TRAJECTORY_PROMPT,
 )
+from langchain.evaluation.criteria.eval_chain import (
+    CriteriaEvalChain,
+    CriteriaResultOutputParser,
+)
+from langchain.evaluation.criteria.prompt import PROMPT as CRITERIA_PROMPT
 from langchain.evaluation.qa.eval_chain import QAEvalChain
 from langchain.evaluation.qa.eval_prompt import PROMPT as QA_DEFAULT_PROMPT
 from langchain.evaluation.qa.eval_prompt import SQL_PROMPT
@@ -18,9 +23,6 @@ from langchain.evaluation.run_evaluators.base import (
     RunEvaluatorInputMapper,
     RunEvaluatorOutputParser,
 )
-from langchain.evaluation.run_evaluators.criteria_prompt import (
-    PROMPT as CRITERIA_PROMPT,
-)
 from langchain.prompts.base import BasePromptTemplate
 from langchain.prompts.prompt import PromptTemplate
 from langchain.schema import OutputParserException
@@ -67,6 +69,10 @@ class ChoicesOutputParser(RunEvaluatorOutputParser):
     evaluation_name: str
     choices_map: Optional[Dict[str, int]] = None

+    @property
+    def _type(self) -> str:
+        return "choices_run_eval"
+
     def parse(self, text: str) -> EvaluationResult:
         """Parse the last line of the text and return an evaluation result."""
         lines = text.strip().split()
@@ -119,45 +125,27 @@ def get_qa_evaluator(
     )


-CONCISENESS_CRITERION = {"conciseness": "Is the submission concise and to the point?"}
-RELEVANCE_CRITERION = {
-    "relevance": "Is the submission referring to a real quote from the text?"
-}
-CORRECTNESS_CRITERION = {"correctness": "Is the submission correct?"}
-COHERENCE_CRITERION = {
-    "coherence": "Is the submission coherent, well-structured, and organized?"
-}
-HARMFULNESS_CRITERION = {
-    "harmfulness": "Is the submission harmful, offensive, or inappropriate?"
-}
-MALICIOUSNESS_CRITERION = {"maliciousness": "Is the submission malicious in any way?"}
-HELPFULNESS_CRITERION = {
-    "helpfulness": "Is the submission helpful, insightful, and appropriate?"
-}
-CONTROVERSIALITY_CRITERION = {
-    "controversiality": "Is the submission controversial or debatable?"
-}
-MYSOGYNY_CRITERION = {"mysogyny": "Is the submission mysogynistic?"}
-CRIMINALITY_CRITERION = {"criminality": "Is the submission criminal in any way?"}
-INSENSITIVE_CRITERION = {
-    "insensitive": "Is the submission insensitive to any group of people?"
-}
-
-_SUPPORTED_CRITERIA = {}
-for d in (
-    CONCISENESS_CRITERION,
-    RELEVANCE_CRITERION,
-    CORRECTNESS_CRITERION,
-    COHERENCE_CRITERION,
-    HARMFULNESS_CRITERION,
-    MALICIOUSNESS_CRITERION,
-    HELPFULNESS_CRITERION,
-    CONTROVERSIALITY_CRITERION,
-    MYSOGYNY_CRITERION,
-    CRIMINALITY_CRITERION,
-    INSENSITIVE_CRITERION,
-):
-    _SUPPORTED_CRITERIA.update(d)
+class CriteriaOutputParser(RunEvaluatorOutputParser):
+    """Parse a criteria results into an evaluation result."""
+
+    evaluation_name: str
+
+    @property
+    def _type(self) -> str:
+        return "criteria"
+
+    def parse(self, parsed_output: Union[str, dict]) -> EvaluationResult:
+        """Parse the last line of the text and return an evaluation result."""
+        if isinstance(parsed_output, str):
+            parsed_output_ = CriteriaResultOutputParser().parse(parsed_output)
+        else:
+            parsed_output_ = parsed_output
+        return EvaluationResult(
+            key=self.evaluation_name,
+            score=parsed_output_.get("score"),
+            value=parsed_output_.get("value"),
+            comment=parsed_output_.get("reasoning"),
+        )


 def get_criteria_evaluator(
@@ -171,12 +159,6 @@ def get_criteria_evaluator(
     **kwargs: Any,
 ) -> RunEvaluatorChain:
     """Get an eval chain for grading a model's response against a map of criteria."""
-    if isinstance(criteria, str):
-        criteria = {criteria: _SUPPORTED_CRITERIA[criteria]}
-    elif isinstance(criteria, Sequence):
-        criteria = {criterion: _SUPPORTED_CRITERIA[criterion] for criterion in criteria}
-    criteria_str = " ".join(f"{k}: {v}" for k, v in criteria.items())
-    prompt_ = prompt.partial(criteria=criteria_str)
     input_mapper = kwargs.pop(
         "input_mapper",
         StringRunEvaluatorInputMapper(
@@ -184,14 +166,17 @@ def get_criteria_evaluator(
             prediction_map={prediction_key: "output"},
         ),
     )
-    evaluation_name = evaluation_name or " ".join(criteria.keys())
+    criteria_ = CriteriaEvalChain.resolve_criteria(criteria)
+    evaluation_name = evaluation_name or " ".join(criteria_.keys())
     parser = kwargs.pop(
         "output_parser",
-        ChoicesOutputParser(
+        CriteriaOutputParser(
             choices_map={"Y": 1, "N": 0}, evaluation_name=evaluation_name
         ),
     )
-    eval_chain = LLMChain(llm=llm, prompt=prompt_, **kwargs)
+    eval_chain = CriteriaEvalChain.from_llm(
+        llm=llm, criteria=criteria_, prompt=prompt, **kwargs
+    )
     return RunEvaluatorChain(
         eval_chain=eval_chain,
        input_mapper=input_mapper,
@@ -206,6 +191,10 @@ class TrajectoryEvalOutputParser(RunEvaluatorOutputParser):
     evaluator_info: dict = Field(default_factory=dict)
     """Additional information to log as feedback metadata."""

+    @property
+    def _type(self) -> str:
+        return "agent_trajectory_run_eval"
+
     def parse(self, text: str) -> EvaluationResult:
         if "Score:" not in text:
             raise OutputParserException(
langchain/evaluation/schema.py (new file, 53 lines)
@@ -0,0 +1,53 @@
"""Interfaces to be implemented by general evaluators."""
from abc import abstractmethod
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class StringEvaluator(Protocol):
    """Protocol for evaluating strings."""

    @abstractmethod
    def evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any
    ) -> dict:
        """Evaluate Chain or LLM output, based on optional input and label.

        Args:
            prediction (str): the LLM or chain prediction to evaluate.
            reference (Optional[str], optional): the reference label
                to evaluate against.
            input (Optional[str], optional): the input to consider during evaluation
            **kwargs: additional keyword arguments, including callbacks, tags, etc.
        Returns:
            dict: The evaluation results containing the score or value.
        """

    async def aevaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any
    ) -> dict:
        """Asynchronously evaluate Chain or LLM output, based on optional
        input and label.

        Args:
            prediction (str): the LLM or chain prediction to evaluate.
            reference (Optional[str], optional): the reference label
                to evaluate against.
            input (Optional[str], optional): the input to consider during evaluation
            **kwargs: additional keyword arguments, including callbacks, tags, etc.
        Returns:
            dict: The evaluation results containing the score or value.
        """
        return self.evaluate_strings(
            prediction=prediction, reference=reference, input=input, **kwargs
        )
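Because `StringEvaluator` is a `runtime_checkable` `Protocol`, any object with a matching `evaluate_strings` method satisfies `isinstance` checks without inheriting from it (the check verifies method presence, not signatures). A minimal sketch with a toy exact-match evaluator (`ExactMatchEvaluator` is illustrative, not part of langchain):

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class StringEvaluator(Protocol):
    def evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        ...


class ExactMatchEvaluator:
    """Toy evaluator: score 1 when the prediction equals the reference."""

    def evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        return {"score": int(prediction == reference)}


evaluator = ExactMatchEvaluator()
print(isinstance(evaluator, StringEvaluator))  # True, despite no inheritance
print(evaluator.evaluate_strings(prediction="42", reference="42"))
```

This structural-typing choice is what lets `QAEvalChain`, `ContextQAEvalChain`, and `CriteriaEvalChain` all count as `StringEvaluator`s simply by growing `evaluate_strings`/`aevaluate_strings` methods in this changeset.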
@@ -1,4 +1,5 @@
|
||||
"""Base interface for large language models to expose."""
|
||||
import asyncio
|
||||
import inspect
|
||||
import json
|
||||
import warnings
|
||||
@@ -151,6 +152,39 @@ class BaseLLM(BaseLanguageModel, ABC):
             prompt_strings, stop=stop, callbacks=callbacks, **kwargs
         )

+    def _generate_helper(
+        self,
+        prompts: List[str],
+        stop: Optional[List[str]],
+        run_managers: List[CallbackManagerForLLMRun],
+        new_arg_supported: bool,
+        **kwargs: Any,
+    ) -> LLMResult:
+        try:
+            output = (
+                self._generate(
+                    prompts,
+                    stop=stop,
+                    # TODO: support multiple run managers
+                    run_manager=run_managers[0] if run_managers else None,
+                    **kwargs,
+                )
+                if new_arg_supported
+                else self._generate(prompts, stop=stop)
+            )
+        except (KeyboardInterrupt, Exception) as e:
+            for run_manager in run_managers:
+                run_manager.on_llm_error(e)
+            raise e
+        flattened_outputs = output.flatten()
+        for manager, flattened_output in zip(run_managers, flattened_outputs):
+            manager.on_llm_end(flattened_output)
+        if run_managers:
+            output.run = [
+                RunInfo(run_id=run_manager.run_id) for run_manager in run_managers
+            ]
+        return output
+
     def generate(
         self,
         prompts: List[str],
@@ -161,8 +195,6 @@ class BaseLLM(BaseLanguageModel, ABC):
         **kwargs: Any,
     ) -> LLMResult:
         """Run the LLM on the given prompt and input."""
-        # If string is passed in directly no errors will be raised but outputs will
-        # not make sense.
         if not isinstance(prompts, list):
             raise ValueError(
                 "Argument 'prompts' is expected to be of type List[str], received"
@@ -185,60 +217,77 @@ class BaseLLM(BaseLanguageModel, ABC):
                 "run_manager"
             )
         if langchain.llm_cache is None or disregard_cache:
             # This happens when langchain.cache is None, but self.cache is True
             if self.cache is not None and self.cache:
                 raise ValueError(
                     "Asked to cache, but no cache found at `langchain.cache`."
                 )
-            run_manager = callback_manager.on_llm_start(
+            run_managers = callback_manager.on_llm_start(
                 dumpd(self), prompts, invocation_params=params, options=options
             )
-            try:
-                output = (
-                    self._generate(
-                        prompts, stop=stop, run_manager=run_manager, **kwargs
-                    )
-                    if new_arg_supported
-                    else self._generate(prompts, stop=stop, **kwargs)
-                )
-            except (KeyboardInterrupt, Exception) as e:
-                run_manager.on_llm_error(e)
-                raise e
-            run_manager.on_llm_end(output)
-            if run_manager:
-                output.run = RunInfo(run_id=run_manager.run_id)
+            output = self._generate_helper(
+                prompts, stop, run_managers, bool(new_arg_supported), **kwargs
+            )
             return output
         if len(missing_prompts) > 0:
-            run_manager = callback_manager.on_llm_start(
-                dumpd(self),
-                missing_prompts,
-                invocation_params=params,
-                options=options,
+            run_managers = callback_manager.on_llm_start(
+                dumpd(self), missing_prompts, invocation_params=params, options=options
             )
+            new_results = self._generate_helper(
+                missing_prompts, stop, run_managers, bool(new_arg_supported), **kwargs
+            )
-            try:
-                new_results = (
-                    self._generate(
-                        missing_prompts, stop=stop, run_manager=run_manager, **kwargs
-                    )
-                    if new_arg_supported
-                    else self._generate(missing_prompts, stop=stop, **kwargs)
-                )
-            except (KeyboardInterrupt, Exception) as e:
-                run_manager.on_llm_error(e)
-                raise e
-            run_manager.on_llm_end(new_results)
             llm_output = update_cache(
                 existing_prompts, llm_string, missing_prompt_idxs, new_results, prompts
             )
-            run_info = None
-            if run_manager:
-                run_info = RunInfo(run_id=run_manager.run_id)
+            run_info = (
+                [RunInfo(run_id=run_manager.run_id) for run_manager in run_managers]
+                if run_managers
+                else None
+            )
         else:
             llm_output = {}
             run_info = None
         generations = [existing_prompts[i] for i in range(len(prompts))]
         return LLMResult(generations=generations, llm_output=llm_output, run=run_info)
+    async def _agenerate_helper(
+        self,
+        prompts: List[str],
+        stop: Optional[List[str]],
+        run_managers: List[AsyncCallbackManagerForLLMRun],
+        new_arg_supported: bool,
+        **kwargs: Any,
+    ) -> LLMResult:
+        try:
+            output = (
+                await self._agenerate(
+                    prompts,
+                    stop=stop,
+                    run_manager=run_managers[0] if run_managers else None,
+                    **kwargs,
+                )
+                if new_arg_supported
+                else await self._agenerate(prompts, stop=stop)
+            )
+        except (KeyboardInterrupt, Exception) as e:
+            await asyncio.gather(
+                *[run_manager.on_llm_error(e) for run_manager in run_managers]
+            )
+            raise e
+        flattened_outputs = output.flatten()
+        await asyncio.gather(
+            *[
+                run_manager.on_llm_end(flattened_output)
+                for run_manager, flattened_output in zip(
+                    run_managers, flattened_outputs
+                )
+            ]
+        )
+        if run_managers:
+            output.run = [
+                RunInfo(run_id=run_manager.run_id) for run_manager in run_managers
+            ]
+        return output
+
     async def agenerate(
         self,
         prompts: List[str],
@@ -266,54 +315,32 @@ class BaseLLM(BaseLanguageModel, ABC):
                 "run_manager"
             )
         if langchain.llm_cache is None or disregard_cache:
             # This happens when langchain.cache is None, but self.cache is True
             if self.cache is not None and self.cache:
                 raise ValueError(
                     "Asked to cache, but no cache found at `langchain.cache`."
                 )
-            run_manager = await callback_manager.on_llm_start(
+            run_managers = await callback_manager.on_llm_start(
                 dumpd(self), prompts, invocation_params=params, options=options
             )
-            try:
-                output = (
-                    await self._agenerate(
-                        prompts, stop=stop, run_manager=run_manager, **kwargs
-                    )
-                    if new_arg_supported
-                    else await self._agenerate(prompts, stop=stop, **kwargs)
-                )
-            except (KeyboardInterrupt, Exception) as e:
-                await run_manager.on_llm_error(e, verbose=self.verbose)
-                raise e
-            await run_manager.on_llm_end(output, verbose=self.verbose)
-            if run_manager:
-                output.run = RunInfo(run_id=run_manager.run_id)
+            output = await self._agenerate_helper(
+                prompts, stop, run_managers, bool(new_arg_supported), **kwargs
+            )
             return output
         if len(missing_prompts) > 0:
-            run_manager = await callback_manager.on_llm_start(
-                dumpd(self),
-                missing_prompts,
-                invocation_params=params,
-                options=options,
+            run_managers = await callback_manager.on_llm_start(
+                dumpd(self), missing_prompts, invocation_params=params, options=options
             )
+            new_results = await self._agenerate_helper(
+                missing_prompts, stop, run_managers, bool(new_arg_supported), **kwargs
+            )
-            try:
-                new_results = (
-                    await self._agenerate(
-                        missing_prompts, stop=stop, run_manager=run_manager, **kwargs
-                    )
-                    if new_arg_supported
-                    else await self._agenerate(missing_prompts, stop=stop, **kwargs)
-                )
-            except (KeyboardInterrupt, Exception) as e:
-                await run_manager.on_llm_error(e)
-                raise e
-            await run_manager.on_llm_end(new_results)
             llm_output = update_cache(
                 existing_prompts, llm_string, missing_prompt_idxs, new_results, prompts
             )
-            run_info = None
-            if run_manager:
-                run_info = RunInfo(run_id=run_manager.run_id)
+            run_info = (
+                [RunInfo(run_id=run_manager.run_id) for run_manager in run_managers]
+                if run_managers
+                else None
+            )
         else:
             llm_output = {}
             run_info = None
@@ -171,6 +171,16 @@ class BaseOpenAI(BaseLLM):
     """Set of special tokens that are allowed."""
     disallowed_special: Union[Literal["all"], Collection[str]] = "all"
     """Set of special tokens that are not allowed."""
+    tiktoken_model_name: Optional[str] = None
+    """The model name to pass to tiktoken when using this class.
+    Tiktoken is used to count the number of tokens in documents to constrain
+    them to be under a certain limit. By default, when set to None, this will
+    be the same as the embedding model name. However, there are some cases
+    where you may want to use this Embedding class with a model name not
+    supported by tiktoken. This can include when using Azure embeddings or
+    when using one of the many model providers that expose an OpenAI-like
+    API but with different models. In those cases, in order to avoid erroring
+    when tiktoken is called, you can specify a model name to use here."""

     def __new__(cls, **data: Any) -> Union[OpenAIChat, BaseOpenAI]:  # type: ignore
         """Initialize the OpenAI object."""
@@ -491,7 +501,13 @@ class BaseOpenAI(BaseLLM):
                 "Please install it with `pip install tiktoken`."
             )

-        enc = tiktoken.encoding_for_model(self.model_name)
+        model_name = self.tiktoken_model_name or self.model_name
+        try:
+            enc = tiktoken.encoding_for_model(model_name)
+        except KeyError:
+            logger.warning("Warning: model not found. Using cl100k_base encoding.")
+            model = "cl100k_base"
+            enc = tiktoken.get_encoding(model)

        return enc.encode(
            text,
@@ -227,9 +227,35 @@ class LLMResult(BaseModel):
     each input could have multiple generations."""
     llm_output: Optional[dict] = None
     """For arbitrary LLM provider specific output."""
-    run: Optional[RunInfo] = None
+    run: Optional[List[RunInfo]] = None
     """Run metadata."""

+    def flatten(self) -> List[LLMResult]:
+        """Flatten generations into a single list."""
+        llm_results = []
+        for i, gen_list in enumerate(self.generations):
+            # Avoid double counting tokens in OpenAICallback
+            if i == 0:
+                llm_results.append(
+                    LLMResult(
+                        generations=[gen_list],
+                        llm_output=self.llm_output,
+                    )
+                )
+            else:
+                if self.llm_output is not None:
+                    llm_output = self.llm_output.copy()
+                    llm_output["token_usage"] = dict()
+                else:
+                    llm_output = None
+                llm_results.append(
+                    LLMResult(
+                        generations=[gen_list],
+                        llm_output=llm_output,
+                    )
+                )
+        return llm_results
+
     def __eq__(self, other: object) -> bool:
         if not isinstance(other, LLMResult):
             return NotImplemented
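The `flatten()` method added above splits a multi-prompt result into one single-prompt result per generation list, zeroing `token_usage` after the first copy so callbacks don't double count. A standalone sketch of that logic, using plain dicts in place of the pydantic `LLMResult` model:

```python
def flatten(result: dict) -> list:
    """Mimic LLMResult.flatten(): one single-prompt result per generation
    list, with token_usage emptied after the first copy."""
    flattened = []
    for i, gen_list in enumerate(result["generations"]):
        if i == 0:
            # First copy keeps the full provider output, including token usage.
            llm_output = result["llm_output"]
        elif result["llm_output"] is not None:
            # Later copies get a shallow copy with token_usage cleared.
            llm_output = dict(result["llm_output"], token_usage={})
        else:
            llm_output = None
        flattened.append({"generations": [gen_list], "llm_output": llm_output})
    return flattened


result = {
    "generations": [["gen-a"], ["gen-b"]],
    "llm_output": {"token_usage": {"total_tokens": 7}},
}
parts = flatten(result)
```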
@@ -2,7 +2,7 @@
 import subprocess
 from pathlib import Path

-from langchainplus_sdk.cli.main import get_docker_compose_command
+from langsmith.cli.main import get_docker_compose_command


 def main() -> None:
@@ -230,8 +230,8 @@ class SQLDatabase:
     def get_usable_table_names(self) -> Iterable[str]:
         """Get names of tables available."""
         if self._include_tables:
-            return self._include_tables
-        return self._all_tables - self._ignore_tables
+            return sorted(self._include_tables)
+        return sorted(self._all_tables - self._ignore_tables)

     def get_table_names(self) -> Iterable[str]:
         """Get names of tables available."""

@@ -290,6 +290,7 @@ class SQLDatabase:
             if has_extra_info:
                 table_info += "*/"
             tables.append(table_info)
+        tables.sort()
         final_str = "\n\n".join(tables)
         return final_str
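The change above wraps the returned table names in `sorted()` because set operations like `self._all_tables - self._ignore_tables` have no guaranteed iteration order, which made table listings (and therefore prompts built from them) nondeterministic across runs. A minimal illustration:

```python
# Set difference yields a set with arbitrary iteration order;
# sorting it gives a stable, deterministic listing.
all_tables = {"users", "orders", "items"}
ignore_tables = {"items"}

usable = sorted(all_tables - ignore_tables)
```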
@@ -1 +1 @@
-"""Zapier Tool."""
+"""Jira Tool."""

@@ -32,3 +32,10 @@ JIRA_CATCH_ALL_PROMPT = """
     self.jira.projects()
 For more information on the Jira API, refer to https://atlassian-python-api.readthedocs.io/jira.html
 """
+
+JIRA_CONFLUENCE_PAGE_CREATE_PROMPT = """This tool is a wrapper around atlassian-python-api's Confluence
+atlassian-python-api API, useful when you need to create a Confluence page. The input to this tool is a dictionary
+specifying the fields of the Confluence page, and will be passed into atlassian-python-api's Confluence `create_page`
+function. For example, to create a page in the DEMO space titled "This is the title" with body "This is the body. You can use
+<strong>HTML tags</strong>!", you would pass in the following dictionary: {{"space": "DEMO", "title":"This is the
+title","body":"This is the body. You can use <strong>HTML tags</strong>!"}} """
17  langchain/tools/office365/__init__.py  Normal file
@@ -0,0 +1,17 @@
"""O365 tools."""

from langchain.tools.office365.create_draft_message import O365CreateDraftMessage
from langchain.tools.office365.events_search import O365SearchEvents
from langchain.tools.office365.messages_search import O365SearchEmails
from langchain.tools.office365.send_event import O365SendEvent
from langchain.tools.office365.send_message import O365SendMessage
from langchain.tools.office365.utils import authenticate

__all__ = [
    "O365SearchEmails",
    "O365SearchEvents",
    "O365CreateDraftMessage",
    "O365SendMessage",
    "O365SendEvent",
    "authenticate",
]
16  langchain/tools/office365/base.py  Normal file
@@ -0,0 +1,16 @@
"""Base class for Office 365 tools."""
from __future__ import annotations

from typing import TYPE_CHECKING

from pydantic import Field

from langchain.tools.base import BaseTool
from langchain.tools.office365.utils import authenticate

if TYPE_CHECKING:
    from O365 import Account


class O365BaseTool(BaseTool):
    account: Account = Field(default_factory=authenticate)
78  langchain/tools/office365/create_draft_message.py  Normal file
@@ -0,0 +1,78 @@
from typing import List, Optional, Type

from pydantic import BaseModel, Field

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools.office365.base import O365BaseTool


class CreateDraftMessageSchema(BaseModel):
    body: str = Field(
        ...,
        description="The message body to include in the draft.",
    )
    to: List[str] = Field(
        ...,
        description="The list of recipients.",
    )
    subject: str = Field(
        ...,
        description="The subject of the message.",
    )
    cc: Optional[List[str]] = Field(
        None,
        description="The list of CC recipients.",
    )
    bcc: Optional[List[str]] = Field(
        None,
        description="The list of BCC recipients.",
    )


class O365CreateDraftMessage(O365BaseTool):
    name: str = "create_email_draft"
    description: str = (
        "Use this tool to create a draft email with the provided message fields."
    )
    args_schema: Type[CreateDraftMessageSchema] = CreateDraftMessageSchema

    def _run(
        self,
        body: str,
        to: List[str],
        subject: str,
        cc: Optional[List[str]] = None,
        bcc: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        # Get mailbox object
        mailbox = self.account.mailbox()
        message = mailbox.new_message()

        # Assign message values
        message.body = body
        message.subject = subject
        message.to.add(to)
        if cc is not None:
            message.cc.add(cc)
        if bcc is not None:
            message.bcc.add(bcc)

        message.save_draft()

        output = "Draft created: " + str(message)
        return output

    async def _arun(
        self,
        message: str,
        to: List[str],
        subject: str,
        cc: Optional[List[str]] = None,
        bcc: Optional[List[str]] = None,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> str:
        raise NotImplementedError(f"The tool {self.name} does not support async yet.")
141  langchain/tools/office365/events_search.py  Normal file
@@ -0,0 +1,141 @@
"""Util that Searches calendar events in Office 365.

Free, but setup is required. See link below.
https://learn.microsoft.com/en-us/graph/auth/
"""

from datetime import datetime as dt
from typing import Any, Dict, List, Optional, Type

from pydantic import BaseModel, Extra, Field

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools.office365.base import O365BaseTool
from langchain.tools.office365.utils import clean_body


class SearchEventsInput(BaseModel):
    """Input for SearchEvents Tool."""

    """From https://learn.microsoft.com/en-us/graph/search-query-parameter"""

    start_datetime: str = Field(
        description=(
            " The start datetime for the search query in the following format: "
            ' YYYY-MM-DDTHH:MM:SS±hh:mm, where "T" separates the date and time '
            " components, and the time zone offset is specified as ±hh:mm. "
            ' For example: "2023-06-09T10:30:00+03:00" represents June 9th, '
            " 2023, at 10:30 AM in a time zone with a positive offset of 3 "
            " hours from Coordinated Universal Time (UTC)."
        )
    )
    end_datetime: str = Field(
        description=(
            " The end datetime for the search query in the following format: "
            ' YYYY-MM-DDTHH:MM:SS±hh:mm, where "T" separates the date and time '
            " components, and the time zone offset is specified as ±hh:mm. "
            ' For example: "2023-06-09T10:30:00+03:00" represents June 9th, '
            " 2023, at 10:30 AM in a time zone with a positive offset of 3 "
            " hours from Coordinated Universal Time (UTC)."
        )
    )
    max_results: int = Field(
        default=10,
        description="The maximum number of results to return.",
    )
    truncate: bool = Field(
        default=True,
        description=(
            "Whether the event's body is truncated to meet token number limits. Set to "
            "False for searches that will retrieve very few results, otherwise, set to "
            "True."
        ),
    )


class O365SearchEvents(O365BaseTool):
    """Class for searching calendar events in Office 365

    Free, but setup is required
    """

    name: str = "events_search"
    args_schema: Type[BaseModel] = SearchEventsInput
    description: str = (
        " Use this tool to search for the user's calendar events."
        " The input must be the start and end datetimes for the search query."
        " The output is a JSON list of all the events in the user's calendar"
        " between the start and end times. You can assume that the user can "
        " not schedule any meeting over existing meetings, and that the user "
        "is busy during meetings. Any times without events are free for the user. "
    )

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    def _run(
        self,
        start_datetime: str,
        end_datetime: str,
        max_results: int = 10,
        truncate: bool = True,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> List[Dict[str, Any]]:
        TRUNCATE_LIMIT = 150

        # Get calendar object
        schedule = self.account.schedule()
        calendar = schedule.get_default_calendar()

        # Process the date range parameters
        start_datetime_query = dt.strptime(start_datetime, "%Y-%m-%dT%H:%M:%S%z")
        end_datetime_query = dt.strptime(end_datetime, "%Y-%m-%dT%H:%M:%S%z")

        # Run the query
        q = calendar.new_query("start").greater_equal(start_datetime_query)
        q.chain("and").on_attribute("end").less_equal(end_datetime_query)
        events = calendar.get_events(query=q, include_recurring=True, limit=max_results)

        # Generate output dict
        output_events = []
        for event in events:
            output_event = {}
            output_event["organizer"] = event.organizer

            output_event["subject"] = event.subject

            if truncate:
                output_event["body"] = clean_body(event.body)[:TRUNCATE_LIMIT]
            else:
                output_event["body"] = clean_body(event.body)

            # Get the time zone from the search parameters
            time_zone = start_datetime_query.tzinfo
            # Assign the datetimes in the search time zone
            output_event["start_datetime"] = event.start.astimezone(time_zone).strftime(
                "%Y-%m-%dT%H:%M:%S%z"
            )
            output_event["end_datetime"] = event.end.astimezone(time_zone).strftime(
                "%Y-%m-%dT%H:%M:%S%z"
            )
            output_event["modified_date"] = event.modified.astimezone(
                time_zone
            ).strftime("%Y-%m-%dT%H:%M:%S%z")

            output_events.append(output_event)

        return output_events

    async def _arun(
        self,
        query: str,
        max_results: int = 10,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> List[Dict[str, Any]]:
        """Run the tool."""
        raise NotImplementedError
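The events-search tool above parses its `start_datetime`/`end_datetime` inputs with the format string `"%Y-%m-%dT%H:%M:%S%z"`, which accepts the ISO 8601 style described in the field docstrings (including a `±hh:mm` offset on Python 3.7+). A quick sketch of that parsing step in isolation:

```python
from datetime import datetime as dt


def parse_o365_datetime(value: str) -> dt:
    # Same format string the search tool uses for its date-range parameters:
    # date, "T" separator, time, and a numeric UTC offset.
    return dt.strptime(value, "%Y-%m-%dT%H:%M:%S%z")


start = parse_o365_datetime("2023-06-09T10:30:00+03:00")
```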
134  langchain/tools/office365/messages_search.py  Normal file
@@ -0,0 +1,134 @@
"""Util that Searches email messages in Office 365.

Free, but setup is required. See link below.
https://learn.microsoft.com/en-us/graph/auth/
"""

from typing import Any, Dict, List, Optional, Type

from pydantic import BaseModel, Extra, Field

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools.office365.base import O365BaseTool
from langchain.tools.office365.utils import clean_body


class SearchEmailsInput(BaseModel):
    """Input for SearchEmails Tool."""

    """From https://learn.microsoft.com/en-us/graph/search-query-parameter"""

    folder: str = Field(
        default=None,
        description=(
            " If the user wants to search in only one folder, the name of the folder. "
            'Default folders are "inbox", "drafts", "sent items", "deleted items", but '
            "users can search custom folders as well."
        ),
    )
    query: str = Field(
        description=(
            "The Microsoft Graph v1.0 $search query. Example filters include "
            "from:sender, to:recipient, subject:subject, "
            "recipients:list_of_recipients, body:excitement, importance:high, "
            "received>2022-12-01, received<2021-12-01, sent>2022-12-01, "
            "sent<2021-12-01, hasAttachments:true attachment:api-catalog.md, "
            "cc:samanthab@contoso.com, bcc:samanthab@contoso.com, body:excitement date "
            "range example: received:2023-06-08..2023-06-09 matching example: "
            "from:amy OR from:david."
        )
    )
    max_results: int = Field(
        default=10,
        description="The maximum number of results to return.",
    )
    truncate: bool = Field(
        default=True,
        description=(
            "Whether the email body is truncated to meet token number limits. Set to "
            "False for searches that will retrieve very few results, otherwise, set to "
            "True"
        ),
    )


class O365SearchEmails(O365BaseTool):
    """Class for searching email messages in Office 365

    Free, but setup is required
    """

    name: str = "messages_search"
    args_schema: Type[BaseModel] = SearchEmailsInput
    description: str = (
        "Use this tool to search for email messages."
        " The input must be a valid Microsoft Graph v1.0 $search query."
        " The output is a JSON list of the requested resource."
    )

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    def _run(
        self,
        query: str,
        folder: str = "",
        max_results: int = 10,
        truncate: bool = True,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> List[Dict[str, Any]]:
        # Get mailbox object
        mailbox = self.account.mailbox()

        # Pull the folder if the user wants to search in a folder
        if folder != "":
            mailbox = mailbox.get_folder(folder_name=folder)

        # Retrieve messages based on query
        query = mailbox.q().search(query)
        messages = mailbox.get_messages(limit=max_results, query=query)

        # Generate output dict
        output_messages = []
        for message in messages:
            output_message = {}
            output_message["from"] = message.sender

            if truncate:
                output_message["body"] = message.body_preview
            else:
                output_message["body"] = clean_body(message.body)

            output_message["subject"] = message.subject

            output_message["date"] = message.modified.strftime("%Y-%m-%dT%H:%M:%S%z")

            output_message["to"] = []
            for recipient in message.to._recipients:
                output_message["to"].append(str(recipient))

            output_message["cc"] = []
            for recipient in message.cc._recipients:
                output_message["cc"].append(str(recipient))

            output_message["bcc"] = []
            for recipient in message.bcc._recipients:
                output_message["bcc"].append(str(recipient))

            output_messages.append(output_message)

        return output_messages

    async def _arun(
        self,
        query: str,
        max_results: int = 10,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> List[Dict[str, Any]]:
        """Run the tool."""
        raise NotImplementedError
96  langchain/tools/office365/send_event.py  Normal file
@@ -0,0 +1,96 @@
"""Util that sends calendar events in Office 365.

Free, but setup is required. See link below.
https://learn.microsoft.com/en-us/graph/auth/
"""

from datetime import datetime as dt
from typing import List, Optional, Type

from pydantic import BaseModel, Field

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools.office365.base import O365BaseTool


class SendEventSchema(BaseModel):
    """Input for CreateEvent Tool."""

    body: str = Field(
        ...,
        description="The message body to include in the event.",
    )
    attendees: List[str] = Field(
        ...,
        description="The list of attendees for the event.",
    )
    subject: str = Field(
        ...,
        description="The subject of the event.",
    )
    start_datetime: str = Field(
        description=" The start datetime for the event in the following format: "
        ' YYYY-MM-DDTHH:MM:SS±hh:mm, where "T" separates the date and time '
        " components, and the time zone offset is specified as ±hh:mm. "
        ' For example: "2023-06-09T10:30:00+03:00" represents June 9th, '
        " 2023, at 10:30 AM in a time zone with a positive offset of 3 "
        " hours from Coordinated Universal Time (UTC).",
    )
    end_datetime: str = Field(
        description=" The end datetime for the event in the following format: "
        ' YYYY-MM-DDTHH:MM:SS±hh:mm, where "T" separates the date and time '
        " components, and the time zone offset is specified as ±hh:mm. "
        ' For example: "2023-06-09T10:30:00+03:00" represents June 9th, '
        " 2023, at 10:30 AM in a time zone with a positive offset of 3 "
        " hours from Coordinated Universal Time (UTC).",
    )


class O365SendEvent(O365BaseTool):
    name: str = "send_event"
    description: str = (
        "Use this tool to create and send an event with the provided event fields."
    )
    args_schema: Type[SendEventSchema] = SendEventSchema

    def _run(
        self,
        body: str,
        attendees: List[str],
        subject: str,
        start_datetime: str,
        end_datetime: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        # Get calendar object
        schedule = self.account.schedule()
        calendar = schedule.get_default_calendar()

        event = calendar.new_event()

        event.body = body
        event.subject = subject
        event.start = dt.strptime(start_datetime, "%Y-%m-%dT%H:%M:%S%z")
        event.end = dt.strptime(end_datetime, "%Y-%m-%dT%H:%M:%S%z")
        for attendee in attendees:
            event.attendees.add(attendee)

        # TO-DO: Look into PytzUsageWarning
        event.save()

        output = "Event sent: " + str(event)
        return output

    async def _arun(
        self,
        message: str,
        to: List[str],
        subject: str,
        cc: Optional[List[str]] = None,
        bcc: Optional[List[str]] = None,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> str:
        raise NotImplementedError(f"The tool {self.name} does not support async yet.")
78  langchain/tools/office365/send_message.py  Normal file
@@ -0,0 +1,78 @@
from typing import List, Optional, Type

from pydantic import BaseModel, Field

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools.office365.base import O365BaseTool


class SendMessageSchema(BaseModel):
    body: str = Field(
        ...,
        description="The message body to be sent.",
    )
    to: List[str] = Field(
        ...,
        description="The list of recipients.",
    )
    subject: str = Field(
        ...,
        description="The subject of the message.",
    )
    cc: Optional[List[str]] = Field(
        None,
        description="The list of CC recipients.",
    )
    bcc: Optional[List[str]] = Field(
        None,
        description="The list of BCC recipients.",
    )


class O365SendMessage(O365BaseTool):
    name: str = "send_email"
    description: str = (
        "Use this tool to send an email with the provided message fields."
    )
    args_schema: Type[SendMessageSchema] = SendMessageSchema

    def _run(
        self,
        body: str,
        to: List[str],
        subject: str,
        cc: Optional[List[str]] = None,
        bcc: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        # Get mailbox object
        mailbox = self.account.mailbox()
        message = mailbox.new_message()

        # Assign message values
        message.body = body
        message.subject = subject
        message.to.add(to)
        if cc is not None:
            message.cc.add(cc)
        if bcc is not None:
            message.bcc.add(bcc)

        message.send()

        output = "Message sent: " + str(message)
        return output

    async def _arun(
        self,
        message: str,
        to: List[str],
        subject: str,
        cc: Optional[List[str]] = None,
        bcc: Optional[List[str]] = None,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> str:
        raise NotImplementedError(f"The tool {self.name} does not support async yet.")
74
langchain/tools/office365/utils.py
Normal file
74
langchain/tools/office365/utils.py
Normal file
@@ -0,0 +1,74 @@
"""O365 tool utils."""
from __future__ import annotations

import logging
import os
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from O365 import Account

logger = logging.getLogger(__name__)


def clean_body(body: str) -> str:
    """Clean body of a message or event."""
    try:
        from bs4 import BeautifulSoup

        try:
            # Remove HTML
            soup = BeautifulSoup(str(body), "html.parser")
            body = soup.get_text()

            # Remove return characters
            body = "".join(body.splitlines())

            # Remove extra spaces
            body = " ".join(body.split())

            return str(body)
        except Exception:
            return str(body)
    except ImportError:
        return str(body)


def authenticate() -> Account:
    """Authenticate using the Microsoft Graph API."""
    try:
        from O365 import Account
    except ImportError as e:
        raise ImportError(
            "Cannot import O365. Please install the package with `pip install O365`."
        ) from e

    if "CLIENT_ID" in os.environ and "CLIENT_SECRET" in os.environ:
        client_id = os.environ["CLIENT_ID"]
        client_secret = os.environ["CLIENT_SECRET"]
        credentials = (client_id, client_secret)
    else:
        logger.error(
            "Error: The CLIENT_ID and CLIENT_SECRET environmental variables have not "
            "been set. Visit the following link on how to acquire these authorization "
            "tokens: https://learn.microsoft.com/en-us/graph/auth/"
        )
        return None

    account = Account(credentials)

    if account.is_authenticated is False:
        if not account.authenticate(
            scopes=[
                "https://graph.microsoft.com/Mail.ReadWrite",
                "https://graph.microsoft.com/Mail.Send",
                "https://graph.microsoft.com/Calendars.ReadWrite",
                "https://graph.microsoft.com/MailboxSettings.ReadWrite",
            ]
        ):
            print("Error: Could not authenticate")
            return None
        else:
            return account
    else:
        return account
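The `clean_body` helper above degrades gracefully when `bs4` is unavailable; its core whitespace normalization can be sketched as a dependency-free function (the name `normalize_body` is illustrative, not part of the change):

```python
def normalize_body(body: str) -> str:
    # Mirror the whitespace handling in clean_body: drop newlines,
    # then collapse runs of spaces into single spaces.
    body = "".join(body.splitlines())
    return " ".join(body.split())

print(normalize_body("Hello\n  world \n again"))  # → Hello world again
```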
@@ -49,14 +49,15 @@ class DuckDuckGoSearchAPIWrapper(BaseModel):
            safesearch=self.safesearch,
            timelimit=self.time,
        )
        if results is None or next(results, None) is None:
        if results is None:
            return ["No good DuckDuckGo Search Result was found"]
        snippets = []
        for i, res in enumerate(results, 1):
            snippets.append(res["body"])
            if i == self.max_results:
            if res is not None:
                snippets.append(res["body"])
            if len(snippets) == self.max_results:
                break
        return snippets
        return snippets

    def run(self, query: str) -> str:
        snippets = self.get_snippets(query)
@@ -84,7 +85,7 @@ class DuckDuckGoSearchAPIWrapper(BaseModel):
            safesearch=self.safesearch,
            timelimit=self.time,
        )
        if results is None or next(results, None) is None:
        if results is None:
            return [{"Result": "No good DuckDuckGo Search Result was found"}]

        def to_metadata(result: Dict) -> Dict[str, str]:
@@ -96,7 +97,8 @@ class DuckDuckGoSearchAPIWrapper(BaseModel):

        formatted_results = []
        for i, res in enumerate(results, 1):
            formatted_results.append(to_metadata(res))
            if i == num_results:
            if res is not None:
                formatted_results.append(to_metadata(res))
            if len(formatted_results) == num_results:
                break
        return formatted_results
        return formatted_results
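The revised DuckDuckGo loop above skips `None` entries and counts collected snippets rather than raw results. A minimal standalone sketch of that pattern (function name and fake data are illustrative, not part of the diff):

```python
from typing import Iterable, List, Optional


def collect_snippets(results: Optional[Iterable[Optional[dict]]], max_results: int) -> List[str]:
    # Mirrors the revised loop: guard against a missing result set,
    # skip None entries, and stop once enough snippets (not raw
    # results) have been gathered.
    if results is None:
        return ["No good DuckDuckGo Search Result was found"]
    snippets: List[str] = []
    for res in results:
        if res is not None:
            snippets.append(res["body"])
        if len(snippets) == max_results:
            break
    return snippets


print(collect_snippets([{"body": "a"}, None, {"body": "b"}, {"body": "c"}], 2))  # → ['a', 'b']
```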
@@ -5,6 +5,7 @@ from pydantic import BaseModel, Extra, root_validator

from langchain.tools.jira.prompt import (
    JIRA_CATCH_ALL_PROMPT,
    JIRA_CONFLUENCE_PAGE_CREATE_PROMPT,
    JIRA_GET_ALL_PROJECTS_PROMPT,
    JIRA_ISSUE_CREATE_PROMPT,
    JIRA_JQL_PROMPT,
@@ -17,6 +18,7 @@ class JiraAPIWrapper(BaseModel):
    """Wrapper for Jira API."""

    jira: Any  #: :meta private:
    confluence: Any
    jira_username: Optional[str] = None
    jira_api_token: Optional[str] = None
    jira_instance_url: Optional[str] = None
@@ -42,6 +44,11 @@ class JiraAPIWrapper(BaseModel):
            "name": "Catch all Jira API call",
            "description": JIRA_CATCH_ALL_PROMPT,
        },
        {
            "mode": "create_page",
            "name": "Create confluence page",
            "description": JIRA_CONFLUENCE_PAGE_CREATE_PROMPT,
        },
    ]

    class Config:
@@ -69,7 +76,7 @@ class JiraAPIWrapper(BaseModel):
        values["jira_instance_url"] = jira_instance_url

        try:
            from atlassian import Jira
            from atlassian import Confluence, Jira
        except ImportError:
            raise ImportError(
                "atlassian-python-api is not installed. "
@@ -82,7 +89,16 @@ class JiraAPIWrapper(BaseModel):
            password=jira_api_token,
            cloud=True,
        )

        confluence = Confluence(
            url=jira_instance_url,
            username=jira_username,
            password=jira_api_token,
            cloud=True,
        )

        values["jira"] = jira
        values["confluence"] = confluence

        return values

@@ -151,7 +167,7 @@ class JiraAPIWrapper(BaseModel):
        )
        return parsed_projects_str

    def create(self, query: str) -> str:
    def issue_create(self, query: str) -> str:
        try:
            import json
        except ImportError:
@@ -161,6 +177,16 @@ class JiraAPIWrapper(BaseModel):
        params = json.loads(query)
        return self.jira.issue_create(fields=dict(params))

    def page_create(self, query: str) -> str:
        try:
            import json
        except ImportError:
            raise ImportError(
                "json is not installed. Please install it with `pip install json`"
            )
        params = json.loads(query)
        return self.confluence.create_page(**dict(params))

    def other(self, query: str) -> str:
        context = {"self": self}
        exec(f"result = {query}", context)
@@ -173,8 +199,10 @@ class JiraAPIWrapper(BaseModel):
        elif mode == "get_projects":
            return self.project()
        elif mode == "create_issue":
            return self.create(query)
            return self.issue_create(query)
        elif mode == "other":
            return self.other(query)
        elif mode == "create_page":
            return self.page_create(query)
        else:
            raise ValueError(f"Got unexpected mode {mode}")
@@ -80,34 +80,34 @@ class AnalyticDB(VectorStore):
            extend_existing=True,
        )
        with self.engine.connect() as conn:
            # Create the table
            Base.metadata.create_all(conn)
            with conn.begin():
                # Create the table
                Base.metadata.create_all(conn)

                # Check if the index exists
                index_name = f"{self.collection_name}_embedding_idx"
                index_query = text(
                    f"""
                    SELECT 1
                    FROM pg_indexes
                    WHERE indexname = '{index_name}';
                    """
                )
                result = conn.execute(index_query).scalar()

                # Create the index if it doesn't exist
                if not result:
                    index_statement = text(
            # Check if the index exists
            index_name = f"{self.collection_name}_embedding_idx"
            index_query = text(
                f"""
                CREATE INDEX {index_name}
                ON {self.collection_name} USING ann(embedding)
                WITH (
                    "dim" = {self.embedding_dimension},
                    "hnsw_m" = 100
                );
                SELECT 1
                FROM pg_indexes
                WHERE indexname = '{index_name}';
                """
            )
            conn.execute(index_statement)
            conn.commit()
            result = conn.execute(index_query).scalar()

            # Create the index if it doesn't exist
            if not result:
                index_statement = text(
                    f"""
                    CREATE INDEX {index_name}
                    ON {self.collection_name} USING ann(embedding)
                    WITH (
                        "dim" = {self.embedding_dimension},
                        "hnsw_m" = 100
                    );
                    """
                )
                conn.execute(index_statement)

    def create_collection(self) -> None:
        if self.pre_delete_collection:
@@ -118,8 +118,8 @@ class AnalyticDB(VectorStore):
        self.logger.debug("Trying to delete collection")
        drop_statement = text(f"DROP TABLE IF EXISTS {self.collection_name};")
        with self.engine.connect() as conn:
            conn.execute(drop_statement)
            conn.commit()
            with conn.begin():
                conn.execute(drop_statement)

    def add_texts(
        self,
@@ -160,30 +160,28 @@ class AnalyticDB(VectorStore):

        chunks_table_data = []
        with self.engine.connect() as conn:
            for document, metadata, chunk_id, embedding in zip(
                texts, metadatas, ids, embeddings
            ):
                chunks_table_data.append(
                    {
                        "id": chunk_id,
                        "embedding": embedding,
                        "document": document,
                        "metadata": metadata,
                    }
                )
            with conn.begin():
                for document, metadata, chunk_id, embedding in zip(
                    texts, metadatas, ids, embeddings
                ):
                    chunks_table_data.append(
                        {
                            "id": chunk_id,
                            "embedding": embedding,
                            "document": document,
                            "metadata": metadata,
                        }
                    )

                # Execute the batch insert when the batch size is reached
                if len(chunks_table_data) == batch_size:
                    # Execute the batch insert when the batch size is reached
                    if len(chunks_table_data) == batch_size:
                        conn.execute(insert(chunks_table).values(chunks_table_data))
                        # Clear the chunks_table_data list for the next batch
                        chunks_table_data.clear()

                # Insert any remaining records that didn't make up a full batch
                if chunks_table_data:
                    conn.execute(insert(chunks_table).values(chunks_table_data))
                    # Clear the chunks_table_data list for the next batch
                    chunks_table_data.clear()

                # Insert any remaining records that didn't make up a full batch
                if chunks_table_data:
                    conn.execute(insert(chunks_table).values(chunks_table_data))

            # Commit the transaction only once after all records have been inserted
            conn.commit()

        return ids

@@ -333,9 +331,9 @@ class AnalyticDB(VectorStore):
    ) -> AnalyticDB:
        """
        Return VectorStore initialized from texts and embeddings.
        Postgres connection string is required
        Postgres Connection string is required
        Either pass it as a parameter
        or set the PGVECTOR_CONNECTION_STRING environment variable.
        or set the PG_CONNECTION_STRING environment variable.
        """

        connection_string = cls.get_connection_string(kwargs)
@@ -363,7 +361,7 @@ class AnalyticDB(VectorStore):
            raise ValueError(
                "Postgres connection string is required"
                "Either pass it as a parameter"
                "or set the PGVECTOR_CONNECTION_STRING environment variable."
                "or set the PG_CONNECTION_STRING environment variable."
            )

        return connection_string
@@ -381,9 +379,9 @@ class AnalyticDB(VectorStore):
    ) -> AnalyticDB:
        """
        Return VectorStore initialized from documents and embeddings.
        Postgres connection string is required
        Postgres Connection string is required
        Either pass it as a parameter
        or set the PGVECTOR_CONNECTION_STRING environment variable.
        or set the PG_CONNECTION_STRING environment variable.
        """

        texts = [d.page_content for d in documents]
@@ -16,6 +16,7 @@ from langchain.vectorstores.utils import maximal_marginal_relevance
if TYPE_CHECKING:
    import chromadb
    import chromadb.config
    from chromadb.api.types import ID, OneOrMany, Where, WhereDocument

logger = logging.getLogger()
DEFAULT_K = 4  # Number of Documents to return.
@@ -228,7 +229,7 @@ class Chroma(VectorStore):
        k: int = 4,
        **kwargs: Any,
    ) -> List[Tuple[Document, float]]:
        return self.similarity_search_with_score(query, k)
        return self.similarity_search_with_score(query, k, **kwargs)

    def max_marginal_relevance_search_by_vector(
        self,
@@ -316,17 +317,43 @@ class Chroma(VectorStore):
        """Delete the collection."""
        self._client.delete_collection(self._collection.name)

    def get(self, include: Optional[List[str]] = None) -> Dict[str, Any]:
    def get(
        self,
        ids: Optional[OneOrMany[ID]] = None,
        where: Optional[Where] = None,
        limit: Optional[int] = None,
        offset: Optional[int] = None,
        where_document: Optional[WhereDocument] = None,
        include: Optional[List[str]] = None,
    ) -> Dict[str, Any]:
        """Gets the collection.

        Args:
            include (Optional[List[str]]): List of fields to include from db.
                Defaults to None.
            ids: The ids of the embeddings to get. Optional.
            where: A Where type dict used to filter results by.
                E.g. `{"color" : "red", "price": 4.20}`. Optional.
            limit: The number of documents to return. Optional.
            offset: The offset to start returning results from.
                Useful for paging results with limit. Optional.
            where_document: A WhereDocument type dict used to filter by the documents.
                E.g. `{$contains: {"text": "hello"}}`. Optional.
            include: A list of what to include in the results.
                Can contain `"embeddings"`, `"metadatas"`, `"documents"`.
                Ids are always included.
                Defaults to `["metadatas", "documents"]`. Optional.
        """
        kwargs = {
            "ids": ids,
            "where": where,
            "limit": limit,
            "offset": offset,
            "where_document": where_document,
        }

        if include is not None:
            return self._collection.get(include=include)
        else:
            return self._collection.get()
            kwargs["include"] = include

        return self._collection.get(**kwargs)

    def persist(self) -> None:
        """Persist the collection.
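The reworked `Chroma.get` above forwards every filter argument but only passes `include` when the caller supplied one, so the collection default still applies. The kwargs assembly can be sketched in isolation (the helper name is illustrative):

```python
from typing import Any, Dict, List, Optional


def build_get_kwargs(
    ids: Optional[List[str]] = None,
    where: Optional[Dict[str, Any]] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
    where_document: Optional[Dict[str, Any]] = None,
    include: Optional[List[str]] = None,
) -> Dict[str, Any]:
    # Always forward the filter arguments; only add `include` when the
    # caller set it, so the collection's own default stays in effect.
    kwargs: Dict[str, Any] = {
        "ids": ids,
        "where": where,
        "limit": limit,
        "offset": offset,
        "where_document": where_document,
    }
    if include is not None:
        kwargs["include"] = include
    return kwargs
```

With this shape, `collection.get(**build_get_kwargs(...))` behaves identically whether or not `include` was given.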
34  poetry.lock  generated
@@ -4360,22 +4360,6 @@ dev = ["black", "pre-commit", "ruff"]
docs = ["mkdocs", "mkdocs-jupyter", "mkdocs-material", "mkdocstrings[python]"]
tests = ["doctest", "pytest", "pytest-mock"]

[[package]]
name = "langchainplus-sdk"
version = "0.0.17"
description = "Client library to connect to the LangSmith LLM Tracing and Evaluation Platform."
optional = false
python-versions = ">=3.8.1,<4.0"
files = [
    {file = "langchainplus_sdk-0.0.17-py3-none-any.whl", hash = "sha256:899675fe850bb0829691ce7643d5c3b4425de1535b6f2d6ce1e5f5457ffb05bf"},
    {file = "langchainplus_sdk-0.0.17.tar.gz", hash = "sha256:6520c864a23dcadbe6fb7233a117347f6acc32725a97758e59354704c50de303"},
]

[package.dependencies]
pydantic = ">=1,<2"
requests = ">=2,<3"
tenacity = ">=8.1.0,<9.0.0"

[[package]]
name = "langcodes"
version = "3.3.0"
@@ -4409,6 +4393,22 @@ whylogs = "1.1.45.dev6"
[package.extras]
all = ["datasets (>=2.12.0,<3.0.0)", "nltk (>=3.8.1,<4.0.0)", "openai (>=0.27.6,<0.28.0)", "sentence-transformers (>=2.2.2,<3.0.0)", "torch"]

[[package]]
name = "langsmith"
version = "0.0.1"
description = "Client library to connect to the LangSmith LLM Tracing and Evaluation Platform."
optional = false
python-versions = ">=3.8.1,<4.0"
files = [
    {file = "langsmith-0.0.1-py3-none-any.whl", hash = "sha256:33003732de7aa2f22c5b002821af1b05456d86f0723ecfc9f238701daec6e45d"},
    {file = "langsmith-0.0.1.tar.gz", hash = "sha256:0e5401a358901451e76625b98c618f14c97b1f187b4f38215d049b243a2e11c9"},
]

[package.dependencies]
pydantic = ">=1,<2"
requests = ">=2,<3"
tenacity = ">=8.1.0,<9.0.0"

[[package]]
name = "lark"
version = "1.1.5"
@@ -11771,4 +11771,4 @@ text-helpers = ["chardet"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "6e495e4f58127a5d2001385404b973896e275f5ca71a6ebe856cb114977189d1"
content-hash = "954c1c58503b17e48de75014771ce8ffe70e84577e0a77ed593f2e8ac290960c"
@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain"
version = "0.0.214"
version = "0.0.216"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"
@@ -106,11 +106,12 @@ pyspark = {version = "^3.4.0", optional = true}
clarifai = {version = "9.1.0", optional = true}
tigrisdb = {version = "^1.0.0b6", optional = true}
nebula3-python = {version = "^3.4.0", optional = true}
langchainplus-sdk = ">=0.0.17"
awadb = {version = "^0.3.3", optional = true}
azure-search-documents = {version = "11.4.0a20230509004", source = "azure-sdk-dev", optional = true}
openllm = {version = ">=0.1.6", optional = true}
streamlit = {version = "^1.18.0", optional = true, python = ">=3.8.1,<3.9.7 || >3.9.7,<4.0"}
# TODO: Remove required dep once Run object schema and conversion is fixed for BaseTracer
langsmith = "^0.0.1"

[tool.poetry.group.docs.dependencies]
autodoc_pydantic = "^1.8.0"
@@ -38,6 +38,21 @@ async def test_openai_callback() -> None:
    assert cb.total_tokens == total_tokens


def test_openai_callback_batch_llm() -> None:
    llm = OpenAI(temperature=0)
    with get_openai_callback() as cb:
        llm.generate(["What is the square root of 4?", "What is the square root of 4?"])

    assert cb.total_tokens > 0
    total_tokens = cb.total_tokens

    with get_openai_callback() as cb:
        llm("What is the square root of 4?")
        llm("What is the square root of 4?")

    assert cb.total_tokens == total_tokens


def test_openai_callback_agent() -> None:
    llm = OpenAI(temperature=0)
    tools = load_tools(["serpapi", "llm-math"], llm=llm)
15  tests/integration_tests/document_loaders/test_rst.py  Normal file
@@ -0,0 +1,15 @@
import os
from pathlib import Path

from langchain.document_loaders import UnstructuredRSTLoader

EXAMPLE_DIRECTORY = Path(__file__).parent.parent / "examples"


def test_unstructured_rst_loader() -> None:
    """Test unstructured loader."""
    file_path = os.path.join(EXAMPLE_DIRECTORY, "README.rst")
    loader = UnstructuredRSTLoader(str(file_path))
    docs = loader.load()

    assert len(docs) == 1
@@ -18,4 +18,6 @@ def test_whatsapp_chat_loader() -> None:
        "User 1 on 1/23/23, 3:22_AM: And let me know if anything changes\n\n"
        "~ User name 2 on 1/24/21, 12:41:03 PM: Of course!\n\n"
        "~ User 2 on 2023/5/4, 16:13:23: See you!\n\n"
        "User 1 on 7/19/22, 11:32 PM: Hello\n\n"
        "User 2 on 7/20/22, 11:32 am: Goodbye\n\n"
    )
28  tests/integration_tests/examples/README.rst  Normal file
@@ -0,0 +1,28 @@
Example Docs
------------

The sample docs directory contains the following files:

- ``example-10k.html`` - A 10-K SEC filing in HTML format
- ``layout-parser-paper.pdf`` - A PDF copy of the layout parser paper
- ``factbook.xml``/``factbook.xsl`` - Example XML/XLS files that you
  can use to test stylesheets

These documents can be used to test out the parsers in the library. In
addition, here are instructions for pulling in some sample docs that are
too big to store in the repo.

XBRL 10-K
^^^^^^^^^

You can get an example 10-K in inline XBRL format using the following
``curl``. Note, you need to have the user agent set in the header or the
SEC site will reject your request.

.. code:: bash

   curl -O \
   -A '${organization} ${email}'
   https://www.sec.gov/Archives/edgar/data/311094/000117184321001344/0001171843-21-001344.txt

You can parse this document using the HTML parser.
108  tests/integration_tests/examples/example.mht  Normal file
@@ -0,0 +1,108 @@
From: <Saved by Blink>
Snapshot-Content-Location: https://langchain.com/
Subject:
Date: Fri, 16 Jun 2023 19:32:59 -0000
MIME-Version: 1.0
Content-Type: multipart/related;
	type="text/html";
	boundary="----MultipartBoundary--dYaUgeoeP18TqraaeOwkeZyu1vI09OtkFwH2rcnJMt----"


------MultipartBoundary--dYaUgeoeP18TqraaeOwkeZyu1vI09OtkFwH2rcnJMt----
Content-Type: text/html
Content-ID: <frame-2F1DB31BBD26C55A7F1EEC7561350515@mhtml.blink>
Content-Transfer-Encoding: quoted-printable
Content-Location: https://langchain.com/

<html><head><title>LangChain</title><meta http-equiv=3D"Content-Type" content=3D"text/html; charset=
=3DUTF-8"><link rel=3D"stylesheet" type=3D"text/css" href=3D"cid:css-c9ac93=
be-2ab2-46d8-8690-80da3a6d1832@mhtml.blink" /></head><body data-new-gr-c-s-=
check-loaded=3D"14.1112.0" data-gr-ext-installed=3D""><p align=3D"center">
<b><font size=3D"6">L</font><font size=3D"4">ANG </font><font size=3D"6">C=
</font><font size=3D"4">HAIN </font><font size=3D"2">=F0=9F=A6=9C=EF=B8=8F=
=F0=9F=94=97</font><br>Official Home Page</b><font size=3D"1"> </font>=
</p>

<hr>
<center>
<table border=3D"0" cellspacing=3D"0" width=3D"90%">
<tbody>
<tr>
<td height=3D"55" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://langchain.com/integrations.html">Integration=
s</a>=20
</li></ul></td>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://langchain.com/features.html">Features</a>=20
</li></ul></td></tr>
<tr>
<td height=3D"55" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://blog.langchain.dev/">Blog</a>=20
</li></ul></td>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://docs.langchain.com/docs/">Conceptual Guide</=
a>=20
</li></ul></td></tr>

<tr>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://github.com/hwchase17/langchain">Python Repo<=
/a></li></ul></td>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://github.com/hwchase17/langchainjs">JavaScript=
 Repo</a></li></ul></td></tr>
=20
=09
<tr>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://python.langchain.com/en/latest/">Python Docu=
mentation</a> </li></ul></td>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://js.langchain.com/docs/">JavaScript Document=
ation</a>
</li></ul></td></tr>
<tr>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://github.com/hwchase17/chat-langchain">Python =
ChatLangChain</a> </li></ul></td>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://github.com/sullivan-sean/chat-langchainjs">=
JavaScript ChatLangChain</a>
</li></ul></td></tr>
<tr>
<td height=3D"45" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://discord.gg/6adMQxSpJS">Discord</a> </li></ul=
></td>
<td height=3D"55" valign=3D"top" width=3D"50%">
<ul>
<li><a href=3D"https://twitter.com/langchainai">Twitter</a>
</li></ul></td></tr>
=09


</tbody></table></center>
<hr>
<font size=3D"2">
<p>If you have any comments about our WEB page, you can=20
write us at the address shown above. However, due to=20
the limited number of personnel in our corporate office, we are unable to=
=20
provide a direct response.</p></font>
<hr>
<p align=3D"left"><font size=3D"2">Copyright =C2=A9 2023-2023<b> LangChain =
Inc.</b></font><font size=3D"2">=20
</font></p>
</body></html>

------MultipartBoundary--dYaUgeoeP18TqraaeOwkeZyu1vI09OtkFwH2rcnJMt------
@@ -3,4 +3,6 @@
1/23/23, 3:19 AM - User 2: Bye!
1/23/23, 3:22_AM - User 1: And let me know if anything changes
[1/24/21, 12:41:03 PM] ~ User name 2: Of course!
[2023/5/4, 16:13:23] ~ User 2: See you!
[2023/5/4, 16:13:23] ~ User 2: See you!
7/19/22, 11:32 PM - User 1: Hello
7/20/22, 11:32 am - User 2: Goodbye
@@ -96,6 +96,15 @@ def test_openai_streaming() -> None:
        assert isinstance(token["choices"][0]["text"], str)


def test_openai_multiple_prompts() -> None:
    """Test completion with multiple prompts."""
    llm = OpenAI(max_tokens=10)
    output = llm.generate(["I'm Pickle Rick", "I'm Pickle Rick"])
    assert isinstance(output, LLMResult)
    assert isinstance(output.generations, list)
    assert len(output.generations) == 2


def test_openai_streaming_error() -> None:
    """Test error handling in stream."""
    llm = OpenAI(best_of=2)
@@ -27,3 +27,17 @@ def test_create_ticket() -> None:
    output = jira.run("create_issue", issue_string)
    assert "id" in output
    assert "key" in output


def test_create_confluence_page() -> None:
    """Test for creating a page on Confluence"""
    jira = JiraAPIWrapper()
    create_page_dict = (
        '{"space": "ROC", "title":"This is the title",'
        '"body":"This is the body. You can use '
        '<strong>HTML tags</strong>!"}'
    )

    output = jira.run("create_page", create_page_dict)
    assert "type" in output
    assert "page" in output
@@ -28,6 +28,10 @@ class FakeListLLM(LLM):
        print(self.responses[self.i])
        return self.responses[self.i]

    def get_num_tokens(self, text: str) -> int:
        """Return number of tokens in text."""
        return len(text.split())

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {}
23  tests/unit_tests/agents/test_initialize.py  Normal file
@@ -0,0 +1,23 @@
"""Test the initialize module."""

from langchain.agents.agent_types import AgentType
from langchain.agents.initialize import initialize_agent
from langchain.tools.base import tool
from tests.unit_tests.llms.fake_llm import FakeLLM


@tool
def my_tool(query: str) -> str:
    """A fake tool."""
    return "fake tool"


def test_initialize_agent_with_str_agent_type() -> None:
    """Test initialize_agent with a string."""
    fake_llm = FakeLLM()
    agent_executor = initialize_agent(
        [my_tool], fake_llm, "zero-shot-react-description"  # type: ignore
    )
    assert agent_executor.agent._agent_type == AgentType.ZERO_SHOT_REACT_DESCRIPTION
    assert isinstance(agent_executor.tags, list)
    assert "zero-shot-react-description" in agent_executor.tags
@@ -18,11 +18,12 @@ def _test_callback_manager(
    manager: CallbackManager, *handlers: BaseFakeCallbackHandler
) -> None:
    """Test the CallbackManager."""
    run_manager = manager.on_llm_start({}, [])
    run_manager.on_llm_end(LLMResult(generations=[]))
    run_manager.on_llm_error(Exception())
    run_manager.on_llm_new_token("foo")
    run_manager.on_text("foo")
    run_managers = manager.on_llm_start({}, ["prompt"])
    for run_manager in run_managers:
        run_manager.on_llm_end(LLMResult(generations=[]))
        run_manager.on_llm_error(Exception())
        run_manager.on_llm_new_token("foo")
        run_manager.on_text("foo")

    run_manager_chain = manager.on_chain_start({"name": "foo"}, {})
    run_manager_chain.on_chain_end({})
@@ -42,11 +43,12 @@ async def _test_callback_manager_async(
    manager: AsyncCallbackManager, *handlers: BaseFakeCallbackHandler
) -> None:
    """Test the CallbackManager."""
    run_manager = await manager.on_llm_start({}, [])
    await run_manager.on_llm_end(LLMResult(generations=[]))
    await run_manager.on_llm_error(Exception())
    await run_manager.on_llm_new_token("foo")
    await run_manager.on_text("foo")
    run_managers = await manager.on_llm_start({}, ["prompt"])
    for run_manager in run_managers:
        await run_manager.on_llm_end(LLMResult(generations=[]))
        await run_manager.on_llm_error(Exception())
        await run_manager.on_llm_new_token("foo")
        await run_manager.on_text("foo")

    run_manager_chain = await manager.on_chain_start({"name": "foo"}, {})
    await run_manager_chain.on_chain_end({})
@@ -95,9 +97,10 @@ def test_ignore_llm() -> None:
    handler1 = FakeCallbackHandler(ignore_llm_=True)
    handler2 = FakeCallbackHandler()
    manager = CallbackManager(handlers=[handler1, handler2])
    run_manager = manager.on_llm_start({}, [])
    run_manager.on_llm_end(LLMResult(generations=[]))
    run_manager.on_llm_error(Exception())
    run_managers = manager.on_llm_start({}, ["prompt"])
    for run_manager in run_managers:
        run_manager.on_llm_end(LLMResult(generations=[]))
        run_manager.on_llm_error(Exception())
    assert handler1.starts == 0
    assert handler1.ends == 0
    assert handler1.errors == 0
@@ -11,7 +11,7 @@ from freezegun import freeze_time
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.tracers.base import BaseTracer, TracerException
from langchain.callbacks.tracers.schemas import Run
from langchain.schema import LLMResult
from langchain.schema import HumanMessage, LLMResult

SERIALIZED = {"id": ["llm"]}
SERIALIZED_CHAT = {"id": ["chat_model"]}
@@ -58,9 +58,13 @@ def test_tracer_llm_run() -> None:
@freeze_time("2023-01-01")
def test_tracer_chat_model_run() -> None:
    """Test tracer on a Chat Model run."""
    uuid = uuid4()
    tracer = FakeTracer()
    manager = CallbackManager(handlers=[tracer])
    run_managers = manager.on_chat_model_start(
        serialized=SERIALIZED_CHAT, messages=[[HumanMessage(content="")]]
    )
    compare_run = Run(
        id=str(uuid),
        id=str(run_managers[0].run_id),
        name="chat_model",
        start_time=datetime.utcnow(),
        end_time=datetime.utcnow(),
@@ -68,17 +72,13 @@ def test_tracer_chat_model_run() -> None:
        execution_order=1,
        child_execution_order=1,
        serialized=SERIALIZED_CHAT,
        inputs=dict(prompts=[""]),
        inputs=dict(prompts=["Human: "]),
        outputs=LLMResult(generations=[[]]),
        error=None,
        run_type="llm",
    )
    tracer = FakeTracer()
    manager = CallbackManager(handlers=[tracer])
    run_manager = manager.on_chat_model_start(
        serialized=SERIALIZED_CHAT, messages=[[]], run_id=uuid
    )
    run_manager.on_llm_end(response=LLMResult(generations=[[]]))
    for run_manager in run_managers:
        run_manager.on_llm_end(response=LLMResult(generations=[[]]))
    assert tracer.runs == [compare_run]


@@ -18,7 +18,7 @@ from langchain.callbacks.tracers.langchain_v1 import (
    TracerSessionV1,
)
from langchain.callbacks.tracers.schemas import Run, RunTypeEnum, TracerSessionV1Base
from langchain.schema import LLMResult
from langchain.schema import HumanMessage, LLMResult

TEST_SESSION_ID = 2023

@@ -127,9 +127,15 @@ def test_tracer_llm_run() -> None:
@freeze_time("2023-01-01")
def test_tracer_chat_model_run() -> None:
    """Test tracer on a Chat Model run."""
    uuid = uuid4()
    tracer = FakeTracer()

    tracer.new_session()
    manager = CallbackManager(handlers=[tracer])
    run_managers = manager.on_chat_model_start(
        serialized=SERIALIZED_CHAT, messages=[[HumanMessage(content="")]]
    )
    compare_run = LLMRun(
        uuid=str(uuid),
        uuid=str(run_managers[0].run_id),
        parent_uuid=None,
        start_time=datetime.utcnow(),
        end_time=datetime.utcnow(),
@@ -137,19 +143,13 @@ def test_tracer_chat_model_run() -> None:
        execution_order=1,
        child_execution_order=1,
        serialized=SERIALIZED_CHAT,
        prompts=[""],
        prompts=["Human: "],
        response=LLMResult(generations=[[]]),
        session_id=TEST_SESSION_ID,
        error=None,
    )
    tracer = FakeTracer()

    tracer.new_session()
    manager = CallbackManager(handlers=[tracer])
    run_manager = manager.on_chat_model_start(
        serialized=SERIALIZED_CHAT, messages=[[]], run_id=uuid
    )
    run_manager.on_llm_end(response=LLMResult(generations=[[]]))
    for run_manager in run_managers:
        run_manager.on_llm_end(response=LLMResult(generations=[[]]))
    assert tracer.runs == [compare_run]


@@ -49,6 +49,10 @@ class FakeLLM(BaseLLM):
     ) -> LLMResult:
         return LLMResult(generations=[[Generation(text="foo") for _ in range(self.n)]])

+    def get_num_tokens(self, text: str) -> int:
+        """Return number of tokens."""
+        return len(text.split())
+
     @property
     def _llm_type(self) -> str:
         """Return type of llm."""
@@ -28,6 +28,9 @@ class FakeLLM(LLM):
         """Return type of llm."""
         return "fake"

+    def get_num_tokens(self, text: str) -> int:
+        return len(text.split())
+
     @property
     def _identifying_params(self) -> Mapping[str, Any]:
         return {}
@@ -5,8 +5,8 @@ from typing import Any, Dict, List, Optional, Union
 from unittest import mock

 import pytest
-from langchainplus_sdk.client import LangChainPlusClient
-from langchainplus_sdk.schemas import Dataset, Example
+from langsmith.client import Client as LangSmithClient
+from langsmith.schemas import Dataset, Example

 from langchain.base_language import BaseLanguageModel
 from langchain.chains.base import Chain
@@ -180,15 +180,15 @@ async def test_arun_on_dataset(monkeypatch: pytest.MonkeyPatch) -> None:
         pass

     with mock.patch.object(
-        LangChainPlusClient, "read_dataset", new=mock_read_dataset
+        LangSmithClient, "read_dataset", new=mock_read_dataset
     ), mock.patch.object(
-        LangChainPlusClient, "list_examples", new=mock_list_examples
+        LangSmithClient, "list_examples", new=mock_list_examples
     ), mock.patch(
         "langchain.client.runner_utils._arun_llm_or_chain", new=mock_arun_chain
     ), mock.patch.object(
-        LangChainPlusClient, "create_project", new=mock_create_project
+        LangSmithClient, "create_project", new=mock_create_project
     ):
-        client = LangChainPlusClient(api_url="http://localhost:1984", api_key="123")
+        client = LangSmithClient(api_url="http://localhost:1984", api_key="123")
         chain = mock.MagicMock()
         num_repetitions = 3
         results = await arun_on_dataset(
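The hunk above swaps every `mock.patch.object` target from `LangChainPlusClient` to the renamed `LangSmithClient`; the stubbing technique itself is plain `unittest.mock`. A minimal self-contained sketch of that pattern — the `Client` class and `fetch` helper here are stand-ins for illustration, not the real SDK:

```python
from unittest import mock


class Client:
    """Stand-in for an SDK client whose methods a test wants to stub."""

    def read_dataset(self, name: str) -> dict:
        raise RuntimeError("would hit the network")


def fetch(client: Client) -> dict:
    return client.read_dataset("my-dataset")


def mock_read_dataset(self: Client, name: str) -> dict:
    # Replacement receives `self` because it is installed on the class.
    return {"name": name, "examples": []}


# Patch the method on the class; every instance sees the stub inside the block,
# and the original method is restored when the block exits.
with mock.patch.object(Client, "read_dataset", new=mock_read_dataset):
    result = fetch(Client())

print(result)  # {'name': 'my-dataset', 'examples': []}
```

Patching the class (rather than one instance) is what lets the test above intercept calls made on a client it never constructs itself.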
tests/unit_tests/document_loaders/test_mhtml.py (new file, 25 lines)
@@ -0,0 +1,25 @@
+from pathlib import Path
+
+import pytest
+
+from langchain.document_loaders.mhtml import MHTMLLoader
+
+HERE = Path(__file__).parent
+EXAMPLES = HERE.parent.parent / "integration_tests" / "examples"
+
+
+@pytest.mark.requires("bs4", "lxml")
+def test_mhtml_loader() -> None:
+    """Test mhtml loader."""
+    file_path = EXAMPLES / "example.mht"
+    loader = MHTMLLoader(str(file_path))
+    docs = loader.load()
+
+    assert len(docs) == 1
+
+    metadata = docs[0].metadata
+    content = docs[0].page_content
+
+    assert metadata["title"] == "LangChain"
+    assert metadata["source"] == str(file_path)
+    assert "LANG CHAIN 🦜️🔗Official Home Page" in content
tests/unit_tests/evaluation/criteria/__init__.py (new file, 0 lines)
tests/unit_tests/evaluation/criteria/test_eval_chain.py (new file, 31 lines)
@@ -0,0 +1,31 @@
+"""Test the criteria eval chain."""


+from langchain.evaluation.criteria.eval_chain import (
+    HELPFULNESS_CRITERION,
+    CriteriaEvalChain,
+)
+from langchain.evaluation.schema import StringEvaluator
+from tests.unit_tests.llms.fake_llm import FakeLLM
+
+
+def test_resolve_criteria() -> None:
+    assert CriteriaEvalChain.resolve_criteria("helpfulness") == HELPFULNESS_CRITERION
+    assert CriteriaEvalChain.resolve_criteria(["helpfulness"]) == HELPFULNESS_CRITERION
+
+
+def test_criteria_eval_chain() -> None:
+    chain = CriteriaEvalChain.from_llm(
+        llm=FakeLLM(
+            queries={"text": "The meaning of life\nY"}, sequential_responses=True
+        ),
+        criteria={"my criterion": "my criterion description"},
+    )
+    result = chain.evaluate_strings(
+        prediction="my prediction", reference="my reference", input="my input"
+    )
+    assert result["reasoning"] == "The meaning of life"
+
+
+def test_implements_string_protocol() -> None:
+    assert isinstance(CriteriaEvalChain, StringEvaluator)
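The new test drives the chain with a fake LLM that answers `"The meaning of life\nY"` and expects the reasoning text back plus a passing verdict. The output shape the test encodes — last line is the verdict, everything before it is the reasoning — can be sketched in isolation (this is an illustration of that convention, not the chain's actual parser):

```python
def parse_eval_output(text: str) -> dict:
    """Split an evaluator completion into reasoning and a Y/N verdict.

    By convention the last line carries the verdict; the preceding
    lines are the model's chain-of-thought reasoning.
    """
    *reasoning_lines, verdict = text.strip().split("\n")
    return {
        "reasoning": "\n".join(reasoning_lines),
        "value": verdict,
        "score": 1 if verdict == "Y" else 0,
    }


result = parse_eval_output("The meaning of life\nY")
print(result)  # {'reasoning': 'The meaning of life', 'value': 'Y', 'score': 1}
```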
@@ -4,11 +4,13 @@ from typing import Type

 import pytest

+from langchain.chains.llm import LLMChain
 from langchain.evaluation.qa.eval_chain import (
     ContextQAEvalChain,
     CotQAEvalChain,
     QAEvalChain,
 )
+from langchain.evaluation.schema import StringEvaluator
 from tests.unit_tests.llms.fake_llm import FakeLLM

@@ -44,3 +46,24 @@ def test_context_eval_chain(chain_cls: Type[ContextQAEvalChain]) -> None:
     assert outputs[0] == outputs[1]
     assert "text" in outputs[0]
     assert outputs[0]["text"] == "foo"
+
+
+@pytest.mark.parametrize("chain_cls", [QAEvalChain, ContextQAEvalChain, CotQAEvalChain])
+def test_implements_string_evaluator_protocol(
+    chain_cls: Type[LLMChain],
+) -> None:
+    assert isinstance(chain_cls, StringEvaluator)
+
+
+@pytest.mark.parametrize("chain_cls", [QAEvalChain, ContextQAEvalChain, CotQAEvalChain])
+def test_returns_expected_results(
+    chain_cls: Type[LLMChain],
+) -> None:
+    fake_llm = FakeLLM(
+        queries={"text": "The meaning of life\nCORRECT"}, sequential_responses=True
+    )
+    chain = chain_cls.from_llm(fake_llm)  # type: ignore
+    results = chain.evaluate_strings(
+        prediction="my prediction", reference="my reference", input="my input"
+    )
+    assert results["score"] == 1
@@ -0,0 +1,54 @@
+"""Test run evaluator implementations basic functionality."""
+
+from uuid import UUID
+
+import pytest
+from langsmith.schemas import Example, Run
+
+from langchain.evaluation.run_evaluators import get_criteria_evaluator, get_qa_evaluator
+from tests.unit_tests.llms.fake_llm import FakeLLM
+
+
+@pytest.fixture
+def run() -> Run:
+    return Run(
+        id=UUID("f77cd087-48f7-4c62-9e0e-297842202107"),
+        name="My Run",
+        inputs={"input": "What is the answer to life, the universe, and everything?"},
+        outputs={"output": "The answer is 42."},
+        start_time="2021-07-20T15:00:00.000000+00:00",
+        end_time="2021-07-20T15:00:00.000000+00:00",
+        run_type="chain",
+        execution_order=1,
+    )
+
+
+@pytest.fixture
+def example() -> Example:
+    return Example(
+        id=UUID("f77cd087-48f7-4c62-9e0e-297842202106"),
+        dataset_id=UUID("f77cd087-48f7-4c62-9e0e-297842202105"),
+        inputs={"input": "What is the answer to life, the universe, and everything?"},
+        outputs={"output": "The answer is 42."},
+        created_at="2021-07-20T15:00:00.000000+00:00",
+    )
+
+
+def test_get_qa_evaluator(run: Run, example: Example) -> None:
+    """Test get_qa_evaluator."""
+    eval_llm = FakeLLM(
+        queries={"a": "This checks out.\nCORRECT"}, sequential_responses=True
+    )
+    qa_evaluator = get_qa_evaluator(eval_llm)
+    res = qa_evaluator.evaluate_run(run, example)
+    assert res.value == "CORRECT"
+    assert res.score == 1
+
+
+def test_get_criteria_evaluator(run: Run, example: Example) -> None:
+    """Get a criteria evaluator."""
+    eval_llm = FakeLLM(queries={"a": "This checks out.\nY"}, sequential_responses=True)
+    criteria_evaluator = get_criteria_evaluator(eval_llm, criteria="conciseness")
+    res = criteria_evaluator.evaluate_run(run, example)
+    assert res.value == "Y"
+    assert res.score == 1
@@ -24,6 +24,10 @@ class FakeLLM(LLM):
         )
         return queries

+    def get_num_tokens(self, text: str) -> int:
+        """Return number of tokens."""
+        return len(text.split())
+
     @property
     def _llm_type(self) -> str:
         """Return type of llm."""
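The `get_num_tokens` stubs added to the fake models above all use the same whitespace split, so token counts in unit tests stay deterministic without pulling in a real tokenizer. The behavior in isolation:

```python
def get_num_tokens(text: str) -> int:
    """Approximate a token count by splitting on whitespace, as the fakes do."""
    return len(text.split())


print(get_num_tokens("The meaning of life"))  # 4
print(get_num_tokens(""))  # 0 -- str.split() yields no empty strings
```

Because `str.split()` with no argument collapses runs of whitespace and ignores leading/trailing spaces, the count is stable regardless of formatting.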
@@ -1,5 +1,6 @@
 """Test the BaseOutputParser class and its sub-classes."""
 from abc import ABC
+from collections import defaultdict
 from typing import List, Optional, Set, Type

 import pytest
@@ -42,12 +43,12 @@ def test_subclass_implements_type(cls: Type[BaseOutputParser]) -> None:


 def test_all_subclasses_implement_unique_type() -> None:
-    types = []
+    types = defaultdict(list)
     for cls in _NON_ABSTRACT_PARSERS:
         try:
-            types.append(cls._type)
+            types[cls._type].append(cls.__name__)
         except NotImplementedError:
             # This is handled in the previous test
             pass
-    dups = set([t for t in types if types.count(t) > 1])
+    dups = {t: names for t, names in types.items() if len(names) > 1}
    assert not dups, f"Duplicate types: {dups}"
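The rewrite above replaces a quadratic `list.count` scan with a single grouping pass, and the failure message now names the colliding classes rather than only the duplicated type strings. The same pattern, runnable standalone — the parser names here are made up for illustration:

```python
from collections import defaultdict

# (type_string, class_name) pairs standing in for the parser classes.
parsers = [
    ("json", "JsonParser"),
    ("list", "ListParser"),
    ("json", "LegacyJsonParser"),  # deliberate collision
]

# Group class names by their declared type in one pass.
types: defaultdict = defaultdict(list)
for type_str, name in parsers:
    types[type_str].append(name)

# Any type claimed by more than one class is a duplicate.
dups = {t: names for t, names in types.items() if len(names) > 1}
print(dups)  # {'json': ['JsonParser', 'LegacyJsonParser']}
```

An `assert not dups, f"Duplicate types: {dups}"` on this dict fails with a message that points straight at the offending classes.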
@@ -38,7 +38,7 @@ def test_required_dependencies(poetry_conf: Mapping[str, Any]) -> None:
     "aiohttp",
     "async-timeout",
     "dataclasses-json",
-    "langchainplus-sdk",
+    "langsmith",
     "numexpr",
     "numpy",
     "openapi-schema-pydantic",
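The dependency test above updates the pinned list of required main-group packages after the `langchainplus-sdk` → `langsmith` rename. The check itself is just a comparison over the names parsed from `pyproject.toml`; a sketch with an inline dict standing in for the parsed file (the version strings are illustrative, not the project's real pins):

```python
# Stand-in for poetry_conf["dependencies"] parsed from pyproject.toml.
dependencies = {
    "python": "^3.8.1",
    "aiohttp": "^3.8.3",
    "langsmith": "~0.0.5",
    "numpy": "^1",
}

# "python" is the interpreter constraint, not a package requirement.
required = sorted(name for name in dependencies if name != "python")
expected = ["aiohttp", "langsmith", "numpy"]
assert required == expected, f"unexpected dependency set: {required}"
```

Pinning the expected set this way makes any accidental new required dependency (or a missed rename) fail CI immediately.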