Compare commits


15 Commits

Author SHA1 Message Date
Harrison Chase
56b850648f cr (#1436) 2023-03-04 08:38:56 -08:00
Harrison Chase
63a5614d23 Harrison/simple memory (#1435)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-04 08:15:52 -08:00
Harrison Chase
a1b9dfc099 Harrison/similarity search chroma (#1434)
Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
2023-03-04 08:10:15 -08:00
Peng Qu
68ce68f290 Fix an unusual issue that occurs when using OpenAIChat for llm_math (#1410)
Fixes an issue that occurs when using OpenAIChat for llm_math, following the way Mrkl handles the "Final Answer:" marker. I ran into it when trying OpenAIChat with llm_math on a question asked in Chinese: the model restates the question before answering, producing output like "\n\nQuestion: What is the
square of 29?\nAnswer: 841", which the previous parsing did not accept. Snapshot below:
![snapshot](https://user-images.githubusercontent.com/82029664/222642193-10ecca77-db7b-4759-bc46-32a8f8ddc48f.png)
2023-03-04 07:56:07 -08:00
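A minimal sketch of the more tolerant parsing the fix above describes; the real change lands in `LLMMathChain` (its diff appears further down), and the helper name here is purely illustrative:

```python
def extract_answer(llm_output: str) -> str:
    """Accept completions where "Answer:" is buried mid-text (e.g. when the
    model restates the question first) instead of raising a ValueError."""
    text = llm_output.strip()
    if text.startswith("Answer:"):
        return text
    if "Answer:" in text:
        # Keep only what follows the last "Answer:" marker.
        return "Answer: " + text.split("Answer:")[-1].strip()
    raise ValueError(f"unknown format from LLM: {text}")


print(extract_answer("\n\nQuestion: What is the square of 29?\nAnswer: 841"))
# -> Answer: 841
```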
Ikko Eltociear Ashimine
b8a7828d1f Update huggingface_datasets.ipynb (#1417)
HuggingFace -> Hugging Face
2023-03-04 00:22:31 -08:00
Kentaro Tanaka
6a4ee07e4f Fix type hint of 'vectorstore_cls' arg in SemanticSimilarityExampleSelector (#1427)
Hello! Thank you for the amazing library you've created!

While following the tutorial at [Using an example selector](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/few_shot_examples.html#using-an-example-selector), I noticed that passing Chroma as an argument to from_examples results in a type hint error.

Error message (mypy):
```
Argument 3 to "from_examples" of "SemanticSimilarityExampleSelector" has incompatible type "Type[Chroma]"; expected "VectorStore"  [arg-type]mypy(error)
```

This pull request fixes the type hint and allows the VectorStore class
to be specified as an argument.
2023-03-04 00:20:18 -08:00
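A self-contained sketch of the distinction behind this fix; `VectorStore` and `Chroma` below are stand-ins rather than the real classes, and only the `vectorstore_cls` annotation is the point:

```python
from typing import Type


class VectorStore:          # stand-in for langchain.vectorstores.base.VectorStore
    ...


class Chroma(VectorStore):  # stand-in for langchain.vectorstores.Chroma
    ...


def from_examples_old(vectorstore_cls: VectorStore) -> None:
    """Old annotation: expects an *instance*, so mypy rejects passing the class."""


def from_examples_new(vectorstore_cls: Type[VectorStore]) -> None:
    """New annotation: expects the class itself, so passing Chroma type-checks."""


from_examples_new(Chroma)   # OK under mypy
from_examples_old(Chroma)   # mypy: incompatible type "Type[Chroma]"; expected "VectorStore"
```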
Tim Asp
23231d65a9 Add PyMuPDF PDF loader (#1426)
Different PDF libraries have different strengths and weaknesses. PyMuPDF does a good job of extracting the most content from the document, regardless of source quality, and it is extremely fast (especially compared to Unstructured).

https://pymupdf.readthedocs.io/en/latest/index.html
2023-03-03 20:59:28 -08:00
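A short usage sketch based on the loader added in this changeset; the file path is illustrative and `pymupdf` must be installed:

```python
from langchain.document_loaders import PyMuPDFLoader

# Any local PDF works; this path mirrors the example notebook.
loader = PyMuPDFLoader("example_data/layout-parser-paper.pdf")
docs = loader.load()           # one Document per page

print(len(docs))               # number of pages
print(docs[0].metadata)        # file_path, page_number, total_pages, plus PDF metadata
```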
blob42
3d54b05863 searx: add install instructions, update doc and notebooks (#1420)
- Added instructions on setting up a self-hosted SearxNG instance
- Added a notebook example with an agent
- Use `localhost:8888` as the example URL to stay consistent, since public
instances are not really usable.

Co-authored-by: blob42 <spike@w530>
2023-03-03 20:57:50 -08:00
Tim Asp
bca0935d90 [docs] fix minor import error (#1425) 2023-03-03 16:10:07 -08:00
Jon Luo
882f7964fb fix sql misinterpretation of % in query (#1408)
`%` is being misinterpreted by SQLAlchemy as a parameter marker, so any
`LIKE 'asdf%'` results in a value error with MySQL, MariaDB, and
possibly other backends. This is one way to fix it; the alternative is to
simply double up the `%`, as in `LIKE 'asdf%%'`, but this seemed cleaner in
terms of output.
Fixes #1383
2023-03-02 16:03:16 -08:00
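A hedged sketch of the approach the fix takes: route the statement through SQLAlchemy's `text()` construct instead of the raw driver call, so a literal `%` is not read as a parameter marker. SQLite is used here only to keep the snippet self-contained; the original failure shows up with MySQL/MariaDB drivers:

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")

with engine.begin() as connection:
    connection.execute(text("CREATE TABLE t (name TEXT)"))
    connection.execute(text("INSERT INTO t VALUES ('asdf123')"))

    # exec_driver_sql() hands the raw string to the DB-API driver, where a bare %
    # can be treated as a parameter marker; execute(text(...)) avoids that.
    rows = connection.execute(text("SELECT name FROM t WHERE name LIKE 'asdf%'")).fetchall()
    print(rows)   # [('asdf123',)]
```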
JonLuca De Caro
443992c4d5 [Docs] Add missing word from prompt docs (#1406)
The prompt in the first example of the quickstart guide was missing the word `for`.
2023-03-02 16:02:54 -08:00
Eugene Yurtsev
a83a371069 Minor documentation update in initialize_agent (#1397)
Updates the documentation in initialize_agent.

One thing that could benefit from further clarification is the responsibility
split between an AgentExecutor and an Agent; the documentation for
AgentExecutor does not spell it out. From the class attributes, it appears that
the executor has access to the tools, while the agent is only aware of the tool
names. Additional clarification on the AgentExecutor class would be beneficial.
2023-03-02 11:46:35 -08:00
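A small sketch of the split described above, assuming an OpenAI key is configured: `initialize_agent` hands back an `AgentExecutor` that owns the tools and drives the loop, while the wrapped agent only plans over the tool names.

```python
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)                      # requires OPENAI_API_KEY
tools = load_tools(["llm-math"], llm=llm)

# The executor holds the tools and runs the think/act loop.
agent_executor = initialize_agent(tools, llm, agent="zero-shot-react-description")

print(type(agent_executor).__name__)             # AgentExecutor
print([tool.name for tool in agent_executor.tools])
```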
Nuno Campos
499e76b199 Allow the regular openai class to be used for ChatGPT models (#1393)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-03-02 09:04:18 -08:00
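A hedged usage sketch mirroring the test added in this changeset (an `OPENAI_API_KEY` is assumed to be set):

```python
from langchain.llms import OpenAI

# After this change, constructing OpenAI with a chat model name transparently
# hands back an OpenAIChat instance (see the __new__ override in the diff below).
llm = OpenAI(model_name="gpt-3.5-turbo")
print(type(llm).__name__)   # OpenAIChat
print(llm("Say foo:"))
```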
Kacper Łukawski
8947797250 Return Cohere embeddings as lists of floats (#1394)
This PR fixes the types returned by Cohere embeddings. Currently, the Cohere
client returns instances of `cohere.embeddings.Embeddings`. Since the
transport layer relies on JSON, some numbers are represented as ints rather
than floats, which happens quite often. While that might not seem like an
issue, it breaks some pydantic models that require strict floats.
2023-03-02 09:02:10 -08:00
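A minimal sketch of the coercion this introduces; the nested list stands in for the Cohere response, where whole-valued components come back as ints over JSON:

```python
from typing import List


def to_float_embeddings(raw: List[List[float]]) -> List[List[float]]:
    """Force every component to float so downstream pydantic models with
    strict float fields do not reject whole-number values."""
    return [list(map(float, e)) for e in raw]


raw = [[1, 0.25, -3], [0.5, 2, 7]]       # ints sneak in over JSON
print(to_float_embeddings(raw))          # [[1.0, 0.25, -3.0], [0.5, 2.0, 7.0]]
```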
Jason Gill
1989e7d4c2 Update examples to prevent confusing missing _type warning (#1391)
The YAML and JSON examples of prompt serialization now give a strange
`No '_type' key found, defaulting to 'prompt'` message when you try to
run them yourself or copy the format of the files. The reason for this
harmless warning is that the _type key was not in the config files,
which means they are parsed as a standard prompt.

This could be confusing to new users (like it was confusing to me after
upgrading from 0.0.85 to 0.0.86+ for my few_shot prompts that needed a
_type added to the example_prompt config), so this update includes the
_type key just for clarity.

Obviously this is not critical as the warning is harmless, but it could
be confusing to track down or be interpreted as an error by a new user,
so this update should resolve that.
2023-03-02 07:39:57 -08:00
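A hedged sketch of the serialized prompt the warning is about, assuming `load_prompt` from `langchain.prompts` as the loader and an illustrative file name; without the `_type` key the loader falls back to a plain prompt and prints the warning:

```python
import json

from langchain.prompts import load_prompt

config = {
    "_type": "prompt",  # omit this and you get: No '_type' key found, defaulting to 'prompt'
    "input_variables": ["adjective", "content"],
    "template": "Tell me a {adjective} joke about {content}.",
}

with open("simple_prompt.json", "w") as f:
    json.dump(config, f)

prompt = load_prompt("simple_prompt.json")
print(prompt.format(adjective="funny", content="chickens"))
```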
39 changed files with 639 additions and 86 deletions

.gitignore vendored
View File

@@ -106,6 +106,7 @@ celerybeat.pid
# Environments
.env
.envrc
.venv
.venvs
env/

View File

@@ -5,21 +5,44 @@ It is broken into two parts: installation and setup, and then references to the
## Installation and Setup
- You can find a list of public SearxNG instances [here](https://searx.space/).
- It is recommended to use a self-hosted instance to avoid abuse of the public instances. Also note that public instances often have a limit on the number of requests.
- To run a self-hosted instance see [this page](https://searxng.github.io/searxng/admin/installation.html) for more information.
- To use the tool you need to provide the searx host url by:
1. passing the named parameter `searx_host` when creating the instance.
2. exporting the environment variable `SEARXNG_HOST`.
While it is possible to utilize the wrapper in conjunction with [public searx
instances](https://searx.space/), these instances frequently do not permit API
access (see the note on output format below) and limit the frequency of
requests. It is recommended to opt for a self-hosted instance instead.
### Self Hosted Instance:
See [this page](https://searxng.github.io/searxng/admin/installation.html) for installation instructions.
When you install SearxNG, the only active output format by default is the HTML format.
You need to activate the `json` format to use the API. This can be done by adding the following line to the `settings.yml` file:
```yaml
search:
  formats:
    - html
    - json
```
You can make sure that the API is working by issuing a curl request to the API endpoint:
`curl -kLX GET --data-urlencode q='langchain' -d format=json http://localhost:8888`
This should return a JSON object with the results.
## Wrappers
### Utility
To use the wrapper, we need to pass the host of the SearxNG instance to it by either:
1. passing the named parameter `searx_host` when creating the instance, or
2. exporting the environment variable `SEARXNG_HOST`.
You can use the wrapper to get results from a SearxNG instance.
```python
from langchain.utilities import SearxSearchWrapper
s = SearxSearchWrapper(searx_host="http://localhost:8888")
s.run("what is a large language model?")
```
### Tool
@@ -29,7 +52,7 @@ You can do this with:
```python
from langchain.agents import load_tools
tools = load_tools(["searx-search"], searx_host="https://searx.example.com")
tools = load_tools(["searx-search"], searx_host="http://localhost:8888")
```
For more information on this, see [this page](../modules/agents/tools.md)
For more information on tools, see [this page](../modules/agents/tools.md)

View File

@@ -66,7 +66,7 @@ llm = OpenAI(temperature=0.9)
We can now call it on some input!
```python
text = "What would be a good company name a company that makes colorful socks?"
text = "What would be a good company name for a company that makes colorful socks?"
print(llm(text))
```

View File

@@ -12,7 +12,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "e6860c2d",
"metadata": {
"pycharm": {
@@ -28,7 +28,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "dadbcfcd",
"metadata": {},
"outputs": [],
@@ -238,6 +238,92 @@
"source": [
"agent.run(\"What is the weather in Pomfret?\")"
]
},
{
"cell_type": "markdown",
"id": "eabad3af",
"metadata": {},
"source": [
"## SearxNG Meta Search Engine\n",
"\n",
"Here we will be using a self hosted SearxNG meta search engine."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b196c704",
"metadata": {},
"outputs": [],
"source": [
"tools = load_tools([\"searx-search\"], searx_host=\"http://localhost:8888\", llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "9023eeaa",
"metadata": {},
"outputs": [],
"source": [
"agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "3aad92c1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I should look up the current weather\n",
"Action: SearX Search\n",
"Action Input: \"weather in Pomfret\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mMainly cloudy with snow showers around in the morning. High around 40F. Winds NNW at 5 to 10 mph. Chance of snow 40%. Snow accumulations less than one inch.\n",
"\n",
"10 Day Weather - Pomfret, MD As of 1:37 pm EST Today 49°/ 41° 52% Mon 27 | Day 49° 52% SE 14 mph Cloudy with occasional rain showers. High 49F. Winds SE at 10 to 20 mph. Chance of rain 50%....\n",
"\n",
"10 Day Weather - Pomfret, VT As of 3:51 am EST Special Weather Statement Today 39°/ 32° 37% Wed 01 | Day 39° 37% NE 4 mph Cloudy with snow showers developing for the afternoon. High 39F....\n",
"\n",
"Pomfret, CT ; Current Weather. 1:06 AM. 35°F · RealFeel® 32° ; TODAY'S WEATHER FORECAST. 3/3. 44°Hi. RealFeel® 50° ; TONIGHT'S WEATHER FORECAST. 3/3. 32°Lo.\n",
"\n",
"Pomfret, MD Forecast Today Hourly Daily Morning 41° 1% Afternoon 43° 0% Evening 35° 3% Overnight 34° 2% Don't Miss Finally, Heres Why We Get More Colds and Flu When Its Cold Coast-To-Coast...\n",
"\n",
"Pomfret, MD Weather Forecast | AccuWeather Current Weather 5:35 PM 35° F RealFeel® 36° RealFeel Shade™ 36° Air Quality Excellent Wind E 3 mph Wind Gusts 5 mph Cloudy More Details WinterCast...\n",
"\n",
"Pomfret, VT Weather Forecast | AccuWeather Current Weather 11:21 AM 23° F RealFeel® 27° RealFeel Shade™ 25° Air Quality Fair Wind ESE 3 mph Wind Gusts 7 mph Cloudy More Details WinterCast...\n",
"\n",
"Pomfret Center, CT Weather Forecast | AccuWeather Daily Current Weather 6:50 PM 39° F RealFeel® 36° Air Quality Fair Wind NW 6 mph Wind Gusts 16 mph Mostly clear More Details WinterCast...\n",
"\n",
"12:00 pm · Feels Like36° · WindN 5 mph · Humidity43% · UV Index3 of 10 · Cloud Cover65% · Rain Amount0 in ...\n",
"\n",
"Pomfret Center, CT Weather Conditions | Weather Underground star Popular Cities San Francisco, CA 49 °F Clear Manhattan, NY 37 °F Fair Schiller Park, IL (60176) warning39 °F Mostly Cloudy...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The current weather in Pomfret is mainly cloudy with snow showers around in the morning. The temperature is around 40F with winds NNW at 5 to 10 mph. Chance of snow is 40%.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The current weather in Pomfret is mainly cloudy with snow showers around in the morning. The temperature is around 40F with winds NNW at 5 to 10 mph. Chance of snow is 40%.'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What is the weather in Pomfret\")"
]
}
],
"metadata": {
@@ -256,7 +342,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.9.11"
},
"vscode": {
"interpreter": {

View File

@@ -36,6 +36,25 @@
{
"cell_type": "code",
"execution_count": 1,
"id": "7a886879",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cannot find .env file\n"
]
}
],
"source": [
"%load_ext dotenv\n",
"%dotenv"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3f2f9b8c",
"metadata": {},
"outputs": [],
@@ -47,7 +66,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "b8237d1a",
"metadata": {},
"outputs": [],
@@ -64,7 +83,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"id": "4a391730",
"metadata": {},
"outputs": [],
@@ -82,7 +101,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"id": "9368bd63",
"metadata": {},
"outputs": [],
@@ -94,7 +113,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"id": "d39e15f5",
"metadata": {},
"outputs": [
@@ -107,22 +126,20 @@
"\u001b[1m> Entering new SimpleSequentialChain chain...\u001b[0m\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"Tragedy at Sunset on the Beach follows the story of a young couple, Jack and Annie, who have just started to explore the possibility of a relationship together. After a day spent in the sun and sand, they decide to take a romantic stroll down the beach as the sun sets. \n",
"Tragedy at Sunset on the Beach is a story of a young couple, Jack and Sarah, who are in love and looking forward to their future together. On the night of their anniversary, they decide to take a walk on the beach at sunset. As they are walking, they come across a mysterious figure, who tells them that their love will be tested in the near future. \n",
"\n",
"However, their romantic evening quickly turns tragic when they stumble upon a body lying in the sand. As they approach to investigate, they are shocked to discover that it is Jack's long-lost brother, who has been missing for several years. \n",
"The figure then tells the couple that the sun will soon set, and with it, a tragedy will strike. If Jack and Sarah can stay together and pass the test, they will be granted everlasting love. However, if they fail, their love will be lost forever.\n",
"\n",
"The story follows Jack and Annie as they navigate their way through the tragedy and their newfound relationship. With the help of their friends, family, and the beach's inhabitants, Jack and Annie must come to terms with their deep-seated emotions and the reality of the situation. \n",
"\n",
"Ultimately, the play explores themes of family, love, and loss, as Jack and Annie's story unfolds against the beautiful backdrop of the beach at sunset.\u001b[0m\n",
"The play follows the couple as they struggle to stay together and battle the forces that threaten to tear them apart. Despite the tragedy that awaits them, they remain devoted to one another and fight to keep their love alive. In the end, the couple must decide whether to take a chance on their future together or succumb to the tragedy of the sunset.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"Tragedy at Sunset on the Beach is an emotionally complex tale of family, love, and loss. Told against the beautiful backdrop of a beach at sunset, the story follows Jack and Annie, a young couple just beginning to explore a relationship together. When they stumble upon the body of Jack's long-lost brother on the beach, they must face the reality of the tragedy and come to terms with their deep-seated emotions. \n",
"Tragedy at Sunset on the Beach is an emotionally gripping story of love, hope, and sacrifice. Through the story of Jack and Sarah, the audience is taken on a journey of self-discovery and the power of love to overcome even the greatest of obstacles. \n",
"\n",
"The playwright has crafted a heartfelt and thought-provoking story, one that probes into the depths of the human experience. The cast of characters is well-rounded and fully realized, and the dialogue is natural and emotional. The direction and choreography are top-notch, and the scenic design is breathtaking. \n",
"The play's talented cast brings the characters to life, allowing us to feel the depths of their emotion and the intensity of their struggle. With its compelling story and captivating performances, this play is sure to draw in audiences and leave them on the edge of their seats. \n",
"\n",
"Overall, Tragedy at Sunset on the Beach is a powerful and moving story about the fragility of life and the strength of love. It is sure to tug at your heartstrings and leave you with a newfound appreciation of life's precious moments. Highly recommended.\u001b[0m\n",
"The play's setting of the beach at sunset adds a touch of poignancy and romanticism to the story, while the mysterious figure serves to keep the audience enthralled. Overall, Tragedy at Sunset on the Beach is an engaging and thought-provoking play that is sure to leave audiences feeling inspired and hopeful.\u001b[0m\n",
"\n",
"\u001b[1m> Finished SimpleSequentialChain chain.\u001b[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
@@ -132,7 +149,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"id": "c6649a01",
"metadata": {},
"outputs": [
@@ -142,11 +159,11 @@
"text": [
"\n",
"\n",
"Tragedy at Sunset on the Beach is an emotionally complex tale of family, love, and loss. Told against the beautiful backdrop of a beach at sunset, the story follows Jack and Annie, a young couple just beginning to explore a relationship together. When they stumble upon the body of Jack's long-lost brother on the beach, they must face the reality of the tragedy and come to terms with their deep-seated emotions. \n",
"Tragedy at Sunset on the Beach is an emotionally gripping story of love, hope, and sacrifice. Through the story of Jack and Sarah, the audience is taken on a journey of self-discovery and the power of love to overcome even the greatest of obstacles. \n",
"\n",
"The playwright has crafted a heartfelt and thought-provoking story, one that probes into the depths of the human experience. The cast of characters is well-rounded and fully realized, and the dialogue is natural and emotional. The direction and choreography are top-notch, and the scenic design is breathtaking. \n",
"The play's talented cast brings the characters to life, allowing us to feel the depths of their emotion and the intensity of their struggle. With its compelling story and captivating performances, this play is sure to draw in audiences and leave them on the edge of their seats. \n",
"\n",
"Overall, Tragedy at Sunset on the Beach is a powerful and moving story about the fragility of life and the strength of love. It is sure to tug at your heartstrings and leave you with a newfound appreciation of life's precious moments. Highly recommended.\n"
"The play's setting of the beach at sunset adds a touch of poignancy and romanticism to the story, while the mysterious figure serves to keep the audience enthralled. Overall, Tragedy at Sunset on the Beach is an engaging and thought-provoking play that is sure to leave audiences feeling inspired and hopeful.\n"
]
}
],
@@ -167,7 +184,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "02016a51",
"metadata": {},
"outputs": [],
@@ -185,7 +202,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"id": "8bd38cc2",
"metadata": {},
"outputs": [],
@@ -203,7 +220,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 10,
"id": "524523af",
"metadata": {},
"outputs": [],
@@ -220,7 +237,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"id": "3fd3a7be",
"metadata": {},
"outputs": [
@@ -231,14 +248,8 @@
"\n",
"\n",
"\u001b[1m> Entering new SequentialChain chain...\u001b[0m\n",
"\u001b[1mChain 0\u001b[0m:\n",
"{'synopsis': \" \\n\\nTragedy at Sunset on the Beach is a dark and gripping drama set in Victorian England. The play follows the story of two lovers, Emma and Edward, whose passionate relationship is threatened by the strict rules and regulations of the time.\\n\\nThe two are deeply in love, but Edward is from a wealthy family and Emma is from a lower class background. Despite the obstacles, the two are determined to be together and decide to elope.\\n\\nOn the night of their planned escape, Emma and Edward meet at the beach at sunset to declare their love for one another and begin a new life together. However, their plans are disrupted when Emma's father discovers their plan and appears on the beach with a gun.\\n\\nIn a heartbreaking scene, Emma's father orders Edward to leave, but Edward refuses and fights for their love. In a fit of rage, Emma's father shoots Edward, killing him instantly. \\n\\nThe tragedy of the play lies in the fact that Emma and Edward are denied their chance at a happy ending due to the rigid social conventions of Victorian England. The audience is left with a heavy heart as the play ends with Emma standing alone on the beach, mourning the loss of her beloved.\"}\n",
"\n",
"\u001b[1mChain 1\u001b[0m:\n",
"{'review': \"\\n\\nTragedy at Sunset on the Beach is an emotionally charged production that will leave audiences heartsick. The play follows the ill-fated love story of Emma and Edward, two star-crossed lovers whose passionate relationship is tragically thwarted by Victorian England's societal conventions. The performance is captivating from start to finish, as the audience is taken on an emotional rollercoaster of love, loss, and heartbreak.\\n\\nThe acting is powerful and sincere, and the performances of the two leads are particularly stirring. Emma and Edward are both portrayed with such tenderness and emotion that it's hard not to feel their pain as they fight for their forbidden love. The climactic scene, in which Edward is shot by Emma's father, is especially heartbreaking and will leave audience members on the edge of their seats.\\n\\nOverall, Tragedy at Sunset on the Beach is a powerful and moving work of theatre. It is a tragedy of impossible love, and a vivid reminder of the devastating consequences of social injustice. The play is sure to leave a lasting impression on anyone who experiences it.\"}\n",
"\n",
"\n",
"\u001b[1m> Finished SequentialChain chain.\u001b[0m\n"
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
@@ -246,10 +257,91 @@
"review = overall_chain({\"title\":\"Tragedy at sunset on the beach\", \"era\": \"Victorian England\"})"
]
},
{
"cell_type": "markdown",
"id": "d2fac817",
"metadata": {},
"source": [
"### Memory in Sequential Chains\n",
"Sometimes you may want to pass along some context to use in each step of the chain or in a later part of the chain, but maintaining and chaining together the input/output variables can quickly get messy. Using `SimpleMemory` is a convenient way to do manage this and clean up your chains.\n",
"\n",
"For example, using the previous playwright SequentialChain, lets say you wanted to include some context about date, time and location of the play, and using the generated synopsis and review, create some social media post text. You could add these new context variables as `input_variables`, or we can add a `SimpleMemory` to the chain to manage this context:"
]
},
{
"cell_type": "markdown",
"id": "b2cf3098",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": 12,
"id": "6b7b3a7a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SequentialChain chain...\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'title': 'Tragedy at sunset on the beach',\n",
" 'era': 'Victorian England',\n",
" 'time': 'December 25th, 8pm PST',\n",
" 'location': 'Theater in the Park',\n",
" 'social_post_text': \"\\nSpend your Christmas night with us at Theater in the Park and experience the heartbreaking story of love and loss that is 'A Walk on the Beach'. Set in Victorian England, this romantic tragedy follows the story of Frances and Edward, a young couple whose love is tragically cut short. Don't miss this emotional and thought-provoking production that is sure to leave you in tears. #AWalkOnTheBeach #LoveAndLoss #TheaterInThePark #VictorianEngland\"}"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import SequentialChain\n",
"from langchain.chains.base import SimpleMemory\n",
"\n",
"llm = OpenAI(temperature=.7)\n",
"template = \"\"\"You are a social media manager for a theater company. Given the title of play, the era it is set in, the date,time and location, the synopsis of the play, and the review of the play, it is your job to write a social media post for that play.\n",
"\n",
"Here is some context about the time and location of the play:\n",
"Date and Time: {time}\n",
"Location: {location}\n",
"\n",
"Play Synopsis:\n",
"{synopsis}\n",
"Review from a New York Times play critic of the above play:\n",
"{review}\n",
"\n",
"Social Media Post:\n",
"\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"synopsis\", \"review\", \"time\", \"location\"], template=template)\n",
"social_chain = LLMChain(llm=llm, prompt=prompt_template, output_key=\"social_post_text\")\n",
"\n",
"overall_chain = SequentialChain(\n",
" memory=SimpleMemory(memories={\"time\": \"December 25th, 8pm PST\", \"location\": \"Theater in the Park\"}),\n",
" chains=[synopsis_chain, review_chain, social_chain],\n",
" input_variables=[\"era\", \"title\"],\n",
" # Here we return multiple variables\n",
" output_variables=[\"social_post_text\"],\n",
" verbose=True)\n",
"\n",
"overall_chain({\"title\":\"Tragedy at sunset on the beach\", \"era\": \"Victorian England\"})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6be70d27",
"id": "ee9bc09c",
"metadata": {},
"outputs": [],
"source": []
@@ -271,7 +363,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
}
},
"nbformat": 4,

View File

@@ -104,10 +104,10 @@
"Efficient Data AnnotationC u s t o m i z e d M o d e l T r a i n i n gModel Cust omizationDI A Model HubDI A Pipeline SharingCommunity PlatformLa y out Detection ModelsDocument Images \n",
"T h e C o r e L a y o u t P a r s e r L i b r a r yOCR ModuleSt or age & VisualizationLa y out Data Structur e\n",
"Fig. 1: The overall architecture of LayoutParser . For an input document image,\n",
"the core LayoutParser library provides a set of o\u000b",
"the core LayoutParser library provides a set of o\u000B",
"-the-shelf tools for layout\n",
"detection, OCR, visualization, and storage, backed by a carefully designed layout\n",
"data structure. LayoutParser also supports high level customization via e\u000ecient\n",
"data structure. LayoutParser also supports high level customization via e\u000Ecient\n",
"layout annotation and model training functions. These improve model accuracy\n",
"on the target samples. The community platform enables the easy sharing of DIA\n",
"models and whole digitization pipelines to promote reusability and reproducibility.\n",
@@ -128,10 +128,10 @@
"gure layouts) and\n",
"HJDataset [31](historical Japanese document layouts). A spectrum of models\n",
"trained on these datasets are currently available in the LayoutParser model zoo\n",
"to support di\u000b",
"to support di\u000B",
"erent use cases.\n",
"3 The Core LayoutParser Library\n",
"At the core of LayoutParser is an o\u000b",
"At the core of LayoutParser is an o\u000B",
"-the-shelf toolkit that streamlines DL-\n",
"based document image analysis. Five components support a simple interface\n",
"with comprehensive functionalities: 1) The layout detection models enable using\n",
@@ -266,13 +266,87 @@
"data = loader.load()"
]
},
{
"cell_type": "markdown",
"source": [
"## Using PyMuPDF\n",
"\n",
"This is the fastest of the PDF parsing options, and contains detailed metadata about the PDF and its pages, as well as returns one document per page."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 1,
"outputs": [],
"source": [
"from langchain.document_loaders import PyMuPDFLoader"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 2,
"outputs": [],
"source": [
"loader = PyMuPDFLoader(\"example_data/layout-parser-paper.pdf\")"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 3,
"outputs": [],
"source": [
"data = loader.load()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 4,
"outputs": [
{
"data": {
"text/plain": "Document(page_content='LayoutParser: A Unified Toolkit for Deep\\nLearning Based Document Image Analysis\\nZejiang Shen1 (<28>), Ruochen Zhang2, Melissa Dell3, Benjamin Charles Germain\\nLee4, Jacob Carlson3, and Weining Li5\\n1 Allen Institute for AI\\nshannons@allenai.org\\n2 Brown University\\nruochen zhang@brown.edu\\n3 Harvard University\\n{melissadell,jacob carlson}@fas.harvard.edu\\n4 University of Washington\\nbcgl@cs.washington.edu\\n5 University of Waterloo\\nw422li@uwaterloo.ca\\nAbstract. Recent advances in document image analysis (DIA) have been\\nprimarily driven by the application of neural networks. Ideally, research\\noutcomes could be easily deployed in production and extended for further\\ninvestigation. However, various factors like loosely organized codebases\\nand sophisticated model configurations complicate the easy reuse of im-\\nportant innovations by a wide audience. Though there have been on-going\\nefforts to improve reusability and simplify deep learning (DL) model\\ndevelopment in disciplines like natural language processing and computer\\nvision, none of them are optimized for challenges in the domain of DIA.\\nThis represents a major gap in the existing toolkit, as DIA is central to\\nacademic research across a wide range of disciplines in the social sciences\\nand humanities. This paper introduces LayoutParser, an open-source\\nlibrary for streamlining the usage of DL in DIA research and applica-\\ntions. The core LayoutParser library comes with a set of simple and\\nintuitive interfaces for applying and customizing DL models for layout de-\\ntection, character recognition, and many other document processing tasks.\\nTo promote extensibility, LayoutParser also incorporates a community\\nplatform for sharing both pre-trained models and full document digiti-\\nzation pipelines. We demonstrate that LayoutParser is helpful for both\\nlightweight and large-scale digitization pipelines in real-word use cases.\\nThe library is publicly available at https://layout-parser.github.io.\\nKeywords: Document Image Analysis · Deep Learning · Layout Analysis\\n· Character Recognition · Open Source library · Toolkit.\\n1\\nIntroduction\\nDeep Learning(DL)-based approaches are the state-of-the-art for a wide range of\\ndocument image analysis (DIA) tasks including document image classification [11,\\narXiv:2103.15348v2 [cs.CV] 21 Jun 2021\\n', lookup_str='', metadata={'file_path': 'example_data/layout-parser-paper.pdf', 'page_number': 1, 'total_pages': 16, 'format': 'PDF 1.5', 'title': '', 'author': '', 'subject': '', 'keywords': '', 'creator': 'LaTeX with hyperref', 'producer': 'pdfTeX-1.40.21', 'creationDate': 'D:20210622012710Z', 'modDate': 'D:20210622012710Z', 'trapped': '', 'encryption': None}, lookup_index=0)"
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[0]"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"Additionally, you can pass along any of the options from the [PyMuPDF documentation](https://pymupdf.readthedocs.io/en/latest/app1.html#plain-text/) as keyword arguments in the `load` call, and it will be pass along to the `get_text()` call."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"id": "7301c473",
"metadata": {},
"outputs": [],
"source": []
"source": [],
"metadata": {
"collapsed": false
}
}
],
"metadata": {

View File

@@ -21,7 +21,7 @@
"from langchain.embeddings.cohere import CohereEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch\n",
"from langchain.vectorstores import Chromaoma"
"from langchain.vectorstores import Chroma"
]
},
{
@@ -215,4 +215,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}

View File

@@ -89,6 +89,46 @@
"print(docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "18152965",
"metadata": {},
"source": [
"## Similarity search with score"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "72aaa9c8",
"metadata": {},
"outputs": [],
"source": [
"docs = db.similarity_search_with_score(query)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "d88e958e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \\n\\nWe cannot let this happen. \\n\\nTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0),\n",
" 0.3913410007953644)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0]"
]
},
{
"cell_type": "markdown",
"id": "8061454b",

View File

@@ -1,4 +1,5 @@
{
"_type": "prompt",
"input_variables": ["input", "output"],
"template": "Input: {input}\nOutput: {output}"
}

View File

@@ -3,6 +3,7 @@
"input_variables": ["adjective"],
"prefix": "Write antonyms for the following words.",
"example_prompt": {
"_type": "prompt",
"input_variables": ["input", "output"],
"template": "Input: {input}\nOutput: {output}"
},

View File

@@ -4,6 +4,7 @@ input_variables:
prefix:
Write antonyms for the following words.
example_prompt:
_type: prompt
input_variables:
["input", "output"]
template:

View File

@@ -3,6 +3,7 @@
"input_variables": ["adjective"],
"prefix": "Write antonyms for the following words.",
"example_prompt": {
"_type": "prompt",
"input_variables": ["input", "output"],
"template": "Input: {input}\nOutput: {output}"
},

View File

@@ -4,6 +4,7 @@ input_variables:
prefix:
Write antonyms for the following words.
example_prompt:
_type: prompt
input_variables:
["input", "output"]
template:

View File

@@ -58,6 +58,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"_type: prompt\r\n",
"input_variables:\r\n",
" [\"adjective\", \"content\"]\r\n",
"template: \r\n",
@@ -108,6 +109,7 @@
"output_type": "stream",
"text": [
"{\r\n",
" \"_type\": \"prompt\",\r\n",
" \"input_variables\": [\"adjective\", \"content\"],\r\n",
" \"template\": \"Tell me a {adjective} joke about {content}.\"\r\n",
"}\r\n"
@@ -156,6 +158,7 @@
"output_type": "stream",
"text": [
"{\r\n",
" \"_type\": \"prompt\",\r\n",
" \"input_variables\": [\"adjective\", \"content\"],\r\n",
" \"template_path\": \"simple_template.txt\"\r\n",
"}\r\n"
@@ -279,6 +282,7 @@
"prefix: \r\n",
" Write antonyms for the following words.\r\n",
"example_prompt:\r\n",
" _type: prompt\r\n",
" input_variables:\r\n",
" [\"input\", \"output\"]\r\n",
" template:\r\n",
@@ -346,6 +350,7 @@
"prefix: \r\n",
" Write antonyms for the following words.\r\n",
"example_prompt:\r\n",
" _type: prompt\r\n",
" input_variables:\r\n",
" [\"input\", \"output\"]\r\n",
" template:\r\n",
@@ -413,6 +418,7 @@
" \"input_variables\": [\"adjective\"],\r\n",
" \"prefix\": \"Write antonyms for the following words.\",\r\n",
" \"example_prompt\": {\r\n",
" \"_type\": \"prompt\",\r\n",
" \"input_variables\": [\"input\", \"output\"],\r\n",
" \"template\": \"Input: {input}\\nOutput: {output}\"\r\n",
" },\r\n",
@@ -478,6 +484,7 @@
" \"input_variables\": [\"adjective\"],\r\n",
" \"prefix\": \"Write antonyms for the following words.\",\r\n",
" \"example_prompt\": {\r\n",
" \"_type\": \"prompt\",\r\n",
" \"input_variables\": [\"input\", \"output\"],\r\n",
" \"template\": \"Input: {input}\\nOutput: {output}\"\r\n",
" },\r\n",
@@ -542,6 +549,7 @@
"output_type": "stream",
"text": [
"{\r\n",
" \"_type\": \"prompt\",\r\n",
" \"input_variables\": [\"input\", \"output\"],\r\n",
" \"template\": \"Input: {input}\\nOutput: {output}\" \r\n",
"}\r\n"
@@ -622,7 +630,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.11.2"
},
"vscode": {
"interpreter": {

View File

@@ -1,4 +1,5 @@
{
"_type": "prompt",
"input_variables": ["adjective", "content"],
"template": "Tell me a {adjective} joke about {content}."
}

View File

@@ -1,3 +1,4 @@
_type: prompt
input_variables:
["adjective", "content"]
template:

View File

@@ -1,4 +1,5 @@
{
"_type": "prompt",
"input_variables": ["adjective", "content"],
"template_path": "simple_template.txt"
}

View File

@@ -5,9 +5,9 @@
"id": "3cadcf88",
"metadata": {},
"source": [
"# Using HuggingFace Datasets\n",
"# Using Hugging Face Datasets\n",
"\n",
"This example shows how to use HuggingFace datasets to evaluate models. Specifically, we show how to load examples to evaluate models on from HuggingFace's dataset package."
"This example shows how to use Hugging Face datasets to evaluate models. Specifically, we show how to load examples to evaluate models on from Hugging Face's dataset package."
]
},
{
@@ -60,7 +60,7 @@
"source": [
"## Examples\n",
"\n",
"Now we load a dataset from HuggingFace, and then convert it to a list of dictionaries for easier usage."
"Now we load a dataset from Hugging Face, and then convert it to a list of dictionaries for easier usage."
]
},
{

View File

@@ -17,12 +17,12 @@ def initialize_agent(
agent_kwargs: Optional[dict] = None,
**kwargs: Any,
) -> AgentExecutor:
"""Load agent given tools and LLM.
"""Load an agent executor given tools and LLM.
Args:
tools: List of tools this agent has access to.
llm: Language model to use as the agent.
agent: The agent to use. Valid options are:
agent: A string that specifies the agent type to use. Valid options are:
`zero-shot-react-description`
`react-docstore`
`self-ask-with-search`
@@ -32,10 +32,11 @@ def initialize_agent(
callback_manager: CallbackManager to use. Global callback manager is used if
not provided. Defaults to None.
agent_path: Path to serialized agent to use.
**kwargs: Additional key word arguments to pass to the agent.
agent_kwargs: Additional key word arguments to pass to the underlying agent
**kwargs: Additional key word arguments passed to the agent executor
Returns:
An agent.
An agent executor
"""
if agent is None and agent_path is None:
agent = "zero-shot-react-description"

View File

@@ -28,7 +28,10 @@ class Memory(BaseModel, ABC):
@abstractmethod
def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, str]:
"""Return key-value pairs given the text input to the chain."""
"""Return key-value pairs given the text input to the chain.
If None, return all memories
"""
@abstractmethod
def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
@@ -39,6 +42,29 @@ class Memory(BaseModel, ABC):
"""Clear memory contents."""
class SimpleMemory(Memory, BaseModel):
"""Simple memory for storing context or other bits of information that shouldn't
ever change between prompts.
"""
memories: Dict[str, Any] = dict()
@property
def memory_variables(self) -> List[str]:
return list(self.memories.keys())
def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, str]:
return self.memories
def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
"""Nothing should be saved or changed, my memory is set in stone."""
pass
def clear(self) -> None:
"""Nothing to clear, got a memory like a vault."""
pass
def _get_verbosity() -> bool:
return langchain.verbose

View File

@@ -276,6 +276,12 @@ class ConversationEntityMemory(Memory, BaseModel):
}
def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
if inputs is None:
raise ValueError(
"Inputs must be provided to save context from "
"ConversationEntityMemory."
)
"""Save context from this conversation to buffer."""
if self.input_key is None:
prompt_input_key = _get_prompt_input_key(inputs, self.memory_variables)

View File

@@ -131,10 +131,12 @@ class LLMChain(Chain, BaseModel):
]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
return self.apply([inputs])[0]
known_values = self.prep_inputs(inputs.copy())
return self.apply([known_values])[0]
async def _acall(self, inputs: Dict[str, Any]) -> Dict[str, str]:
return (await self.aapply([inputs]))[0]
known_values = self.prep_inputs(inputs.copy())
return (await self.aapply([known_values]))[0]
def predict(self, **kwargs: Any) -> str:
"""Format prompt with kwargs and pass to LLM.

View File

@@ -62,6 +62,8 @@ class LLMMathChain(Chain, BaseModel):
answer = "Answer: " + output
elif t.startswith("Answer:"):
answer = t
elif "Answer:" in t:
answer = "Answer: " + t.split("Answer:")[-1]
else:
raise ValueError(f"unknown format from LLM: {t}")
return {self.output_key: answer}

View File

@@ -1,5 +1,4 @@
"""Chain pipeline where the outputs of one step feed directly into next."""
from typing import Dict, List
from pydantic import BaseModel, Extra, root_validator
@@ -9,7 +8,7 @@ from langchain.input import get_color_mapping
class SequentialChain(Chain, BaseModel):
"""Chain where the outputs of one step feed directly into next."""
"""Chain where the outputs of one chain feed directly into next."""
chains: List[Chain]
input_variables: List[str]
@@ -24,7 +23,7 @@ class SequentialChain(Chain, BaseModel):
@property
def input_keys(self) -> List[str]:
"""Expect input key.
"""Return expected input keys to the chain.
:meta private:
"""
@@ -43,7 +42,20 @@ class SequentialChain(Chain, BaseModel):
"""Validate that the correct inputs exist for all chains."""
chains = values["chains"]
input_variables = values["input_variables"]
known_variables = set(input_variables)
memory_keys = list()
if "memory" in values and values["memory"] is not None:
"""Validate that prompt input variables are consistent."""
memory_keys = values["memory"].memory_variables
if any(input_variables) in memory_keys:
overlapping_keys = input_variables & memory_keys
raise ValueError(
f"The the input key(s) {''.join(overlapping_keys)} are found "
f"in the Memory keys ({memory_keys}) - please use input and "
f"memory keys that don't overlap."
)
known_variables = set(input_variables + memory_keys)
for chain in chains:
missing_vars = set(chain.input_keys).difference(known_variables)
if missing_vars:
@@ -56,6 +68,7 @@ class SequentialChain(Chain, BaseModel):
raise ValueError(
f"Chain returned keys that already exist: {overlapping_keys}"
)
known_variables |= set(chain.output_keys)
if "output_variables" not in values:
@@ -70,6 +83,7 @@ class SequentialChain(Chain, BaseModel):
raise ValueError(
f"Expected output variables that were not found: {missing_vars}."
)
return values
def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:

View File

@@ -24,7 +24,11 @@ from langchain.document_loaders.notion import NotionDirectoryLoader
from langchain.document_loaders.obsidian import ObsidianLoader
from langchain.document_loaders.online_pdf import OnlinePDFLoader
from langchain.document_loaders.paged_pdf import PagedPDFSplitter
from langchain.document_loaders.pdf import PDFMinerLoader, UnstructuredPDFLoader
from langchain.document_loaders.pdf import (
PDFMinerLoader,
PyMuPDFLoader,
UnstructuredPDFLoader,
)
from langchain.document_loaders.powerpoint import UnstructuredPowerPointLoader
from langchain.document_loaders.readthedocs import ReadTheDocsLoader
from langchain.document_loaders.roam import RoamLoader
@@ -78,6 +82,7 @@ __all__ = [
"AirbyteJSONLoader",
"OnlinePDFLoader",
"PDFMinerLoader",
"PyMuPDFLoader",
"TelegramChatLoader",
"SRTLoader",
"FacebookChatLoader",

View File

@@ -1,5 +1,5 @@
"""Loader that loads PDF files."""
from typing import List
from typing import Any, List, Optional
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
@@ -27,6 +27,7 @@ class PDFMinerLoader(BaseLoader):
"pdfminer package not found, please install it with "
"`pip install pdfminer.six`"
)
self.file_path = file_path
def load(self) -> List[Document]:
@@ -36,3 +37,37 @@ class PDFMinerLoader(BaseLoader):
text = extract_text(self.file_path)
metadata = {"source": self.file_path}
return [Document(page_content=text, metadata=metadata)]
class PyMuPDFLoader(BaseLoader):
"""Loader that uses PyMuPDF to load PDF files."""
def __init__(self, file_path: str):
"""Initialize with file path."""
try:
import fitz # noqa:F401
except ImportError:
raise ValueError(
"PyMuPDF package not found, please install it with "
"`pip install pymupdf`"
)
self.file_path = file_path
def load(self, **kwargs: Optional[Any]) -> List[Document]:
"""Load file."""
import fitz
doc = fitz.open(self.file_path) # open document
return [
Document(
page_content=page.get_text(**kwargs).encode("utf-8"),
metadata={
"file_path": self.file_path,
"page_number": page.number + 1,
"total_pages": len(doc),
}
| doc.metadata,
)
for page in doc
]

View File

@@ -64,7 +64,7 @@ class CohereEmbeddings(BaseModel, Embeddings):
embeddings = self.client.embed(
model=self.model, texts=texts, truncate=self.truncate
).embeddings
return embeddings
return [list(map(float, e)) for e in embeddings]
def embed_query(self, text: str) -> List[float]:
"""Call out to Cohere's embedding endpoint.
@@ -78,4 +78,4 @@ class CohereEmbeddings(BaseModel, Embeddings):
embedding = self.client.embed(
model=self.model, texts=[text], truncate=self.truncate
).embeddings[0]
return embedding
return list(map(float, embedding))

View File

@@ -161,6 +161,12 @@ class BaseOpenAI(BaseLLM, BaseModel):
streaming: bool = False
"""Whether to stream the results or not."""
def __new__(cls, **data: Any) -> Union[OpenAIChat, BaseOpenAI]: # type: ignore
"""Initialize the OpenAI object."""
if data.get("model_name", "").startswith("gpt-3.5-turbo"):
return OpenAIChat(**data)
return super().__new__(cls)
class Config:
"""Configuration for this pydantic object."""

View File

@@ -1,7 +1,7 @@
"""Example selector that selects examples based on SemanticSimilarity."""
from __future__ import annotations
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Type
from pydantic import BaseModel, Extra
@@ -65,7 +65,7 @@ class SemanticSimilarityExampleSelector(BaseExampleSelector, BaseModel):
cls,
examples: List[dict],
embeddings: Embeddings,
vectorstore_cls: VectorStore,
vectorstore_cls: Type[VectorStore],
k: int = 4,
input_keys: Optional[List[str]] = None,
**vectorstore_cls_kwargs: Any,
@@ -131,7 +131,7 @@ class MaxMarginalRelevanceExampleSelector(SemanticSimilarityExampleSelector, Bas
cls,
examples: List[dict],
embeddings: Embeddings,
vectorstore_cls: VectorStore,
vectorstore_cls: Type[VectorStore],
k: int = 4,
input_keys: Optional[List[str]] = None,
fetch_k: int = 20,

View File

@@ -3,7 +3,7 @@ from __future__ import annotations
from typing import Any, Iterable, List, Optional
from sqlalchemy import MetaData, create_engine, inspect, select
from sqlalchemy import MetaData, create_engine, inspect, select, text
from sqlalchemy.engine import Engine
from sqlalchemy.exc import ProgrammingError, SQLAlchemyError
from sqlalchemy.schema import CreateTable
@@ -177,7 +177,7 @@ class SQLDatabase:
with self._engine.begin() as connection:
if self._schema is not None:
connection.exec_driver_sql(f"SET search_path TO {self._schema}")
cursor = connection.exec_driver_sql(command)
cursor = connection.execute(text(command))
if cursor.returns_rows:
if fetch == "all":
result = cursor.fetchall()

View File

@@ -1,7 +1,13 @@
"""Chain that calls SearxNG meta search API.
SearxNG is a privacy-friendly free metasearch engine that aggregates results from
multiple search engines and databases.
`multiple search engines
<https://docs.searxng.org/admin/engines/configured_engines.html>`_ and databases and
supports the `OpenSearch
<https://github.com/dewitt/opensearch/blob/master/opensearch-1-1-draft-6.md>`_
specification.
More details on the installation instructions `here <../../ecosystem/searx.html>`_.
For the search API refer to https://docs.searxng.org/dev/search_api.html
@@ -176,7 +182,7 @@ class SearxSearchWrapper(BaseModel):
.. code-block:: python
from langchain.utilities import SearxSearchWrapper
searx = SearxSearchWrapper(searx_host="https://searx.example.com")
searx = SearxSearchWrapper(searx_host="http://localhost:8888")
Example with SSL disabled:
.. code-block:: python
@@ -184,7 +190,7 @@ class SearxSearchWrapper(BaseModel):
from langchain.utilities import SearxSearchWrapper
# note the unsecure parameter is not needed if you pass the url scheme as
# http
searx = SearxSearchWrapper(searx_host="http://searx.example.com",
searx = SearxSearchWrapper(searx_host="http://localhost:8888",
unsecure=True)

View File

@@ -3,7 +3,7 @@ from __future__ import annotations
import logging
import uuid
from typing import Any, Dict, Iterable, List, Optional
from typing import Any, Dict, Iterable, List, Optional, Tuple
from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
@@ -116,6 +116,27 @@ class Chroma(VectorStore):
Returns:
List[Document]: List of documents most similar to the query text.
"""
docs_and_scores = self.similarity_search_with_score(query, k)
return [doc for doc, _ in docs_and_scores]
def similarity_search_with_score(
self,
query: str,
k: int = 4,
filter: Optional[Dict[str, str]] = None,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Run similarity search with Chroma with distance.
Args:
query (str): Query text to search for.
k (int): Number of results to return. Defaults to 4.
filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
Returns:
List[Tuple[Document, float]]: List of documents most similar to the query
text with distance in float.
"""
if self._embedding_function is None:
results = self._collection.query(
query_texts=[query], n_results=k, where=filter
@@ -129,8 +150,12 @@ class Chroma(VectorStore):
docs = [
# TODO: Chroma can do batch querying,
# we shouldn't hard code to the 1st result
Document(page_content=result[0], metadata=result[1])
for result in zip(results["documents"][0], results["metadatas"][0])
(Document(page_content=result[0], metadata=result[1]), result[2])
for result in zip(
results["documents"][0],
results["metadatas"][0],
results["distances"][0],
)
]
return docs

View File

@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain"
version = "0.0.99"
version = "0.0.101"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"
@@ -96,7 +96,7 @@ playwright = "^1.28.0"
[tool.poetry.extras]
llms = ["anthropic", "cohere", "openai", "nlpcloud", "huggingface_hub", "manifest-ml", "torch", "transformers"]
all = ["anthropic", "cohere", "openai", "nlpcloud", "huggingface_hub", "manifest-ml", "elasticsearch", "opensearch-py", "google-search-results", "faiss-cpu", "sentence_transformers", "transformers", "spacy", "nltk", "wikipedia", "beautifulsoup4", "tiktoken", "torch", "jinja2", "pinecone-client", "weaviate-client", "redis", "google-api-python-client", "wolframalpha", "qdrant-client", "tensorflow-text", "pypdf", "networkx", "nomic"]
all = ["anthropic", "cohere", "openai", "nlpcloud", "huggingface_hub", "manifest-ml", "elasticsearch", "opensearch-py", "google-search-results", "faiss-cpu", "sentence_transformers", "transformers", "spacy", "nltk", "wikipedia", "beautifulsoup4", "tiktoken", "torch", "jinja2", "pinecone-client", "weaviate-client", "redis", "google-api-python-client", "wolframalpha", "qdrant-client", "tensorflow-text", "pypdf", "networkx", "nomic", "aleph-alpha-client", "deeplake", ]
[tool.ruff]
select = [

View File

@@ -0,0 +1,46 @@
from pathlib import Path
from langchain.document_loaders import (
PDFMinerLoader,
PyMuPDFLoader,
UnstructuredPDFLoader,
)
def test_unstructured_pdf_loader() -> None:
"""Test unstructured loader."""
file_path = Path(__file__).parent.parent / "examples/hello.pdf"
loader = UnstructuredPDFLoader(str(file_path))
docs = loader.load()
assert len(docs) == 1
def test_pdfminer_loader() -> None:
"""Test PDFMiner loader."""
file_path = Path(__file__).parent.parent / "examples/hello.pdf"
loader = PDFMinerLoader(str(file_path))
docs = loader.load()
assert len(docs) == 1
file_path = Path(__file__).parent.parent / "examples/layout-parser-paper.pdf"
loader = PDFMinerLoader(str(file_path))
docs = loader.load()
assert len(docs) == 1
def test_pymupdf_loader() -> None:
"""Test PyMuPDF loader."""
file_path = Path(__file__).parent.parent / "examples/hello.pdf"
loader = PyMuPDFLoader(str(file_path))
docs = loader.load()
assert len(docs) == 1
file_path = Path(__file__).parent.parent / "examples/layout-parser-paper.pdf"
loader = PyMuPDFLoader(str(file_path))
docs = loader.load()
assert len(docs) == 16

View File

@@ -144,6 +144,13 @@ async def test_openai_async_streaming_callback() -> None:
assert isinstance(result, LLMResult)
def test_openai_chat_wrong_class() -> None:
"""Test OpenAIChat with wrong class still works."""
llm = OpenAI(model_name="gpt-3.5-turbo")
output = llm("Say foo:")
assert isinstance(output, str)
def test_openai_chat() -> None:
"""Test OpenAIChat."""
llm = OpenAIChat(max_tokens=10)

View File

@@ -28,6 +28,20 @@ def test_chroma_with_metadatas() -> None:
assert output == [Document(page_content="foo", metadata={"page": "0"})]
def test_chroma_with_metadatas_with_scores() -> None:
"""Test end to end construction and search."""
texts = ["foo", "bar", "baz"]
metadatas = [{"page": str(i)} for i in range(len(texts))]
docsearch = Chroma.from_texts(
collection_name="test_collection",
texts=texts,
embedding=FakeEmbeddings(),
metadatas=metadatas,
)
output = docsearch.similarity_search_with_score("foo", k=1)
assert output == [(Document(page_content="foo", metadata={"page": "0"}), 1.0)]
def test_chroma_with_persistence() -> None:
"""Test end to end construction and search, with persistence."""
chroma_persist_dir = "./tests/persist_dir"

View File

@@ -1,5 +1,5 @@
"""Test logic on base chain class."""
from typing import Any, Dict, List
from typing import Any, Dict, List, Optional
import pytest
from pydantic import BaseModel
@@ -17,7 +17,9 @@ class FakeMemory(Memory, BaseModel):
"""Return baz variable."""
return ["baz"]
def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, str]:
def load_memory_variables(
self, inputs: Optional[Dict[str, Any]] = None
) -> Dict[str, str]:
"""Return baz variable."""
return {"baz": "foo"}

View File

@@ -0,0 +1,11 @@
from langchain.chains.base import SimpleMemory
def test_simple_memory() -> None:
"""Test SimpleMemory."""
memory = SimpleMemory(memories={"baz": "foo"})
output = memory.load_memory_variables({})
assert output == {"baz": "foo"}
assert ["baz"] == memory.memory_variables

View File

@@ -4,7 +4,7 @@ from typing import Dict, List
import pytest
from pydantic import BaseModel
from langchain.chains.base import Chain
from langchain.chains.base import Chain, SimpleMemory
from langchain.chains.sequential import SequentialChain, SimpleSequentialChain
@@ -56,6 +56,19 @@ def test_sequential_usage_multiple_inputs() -> None:
assert output == expected_output
def test_sequential_usage_memory() -> None:
"""Test sequential usage with memory."""
memory = SimpleMemory(memories={"zab": "rab"})
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar"])
chain_2 = FakeChain(input_variables=["bar"], output_variables=["baz"])
chain = SequentialChain(
memory=memory, chains=[chain_1, chain_2], input_variables=["foo"]
)
output = chain({"foo": "123"})
expected_output = {"baz": "123foofoo", "foo": "123", "zab": "rab"}
assert output == expected_output
def test_sequential_usage_multiple_outputs() -> None:
"""Test sequential usage on multiple output chains."""
chain_1 = FakeChain(input_variables=["foo"], output_variables=["bar", "test"])