Improvements to llm/deepinfra (#10846)

- replace the `requests` package with `langchain.requests`
- add `_acall` support
- add `_stream` and `_astream` (see the usage sketch below)
- freshen up the documentation a bit
- update vendor doc
Author: Iskren Ivov Chernev
Date: 2023-10-24 19:54:23 +03:00
Committed by: GitHub
Parent: f09f82541b
Commit: d5d7ba582a
4 changed files with 252 additions and 78 deletions
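
The new async and streaming hooks surface through LangChain's standard LLM interface. A minimal sketch of how a user would exercise them, assuming the public `apredict`/`astream` wrappers on the base LLM class and a `DEEPINFRA_API_TOKEN` in the environment (the model id mirrors the updated notebook):

```python
import asyncio

from langchain.llms import DeepInfra

# model id taken from the updated notebook; any DeepInfra text-generation model works
llm = DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf")

async def main() -> None:
    # _acall backs the async single-shot path exposed as apredict
    print(await llm.apredict("Who let the dogs out?"))
    # _astream backs async streaming; chunks arrive as they are generated
    async for chunk in llm.astream("Who let the dogs out?"):
        print(chunk, end="", flush=True)

asyncio.run(main())
```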


@@ -6,30 +6,7 @@
"source": [
"# DeepInfra\n",
"\n",
"`DeepInfra` provides [several LLMs](https://deepinfra.com/models).\n",
"\n",
"This notebook goes over how to use Langchain with [DeepInfra](https://deepinfra.com)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"from langchain.llms import DeepInfra\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain"
"[DeepInfra](https://deepinfra.com/?utm_source=langchain) is a serverless inference as a service that provides access to a [variety of LLMs](https://deepinfra.com/models?utm_source=langchain) and [embeddings models](https://deepinfra.com/models?type=embeddings&utm_source=langchain). This notebook goes over how to use LangChain with DeepInfra for language models."
]
},
{
@@ -45,7 +22,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 6,
"metadata": {
"tags": []
},
@@ -68,12 +45,14 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"DEEPINFRA_API_TOKEN\"] = DEEPINFRA_API_TOKEN"
]
},
@@ -87,11 +66,13 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"llm = DeepInfra(model_id=\"databricks/dolly-v2-12b\")\n",
"from langchain.llms import DeepInfra\n",
"\n",
"llm = DeepInfra(model_id=\"meta-llama/Llama-2-70b-chat-hf\")\n",
"llm.model_kwargs = {\n",
" \"temperature\": 0.7,\n",
" \"repetition_penalty\": 1.2,\n",
@@ -100,6 +81,51 @@
"}"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'This is a question that has puzzled many people'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# run inferences directly via wrapper\n",
"llm(\"Who let the dogs out?\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Will\n",
" Smith\n",
"."
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# run streaming inference\n",
"for chunk in llm.stream(\"Who let the dogs out?\"):\n",
" print(chunk)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -110,10 +136,12 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
@@ -130,10 +158,12 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import LLMChain\n",
"\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
@@ -147,16 +177,16 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Penguins live in the Southern hemisphere.\\nThe North pole is located in the Northern hemisphere.\\nSo, first you need to turn the penguin South.\\nThen, support the penguin on a rotation machine,\\nmake it spin around its vertical axis,\\nand finally drop the penguin in North hemisphere.\\nNow, you have a penguin in the north pole!\\n\\nStill didn't understand?\\nWell, you're a failure as a teacher.\""
"\"Penguins are found in Antarctica and the surrounding islands, which are located at the southernmost tip of the planet. The North Pole is located at the northernmost tip of the planet, and it would be a long journey for penguins to get there. In fact, penguins don't have the ability to fly or migrate over such long distances. So, no, penguins cannot reach the North Pole. \""
]
},
"execution_count": 8,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
@@ -166,6 +196,13 @@
"\n",
"llm_chain.run(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -184,7 +221,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.11.5"
},
"vscode": {
"interpreter": {


@@ -10,16 +10,27 @@ It is broken into two parts: installation and setup, and then references to spec
## Available Models
DeepInfra provides a range of Open Source LLMs ready for deployment.
You can list supported models [here](https://deepinfra.com/models?type=text-generation).
You can list supported models for
[text-generation](https://deepinfra.com/models?type=text-generation) and
[embeddings](https://deepinfra.com/models?type=embeddings).
google/flan\* models can be viewed [here](https://deepinfra.com/models?type=text2text-generation).
You can view a list of request and response parameters [here](https://deepinfra.com/databricks/dolly-v2-12b#API)
You can view a [list of request and response parameters](https://deepinfra.com/meta-llama/Llama-2-70b-chat-hf/api).
## Wrappers
### LLM
There exists a DeepInfra LLM wrapper, which you can access with
```python
from langchain.llms import DeepInfra
```
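
A minimal usage sketch, assuming `DEEPINFRA_API_TOKEN` is set and reusing the notebook's model id (any model from the text-generation list above should work):

```python
from langchain.llms import DeepInfra

# assumes DEEPINFRA_API_TOKEN is exported in the environment
llm = DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf")
print(llm("Who let the dogs out?"))
```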
### Embeddings
There is also a DeepInfra Embeddings wrapper, which you can access with
```python
from langchain.embeddings import DeepInfraEmbeddings
```
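
Likewise, a minimal embeddings sketch; the model id shown is illustrative (see the embeddings model list linked above):

```python
from langchain.embeddings import DeepInfraEmbeddings

# assumes DEEPINFRA_API_TOKEN is exported in the environment
embeddings = DeepInfraEmbeddings(model_id="sentence-transformers/clip-ViT-B-32")
vector = embeddings.embed_query("Who let the dogs out?")
print(len(vector))  # dimensionality of the returned embedding
```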