Mirror of https://github.com/hwchase17/langchain.git, synced 2025-09-02 19:47:13 +00:00
Adds DeepSparse as an LLM (#9184)
Adds [DeepSparse](https://github.com/neuralmagic/deepsparse) as an LLM backend. DeepSparse supports running various open-source sparsified models hosted on [SparseZoo](https://sparsezoo.neuralmagic.com/) for performance gains on CPUs.

Twitter handles: @mgoin_ @neuralmagic

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
docs/extras/integrations/llms/deepsparse.ipynb (new file, 78 lines)
@@ -0,0 +1,78 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "15d7ce70-8879-42a0-86d9-a3d604a3ec83",
   "metadata": {},
   "source": [
    "# DeepSparse\n",
    "\n",
    "This page covers how to use the [DeepSparse](https://github.com/neuralmagic/deepsparse) inference runtime within LangChain.\n",
    "It is broken into two parts: installation and setup, and then examples of DeepSparse usage.\n",
    "\n",
    "## Installation and Setup\n",
    "\n",
    "- Install the Python package with `pip install deepsparse`\n",
    "- Choose a [SparseZoo model](https://sparsezoo.neuralmagic.com/?useCase=text_generation) or export a supported model to ONNX [using Optimum](https://github.com/neuralmagic/notebooks/blob/main/notebooks/opt-text-generation-deepsparse-quickstart/OPT_Text_Generation_DeepSparse_Quickstart.ipynb)\n",
    "\n",
    "\n",
    "There exists a DeepSparse LLM wrapper, which provides a unified interface for all models:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79d24d37-737a-428c-b6c5-84c1633070d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import DeepSparse\n",
    "\n",
    "llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none')\n",
    "\n",
    "print(llm('def fib():'))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea7ea674-d6b0-49d9-9c2b-014032973be6",
   "metadata": {},
   "source": [
    "Additional parameters can be passed using the `config` parameter:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ff61b845-41e6-4457-8625-6e21a11bfe7c",
   "metadata": {},
   "outputs": [],
   "source": [
    "config = {'max_generated_tokens': 256}\n",
    "\n",
    "llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none', config=config)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
docs/extras/integrations/providers/deepsparse.mdx (new file, 35 lines)
@@ -0,0 +1,35 @@
# DeepSparse

This page covers how to use the [DeepSparse](https://github.com/neuralmagic/deepsparse) inference runtime within LangChain.
It is broken into two parts: installation and setup, and then examples of DeepSparse usage.

## Installation and Setup

- Install the Python package with `pip install deepsparse`
- Choose a [SparseZoo model](https://sparsezoo.neuralmagic.com/?useCase=text_generation) or export a supported model to ONNX [using Optimum](https://github.com/neuralmagic/notebooks/blob/main/notebooks/opt-text-generation-deepsparse-quickstart/OPT_Text_Generation_DeepSparse_Quickstart.ipynb); a minimal export sketch follows this list
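For the export path, something like the following can work. This is a minimal sketch, assuming Optimum's ONNX Runtime integration; the model id `facebook/opt-125m` and the output directory are illustrative, not required:

```python
# A minimal export sketch, assuming Optimum's ONNX Runtime integration.
# The model id "facebook/opt-125m" is illustrative, not required.
from optimum.onnxruntime import ORTModelForCausalLM

# export=True converts the Hugging Face checkpoint to ONNX at load time.
model = ORTModelForCausalLM.from_pretrained("facebook/opt-125m", export=True)

# Save the exported ONNX model to a local directory.
model.save_pretrained("opt-125m-onnx")
```

The linked notebook walks through the OPT export end to end; consult it for the DeepSparse-specific details.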
## Wrappers

### LLM

There exists a DeepSparse LLM wrapper, which you can access with:

```python
from langchain.llms import DeepSparse
```

It provides a unified interface for all models:

```python
llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none')

print(llm('def fib():'))
```

Additional parameters can be passed using the `config` parameter:

```python
config = {'max_generated_tokens': 256}

llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none', config=config)
```
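Because the wrapper exposes LangChain's standard LLM interface, it composes with the rest of the framework. A minimal sketch using the same SparseZoo model in an `LLMChain` (the prompt wording is illustrative, not part of the DeepSparse integration):

```python
# A minimal chain sketch; the prompt text is illustrative and not part of
# the DeepSparse integration itself.
from langchain import LLMChain, PromptTemplate
from langchain.llms import DeepSparse

llm = DeepSparse(
    model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none'
)

# Template a code-completion prompt with one input variable.
prompt = PromptTemplate(
    input_variables=["signature"],
    template="Complete this Python function:\n{signature}",
)

chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(signature="def fib():"))
```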