Compare commits

...

19 Commits

Author | SHA1 | Message | Date
Harrison Chase | ce73db2fec | cr | 2023-12-04 16:37:01 -08:00
Harrison Chase | 95b415daf3 | cr | 2023-12-04 13:51:39 -08:00
Harrison Chase | 9a8ebd3616 | cr | 2023-12-04 12:57:10 -08:00
Harrison Chase | 2037386156 | cr | 2023-12-04 12:48:56 -08:00
Harrison Chase | 358d11bdab | Merge branch 'neuralmagic-master' into harrison/deepsparse | 2023-12-04 12:48:47 -08:00
Harrison Chase | 17ead06c2e | cr | 2023-12-04 12:48:31 -08:00
Derrick Mwiti | 072a6a0e79 | Merge branch 'master' into master | 2023-12-04 14:00:39 +03:00
Derrick Mwiti | 3efdd03ff7 | Merge branch 'master' into master | 2023-12-01 08:06:44 +03:00
Derrick Mwiti | 1a60ad109e | Update pyproject.toml | 2023-11-30 07:22:44 +03:00
Derrick Mwiti | b73883d12a | Merge branch 'master' into master | 2023-11-30 07:11:41 +03:00
Derrick Mwiti | 6c07ae4a54 | Merge branch 'master' into master | 2023-11-29 08:18:39 +03:00
Derrick Mwiti | 224d7a47d6 | Merge branch 'master' into master | 2023-11-28 20:47:44 +03:00
Derrick Mwiti | e76b5e5fd9 | Merge branch 'master' into master | 2023-11-27 22:11:09 +03:00
Derrick Mwiti | e632e44bb0 | Merge branch 'master' into master | 2023-11-27 18:01:55 +03:00
Derrick Mwiti | 92d0a95e33 | Merge branch 'master' into master | 2023-11-27 08:02:52 +03:00
Derrick Mwiti | aa2b827b29 | Merge branch 'master' into master | 2023-11-24 13:33:17 +03:00
Derrick Mwiti | bc4c9a51f7 | Merge branch 'master' into master | 2023-11-23 09:40:06 +03:00
Derrick Mwiti | e02ea5832c | add DeepSparse package | 2023-11-22 11:43:19 +03:00
Derrick Mwiti | ceb6976920 | Add DeepSparse Docs | 2023-11-22 11:17:53 +03:00
5 changed files with 381 additions and 51 deletions

View File

@@ -5,31 +5,117 @@ It is broken into two parts: installation and setup, and then examples of DeepSp
## Installation and Setup
- Install the Python packages with `pip install deepsparse-nightly[llm] langchain`
- Choose a [SparseZoo model](https://sparsezoo.neuralmagic.com/?useCase=text_generation) or export a supported model to ONNX [using Optimum](https://github.com/neuralmagic/notebooks/blob/main/notebooks/opt-text-generation-deepsparse-quickstart/OPT_Text_Generation_DeepSparse_Quickstart.ipynb)
- Models hosted on HuggingFace are also supported by prepending `"hf:"` to the model id, such as [`"hf:mgoin/TinyStories-33M-quant-deepsparse"`](https://huggingface.co/mgoin/TinyStories-33M-quant-deepsparse)
## Using DeepSparse With LangChain
### LLM
There is a DeepSparse LLM wrapper, which you can access with:
```python
from langchain.llms import DeepSparse
```
It provides a simple, unified interface for all models:
```python
from langchain.llms import DeepSparse

llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none')
print(llm('def fib():'))
"""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def fib2(n):
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a + b

def primes():
    yield 2
    it = fib()
    while True:
        try:
            yield next(it)
        except StopIteration:
            return
"""
```
## Streaming
The DeepSparse LangChain wrapper also supports per-token output streaming:
```python
from langchain.llms import DeepSparse

llm = DeepSparse(
    model="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
    streaming=True
)
for chunk in llm.stream("Tell me a joke", stop=["'", "\n"]):
    print(chunk, end='', flush=True)
```
## Using Instruction Fine-tuned Models With DeepSparse
Here's an example of how to prompt an instruction fine-tuned model using DeepSparse and the MPT-Instruct model:
```python
prompt = """
Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: what is quantization? ### Response:
"""
llm = DeepSparse(model='zoo:mpt-7b-dolly_mpt_pretrain-pruned50_quantized')
print(llm(prompt))
"""
In physics, the term "quantization" refers to the process of transforming a continuous variable into a set of discrete values. In the context of quantum mechanics, this process is used to describe the restriction of the degrees of freedom of a system to a set of discrete values. In other words, it is the process of transforming the continuous spectrum of a physical quantity into a set of discrete, or "quantized", values.
"""
```
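The instruction template above can be factored into a small helper so that other instructions reuse the same wrapping (a sketch; `build_instruction_prompt` is not part of LangChain or DeepSparse, just a convenience function, and the template text is the one shown in the example above):

```python
def build_instruction_prompt(instruction: str) -> str:
    # Wrap a bare instruction in the same MPT-Instruct-style template
    # used in the example above.
    return (
        "\nBelow is an instruction that describes a task. "
        "Write a response that appropriately completes the request. "
        f"### Instruction: {instruction} ### Response:\n"
    )

prompt = build_instruction_prompt("what is quantization?")
```

The resulting `prompt` string can then be passed to the wrapper exactly as in the example above.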
You can also do all the other things you are used to doing in LangChain, such as using `PromptTemplate`s and parsing outputs:
```python
from langchain.llms import DeepSparse
from langchain.prompts import PromptTemplate
from langchain.output_parsers import CommaSeparatedListOutputParser

llm_parser = CommaSeparatedListOutputParser()
llm = DeepSparse(model='hf:neuralmagic/mpt-7b-chat-pruned50-quant')
prompt = PromptTemplate(
    template="List how to {do}",
    input_variables=["do"],
)
output = llm.predict(text=prompt.format(do="Become a great software engineer"))
print(output)
"""
List how to Become a great software engineer
By TechRadar Staff
Here are some tips on how to become a great software engineer:
1. Develop good programming skills: To become a great software engineer, you need to have a strong understanding of programming concepts and techniques. You should be able to write clean, efficient code that meets the requirements of the project.
2. Learn new technologies: To stay up-to in the field, you should be familiar with new technologies and programming languages. You should also be able to adapt to new environments and work with different tools and platforms.
3. Build a portfolio: To showcase your skills, you should build a portfolio of your work. This will help you showcase your skills and abilities to potential employers.
4. Network: Networking is an important aspect of your career. You should attend industry events and conferences to meet other professionals in the field.
5. Stay up-to-date with industry trends: Stay up-to-date with industry trends and developments. This will help you stay relevant in your field and help you stay ahead of your competition.
6. Take courses and certifications: Taking courses and certifications can help you gain new skills and knowledge. This will help you stay ahead of your competition and help you grow in your career.
7. Practice and refine your skills: Practice and refine your skills by working on projects and solving problems. This will help you develop your skills and help you grow in your career.
"""
```
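Note that `llm_parser` is instantiated above but never applied to the completion; to turn the model's raw text into a Python list you would call `llm_parser.parse(output)`, which (roughly) splits the text on commas and strips whitespace, as in this standalone sketch:

```python
def parse_comma_separated_list(text: str) -> list:
    # Rough sketch of what CommaSeparatedListOutputParser.parse does:
    # split the raw completion on commas and strip surrounding whitespace.
    return [part.strip() for part in text.strip().split(",")]

result = parse_comma_separated_list("practice, build a portfolio, network")
# → ['practice', 'build a portfolio', 'network']
```

A comma-separated completion is only guaranteed if the prompt asks the model for one, so in practice the template text matters as much as the parser.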
## Configuration
The DeepSparse LangChain integration exposes arguments that control which model is loaded, how the underlying pipeline is constructed, how tokens are generated, and whether tokens are returned all at once or streamed one by one.
```python
model: str
"""The path to a model file or directory or the name of a SparseZoo model stub."""
model_config: Optional[Dict[str, Any]] = None
"""Keyword arguments passed to the pipeline construction.
Common parameters are sequence_length, prompt_sequence_length"""
generation_config: Union[None, str, Dict] = None
"""GenerationConfig dictionary consisting of parameters used to control
sequences generated for each prompt. Common parameters are:
max_length, max_new_tokens, num_return_sequences, output_scores,
top_p, top_k, repetition_penalty."""
streaming: bool = False
"""Whether to stream the results, token by token."""
```
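As a concrete illustration of the fields above, here is what the two configuration dictionaries might look like (the key names come from the field docstrings above; the specific values are hypothetical, and the full set of supported keys depends on the installed `deepsparse` version):

```python
# Hypothetical values for the configuration fields documented above.
model_config = {
    "sequence_length": 512,        # pipeline-construction kwarg
    "prompt_sequence_length": 16,  # pipeline-construction kwarg
}

generation_config = {
    "max_new_tokens": 128,
    "top_k": 40,
    "repetition_penalty": 1.1,
}

# These would then be passed straight to the wrapper, e.g.:
# llm = DeepSparse(
#     model="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
#     model_config=model_config,
#     generation_config=generation_config,
#     streaming=False,
# )
```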

View File

@@ -1891,26 +1891,25 @@ typing-inspect = ">=0.4.0,<1"
[[package]]
name = "datasets"
version = "2.14.6"
description = "HuggingFace community-driven open-source library of datasets"
optional = true
python-versions = ">=3.8.0"
files = [
{file = "datasets-2.14.6-py3-none-any.whl", hash = "sha256:4de857ffce21cfc847236745c69f102e33cd1f0fa8398e7be9964525fd4cd5db"},
{file = "datasets-2.14.6.tar.gz", hash = "sha256:97ebbace8ec7af11434a87d1215379927f8fee2beab2c4a674003756ecfe920c"},
]
[package.dependencies]
aiohttp = "*"
dill = ">=0.3.0,<0.3.8"
fsspec = {version = ">=2023.1.0,<=2023.10.0", extras = ["http"]}
huggingface-hub = ">=0.14.0,<1.0.0"
multiprocess = "*"
numpy = ">=1.17"
packaging = "*"
pandas = "*"
pyarrow = ">=8.0.0"
pyyaml = ">=5.1"
requests = ">=2.19.0"
tqdm = ">=4.62.1"
@@ -1920,15 +1919,15 @@ xxhash = "*"
apache-beam = ["apache-beam (>=2.26.0,<2.44.0)"]
audio = ["librosa", "soundfile (>=0.12.1)"]
benchmarks = ["tensorflow (==2.12.0)", "torch (==2.0.1)", "transformers (==4.30.1)"]
dev = ["Pillow (>=6.2.1)", "absl-py", "apache-beam (>=2.26.0,<2.44.0)", "black (>=23.1,<24.0)", "elasticsearch (<8.0.0)", "faiss-cpu (>=1.6.4)", "joblib (<1.3.0)", "joblibspark", "librosa", "lz4", "py7zr", "pyspark (>=3.4)", "pytest", "pytest-datadir", "pytest-xdist", "pyyaml (>=5.3.1)", "rarfile (>=4.0)", "ruff (>=0.0.241)", "s3fs", "s3fs (>=2021.11.1)", "soundfile (>=0.12.1)", "sqlalchemy (<2.0.0)", "tensorflow (>=2.2.0,!=2.6.0,!=2.6.1)", "tensorflow (>=2.3,!=2.6.0,!=2.6.1)", "tensorflow-macos", "tiktoken", "torch", "transformers", "zstandard"]
docs = ["s3fs", "tensorflow (>=2.2.0,!=2.6.0,!=2.6.1)", "tensorflow-macos", "torch", "transformers"]
jax = ["jax (>=0.2.8,!=0.3.2,<=0.3.25)", "jaxlib (>=0.1.65,<=0.3.25)"]
metrics-tests = ["Werkzeug (>=1.0.1)", "accelerate", "bert-score (>=0.3.6)", "jiwer", "langdetect", "mauve-text", "nltk", "requests-file (>=1.5.1)", "rouge-score", "sacrebleu", "sacremoses", "scikit-learn", "scipy", "sentencepiece", "seqeval", "six (>=1.15.0,<1.16.0)", "spacy (>=3.0.0)", "texttable (>=1.6.3)", "tldextract", "tldextract (>=3.1.0)", "toml (>=0.10.1)", "typer (<0.5.0)"]
quality = ["black (>=23.1,<24.0)", "pyyaml (>=5.3.1)", "ruff (>=0.0.241)"]
s3 = ["s3fs"]
tensorflow = ["tensorflow (>=2.2.0,!=2.6.0,!=2.6.1)", "tensorflow-macos"]
tensorflow-gpu = ["tensorflow-gpu (>=2.2.0,!=2.6.0,!=2.6.1)"]
tests = ["Pillow (>=6.2.1)", "absl-py", "apache-beam (>=2.26.0,<2.44.0)", "elasticsearch (<8.0.0)", "faiss-cpu (>=1.6.4)", "joblib (<1.3.0)", "joblibspark", "librosa", "lz4", "py7zr", "pyspark (>=3.4)", "pytest", "pytest-datadir", "pytest-xdist", "rarfile (>=4.0)", "s3fs (>=2021.11.1)", "soundfile (>=0.12.1)", "sqlalchemy (<2.0.0)", "tensorflow (>=2.3,!=2.6.0,!=2.6.1)", "tensorflow-macos", "tiktoken", "torch", "transformers", "zstandard"]
torch = ["torch"]
vision = ["Pillow (>=6.2.1)"]
@@ -2008,6 +2007,55 @@ point-cloud = ["laspy"]
video = ["av (>=8.1.0)"]
visualizer = ["IPython", "flask"]
[[package]]
name = "deepsparse-nightly"
version = "1.6.0.20231201"
description = "An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application"
optional = true
python-versions = ">=3.8, <3.12"
files = [
{file = "deepsparse-nightly-1.6.0.20231201.tar.gz", hash = "sha256:b551f5ce1f8e7cae0635288baa25c84d57386a41d358c01162d052934a1f0c1b"},
{file = "deepsparse_nightly-1.6.0.20231201-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d87fd27a2e6686c4c84b6377651fc17cbe8975fe7ab40d0abdec9b7a73fb4e00"},
{file = "deepsparse_nightly-1.6.0.20231201-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5662f2a9e120a1f0dd1af8ba2f10c097bb6b1bf90e35136eb8b6b52e3fc7d024"},
{file = "deepsparse_nightly-1.6.0.20231201-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:05077cc071b6d630df627b1ef7516f80d56cbcd8214afe0c8d251cb2f2483437"},
{file = "deepsparse_nightly-1.6.0.20231201-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:05cd68df68fefcbb027b095ca929afd0f100dc658ae11e003f80e4a83928e921"},
{file = "deepsparse_nightly-1.6.0.20231201-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6f4f2c8ae9ff8da100179978ff4a014708fdab054543ea071d1bf64fcc2b5c38"},
{file = "deepsparse_nightly-1.6.0.20231201-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b0a7532743e313b61a00c4ebc5b4c310fcc5ef7c6e8f3a9de687ef8e2398ae60"},
{file = "deepsparse_nightly-1.6.0.20231201-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:98e20a05b9e551122844d4a58540df00674b527e5d34f4cd6277c5c63038b58f"},
{file = "deepsparse_nightly-1.6.0.20231201-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4e68308249fecee70c7328f820a9e2173b053ce32553191f915e8892c4125bba"},
]
[package.dependencies]
click = ">=7.1.2,<8.0.0 || >8.0.0"
datasets = {version = "<=2.14.6", optional = true, markers = "extra == \"llm\""}
numpy = ">=1.16.3"
onnx = ">=1.5.0,<1.15.0"
protobuf = ">=3.12.2"
pydantic = ">=1.8.2,<2.0.0"
requests = ">=2.0.0"
scikit-learn = {version = "*", optional = true, markers = "extra == \"llm\""}
seqeval = {version = "*", optional = true, markers = "extra == \"llm\""}
sparsezoo-nightly = ">=1.6.0,<1.7.0"
tqdm = ">=4.0.0"
transformers = {version = "<4.35", optional = true, markers = "extra == \"llm\""}
[package.extras]
clip = ["open-clip-torch (==2.20.0)", "scipy (>=1.8,<1.10)", "transformers (<4.35)"]
dev = ["Pillow (>=8.3.2)", "beautifulsoup4 (>=4.9.3)", "black (==22.12.0)", "flake8 (>=3.8.3)", "flaky (>=3.7.0,<3.8.0)", "flask (>=1.0.0)", "flask-cors (>=3.0.0)", "isort (>=5.7.0)", "ndjson (>=0.3.1)", "onnxruntime (>=1.7.0)", "pytest (>=6.0.0)", "wheel (>=0.36.2)"]
docs = ["m2r2 (>=0.2.7,<0.3.0)", "mistune (==0.8.4)", "myst-parser (>=0.14.0,<0.15.0)", "rinohtype (>=0.4.2)", "sphinx (>=3.4.0)", "sphinx-copybutton (>=0.3.0)", "sphinx-markdown-tables (>=0.0.15)", "sphinx-multiversion (==0.2.4)", "sphinx-rtd-theme"]
haystack = ["SPARQLWrapper", "aiorwlock (>=1.3.0,<2)", "azure-ai-formrecognizer (>=3.2.0b2)", "azure-core (<1.23)", "beautifulsoup4", "beir", "black[jupyter]", "coverage", "dill", "elastic-apm", "elasticsearch (>=7.7,<=7.10)", "faiss-cpu (==1.7.2)", "grpcio (==1.43.0)", "importlib-metadata", "jsonschema", "jupytercontrib", "langdetect", "markdown", "mkdocs", "mlflow", "mmh3", "more-itertools", "mypy", "networkx", "nltk", "onnxruntime", "onnxruntime-tools", "pandas", "pdf2image (==1.14.0)", "pillow", "pinecone-client", "posthog", "psutil", "psycopg2-binary", "pydantic", "pylint", "pymilvus (<2.0.0)", "pytesseract (==0.3.7)", "pytest", "python-docx", "python-magic", "python-multipart", "quantulum3", "rapidfuzz", "ray", "requests", "requests-cache", "responses", "scikit-learn (>=1.0.0)", "scipy (>=1.3.2)", "selenium", "sentence-transformers (>=2.2.0)", "seqeval", "sqlalchemy (>=1.4.2,<2)", "sqlalchemy-utils", "tika", "torch (>=1.12.1)", "tox", "tqdm", "typing-extensions", "watchdog", "weaviate-client (==3.3.3)", "webdriver-manager"]
image-classification = ["opencv-python (<=4.6.0.66)", "torchvision (>=0.3.0,<0.17)"]
llm = ["datasets (<=2.14.6)", "scikit-learn", "seqeval", "transformers (<4.35)"]
onnxruntime = ["onnxruntime (>=1.7.0)"]
openpifpaf = ["opencv-python (<=4.6.0.66)", "openpifpaf (==0.13.11)", "pycocotools (>=2.0.6)", "scipy (==1.10.1)"]
sentence-transformers = ["optimum-deepsparse", "torch (>=1.7.0,<2.2)"]
server = ["anyio (<4.0.0)", "fastapi (>=0.70.0,<0.87.0)", "prometheus-client (>=0.14.1)", "psutil (>=5.9.4)", "python-multipart (>=0.0.5)", "requests (>=2.26.0)", "uvicorn (>=0.15.0)"]
torch = ["torch (>=1.7.0,<2.2)"]
transformers = ["datasets (<=2.14.6)", "scikit-learn", "seqeval", "transformers (<4.35)"]
yolo = ["opencv-python (<=4.6.0.66)", "torchvision (>=0.3.0,<0.17)"]
yolov5 = ["opencv-python (<=4.6.0.66)", "torchvision (>=0.3.0,<0.17)"]
yolov8 = ["opencv-python (<=4.6.0.66)", "torchvision (>=0.3.0,<0.17)", "ultralytics (==8.0.124)"]
[[package]]
name = "defusedxml"
version = "0.7.1"
@@ -2708,6 +2756,24 @@ files = [
{file = "gast-0.4.0.tar.gz", hash = "sha256:40feb7b8b8434785585ab224d1568b857edb18297e5a3047f1ba012bc83b42c1"},
]
[[package]]
name = "geocoder"
version = "1.38.1"
description = "Geocoder is a simple and consistent geocoding library."
optional = true
python-versions = "*"
files = [
{file = "geocoder-1.38.1-py2.py3-none-any.whl", hash = "sha256:a733e1dfbce3f4e1a526cac03aadcedb8ed1239cf55bd7f3a23c60075121a834"},
{file = "geocoder-1.38.1.tar.gz", hash = "sha256:c9925374c961577d0aee403b09e6f8ea1971d913f011f00ca70c76beaf7a77e7"},
]
[package.dependencies]
click = "*"
future = "*"
ratelim = "*"
requests = "*"
six = "*"
[[package]]
name = "geojson"
version = "2.5.0"
@@ -5919,6 +5985,44 @@ rsa = ["cryptography (>=3.0.0)"]
signals = ["blinker (>=1.4.0)"]
signedtoken = ["cryptography (>=3.0.0)", "pyjwt (>=2.0.0,<3)"]
[[package]]
name = "onnx"
version = "1.12.0"
description = "Open Neural Network Exchange"
optional = true
python-versions = "*"
files = [
{file = "onnx-1.12.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:bdbd2578424c70836f4d0f9dda16c21868ddb07cc8192f9e8a176908b43d694b"},
{file = "onnx-1.12.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:213e73610173f6b2e99f99a4b0636f80b379c417312079d603806e48ada4ca8b"},
{file = "onnx-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9fd2f4e23078df197bb76a59b9cd8f5a43a6ad2edc035edb3ecfb9042093e05a"},
{file = "onnx-1.12.0-cp310-cp310-win32.whl", hash = "sha256:23781594bb8b7ee985de1005b3c601648d5b0568a81e01365c48f91d1f5648e4"},
{file = "onnx-1.12.0-cp310-cp310-win_amd64.whl", hash = "sha256:81a3555fd67be2518bf86096299b48fb9154652596219890abfe90bd43a9ec13"},
{file = "onnx-1.12.0-cp37-cp37m-macosx_10_12_x86_64.whl", hash = "sha256:5578b93dc6c918cec4dee7fb7d9dd3b09d338301ee64ca8b4f28bc217ed42dca"},
{file = "onnx-1.12.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c11162ffc487167da140f1112f49c4f82d815824f06e58bc3095407699f05863"},
{file = "onnx-1.12.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:341c7016e23273e9ffa9b6e301eee95b8c37d0f04df7cedbdb169d2c39524c96"},
{file = "onnx-1.12.0-cp37-cp37m-win32.whl", hash = "sha256:3c6e6bcffc3f5c1e148df3837dc667fa4c51999788c1b76b0b8fbba607e02da8"},
{file = "onnx-1.12.0-cp37-cp37m-win_amd64.whl", hash = "sha256:8a7aa61aea339bd28f310f4af4f52ce6c4b876386228760b16308efd58f95059"},
{file = "onnx-1.12.0-cp38-cp38-macosx_10_12_x86_64.whl", hash = "sha256:56ceb7e094c43882b723cfaa107d85ad673cfdf91faeb28d7dcadacca4f43a07"},
{file = "onnx-1.12.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b3629e8258db15d4e2c9b7f1be91a3186719dd94661c218c6f5fde3cc7de3d4d"},
{file = "onnx-1.12.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2d9a7db54e75529160337232282a4816cc50667dc7dc34be178fd6f6b79d4705"},
{file = "onnx-1.12.0-cp38-cp38-win32.whl", hash = "sha256:fea5156a03398fe0e23248042d8651c1eaac5f6637d4dd683b4c1f1320b9f7b4"},
{file = "onnx-1.12.0-cp38-cp38-win_amd64.whl", hash = "sha256:f66d2996e65f490a57b3ae952e4e9189b53cc9fe3f75e601d50d4db2dc1b1cd9"},
{file = "onnx-1.12.0-cp39-cp39-macosx_10_12_x86_64.whl", hash = "sha256:c39a7a0352c856f1df30dccf527eb6cb4909052e5eaf6fa2772a637324c526aa"},
{file = "onnx-1.12.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fab13feb4d94342aae6d357d480f2e47d41b9f4e584367542b21ca6defda9e0a"},
{file = "onnx-1.12.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c7a9b3ea02c30efc1d2662337e280266aca491a8e86be0d8a657f874b7cccd1e"},
{file = "onnx-1.12.0-cp39-cp39-win32.whl", hash = "sha256:f8800f28c746ab06e51ef8449fd1215621f4ddba91be3ffc264658937d38a2af"},
{file = "onnx-1.12.0-cp39-cp39-win_amd64.whl", hash = "sha256:af90427ca04c6b7b8107c2021e1273227a3ef1a7a01f3073039cae7855a59833"},
{file = "onnx-1.12.0.tar.gz", hash = "sha256:13b3e77d27523b9dbf4f30dfc9c959455859d5e34e921c44f712d69b8369eff9"},
]
[package.dependencies]
numpy = ">=1.16.6"
protobuf = ">=3.12.2,<=3.20.1"
typing-extensions = ">=3.6.2.1"
[package.extras]
lint = ["clang-format (==13.0.0)", "flake8", "mypy (==0.782)", "types-protobuf (==3.18.4)"]
[[package]]
name = "onnxruntime"
version = "1.16.1"
@@ -6879,6 +6983,20 @@ files = [
{file = "py-1.11.0.tar.gz", hash = "sha256:51c75c4126074b472f746a24399ad32f6053d1b34b68d2fa41e558e6f4a98719"},
]
[[package]]
name = "py-machineid"
version = "0.4.6"
description = "Get the unique machine ID of any host (without admin privileges)"
optional = true
python-versions = "*"
files = [
{file = "py-machineid-0.4.6.tar.gz", hash = "sha256:d3d9cd85aae31d2f172f27833e5fd17dffd2cf7c4918390ec06300702d02cd8e"},
{file = "py_machineid-0.4.6-py3-none-any.whl", hash = "sha256:5f92d8be8a68632b29d1297853f92b50347504e5b41022abe9de9ee5f75fae9e"},
]
[package.dependencies]
winregistry = {version = "*", markers = "sys_platform == \"win32\""}
[[package]]
name = "py-trello"
version = "0.19.0"
@@ -6964,17 +7082,6 @@ files = [
[package.dependencies]
numpy = ">=1.16.6"
[[package]]
name = "pyasn1"
version = "0.5.0"
@@ -8338,6 +8445,20 @@ PyYAML = "*"
Shapely = ">=1.7.1"
six = ">=1.15.0"
[[package]]
name = "ratelim"
version = "0.1.6"
description = "Makes it easy to respect rate limits."
optional = true
python-versions = "*"
files = [
{file = "ratelim-0.1.6-py2.py3-none-any.whl", hash = "sha256:e1a7dd39e6b552b7cc7f52169cd66cdb826a1a30198e355d7016012987c9ad08"},
{file = "ratelim-0.1.6.tar.gz", hash = "sha256:826d32177e11f9a12831901c9fda6679fd5bbea3605910820167088f5acbb11d"},
]
[package.dependencies]
decorator = "*"
[[package]]
name = "ratelimiter"
version = "1.2.0.post0"
@@ -9157,6 +9278,20 @@ files = [
{file = "sentencepiece-0.1.99.tar.gz", hash = "sha256:189c48f5cb2949288f97ccdb97f0473098d9c3dcf5a3d99d4eabe719ec27297f"},
]
[[package]]
name = "seqeval"
version = "1.2.2"
description = "Testing framework for sequence labeling"
optional = true
python-versions = "*"
files = [
{file = "seqeval-1.2.2.tar.gz", hash = "sha256:f28e97c3ab96d6fcd32b648f6438ff2e09cfba87f05939da9b3970713ec56e6f"},
]
[package.dependencies]
numpy = ">=1.14.0"
scikit-learn = ">=0.21.3"
[[package]]
name = "setuptools"
version = "67.8.0"
@@ -9396,6 +9531,34 @@ numpy = "*"
docs = ["linkify-it-py", "myst-parser", "sphinx", "sphinx-book-theme"]
test = ["pytest"]
[[package]]
name = "sparsezoo-nightly"
version = "1.6.0.20231201"
description = "Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes"
optional = true
python-versions = ">=3.8.0"
files = [
{file = "sparsezoo_nightly-1.6.0.20231201-py3-none-any.whl", hash = "sha256:392f1ae7d4d9900756c31161de145796f093c8b226ab0232b73c939c644a6324"},
]
[package.dependencies]
click = ">=7.1.2,<8.0.0 || >8.0.0"
geocoder = ">=1.38.0"
numpy = ">=1.0.0"
onnx = ">=1.5.0,<1.15.0"
pandas = ">1.3"
protobuf = ">=3.12.2"
py-machineid = ">=0.3.0"
pydantic = ">=1.8.2,<2.0.0"
pyyaml = ">=5.1.0"
requests = ">=2.0.0"
tqdm = ">=4.0.0"
[package.extras]
dev = ["beautifulsoup4 (==4.9.3)", "black (==22.12.0)", "flake8 (>=3.8.3)", "flaky (>=3.7.0)", "isort (>=5.7.0)", "matplotlib (>=3.0.0)", "onnxruntime (>=1.0.0)", "pytest (>=6.0.0)", "wheel (>=0.36.2)"]
docs = ["m2r2 (>=0.2.7,<0.3.0)", "mistune (==0.8.4)", "myst-parser (>=0.14.0,<0.15.0)", "rinohtype (>=0.4.2)", "sphinx (>=3.4.0)", "sphinx-copybutton (>=0.3.0)", "sphinx-markdown-tables (>=0.0.15)", "sphinx-multiversion (==0.2.4)", "sphinx-rtd-theme"]
nb = ["ipywidgets (>=7.0.0)", "jupyter (>=1.0.0)"]
[[package]]
name = "sqlalchemy"
version = "2.0.22"
@@ -11055,6 +11218,17 @@ files = [
[package.extras]
dev = ["black (>=19.3b0)", "pytest (>=4.6.2)"]
[[package]]
name = "winregistry"
version = "1.1.1"
description = "Library aimed at working with Windows registry"
optional = true
python-versions = ">=3.7,<4.0"
files = [
{file = "winregistry-1.1.1-py3-none-any.whl", hash = "sha256:ad4be5a488838266b4bf826712d640db3daadd1f97ba46820f834a98868b3bc1"},
{file = "winregistry-1.1.1.tar.gz", hash = "sha256:942fecad3751c1b78b9e6b0a520266903c3023f104668ce1bdbf381ec993ad8b"},
]
[[package]]
name = "wolframalpha"
version = "5.0.0"
@@ -11509,7 +11683,7 @@ cli = ["typer"]
cohere = ["cohere"]
docarray = ["docarray"]
embeddings = ["sentence-transformers"]
extended-testing = ["aiosqlite", "aleph-alpha-client", "anthropic", "arxiv", "assemblyai", "atlassian-python-api", "beautifulsoup4", "bibtexparser", "cassio", "chardet", "cohere", "couchbase", "dashvector", "databricks-vectorsearch", "datasets", "deepsparse-nightly", "dgml-utils", "esprima", "faiss-cpu", "feedparser", "fireworks-ai", "geopandas", "gitpython", "google-cloud-documentai", "gql", "hologres-vector", "html2text", "javelin-sdk", "jinja2", "jq", "jsonschema", "lxml", "markdownify", "motor", "msal", "mwparserfromhell", "mwxml", "newspaper3k", "numexpr", "openai", "openai", "openapi-pydantic", "pandas", "pdfminer-six", "pgvector", "praw", "psychicapi", "py-trello", "pymupdf", "pypdf", "pypdfium2", "pyspark", "rank-bm25", "rapidfuzz", "rapidocr-onnxruntime", "requests-toolbelt", "rspace_client", "scikit-learn", "sqlite-vss", "streamlit", "sympy", "telethon", "timescale-vector", "tqdm", "upstash-redis", "xata", "xmltodict"]
javascript = ["esprima"]
llms = ["clarifai", "cohere", "huggingface_hub", "manifest-ml", "nlpcloud", "openai", "openlm", "torch", "transformers"]
openai = ["openai", "tiktoken"]
@@ -11519,4 +11693,4 @@ text-helpers = ["chardet"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "22f6e3c372ae813f1d833ea1b87cc53acd74cce5f3ee749e8f2d7a8c8bb1fc3e"

View File

@@ -143,13 +143,14 @@ azure-ai-textanalytics = {version = "^5.3.0", optional = true}
google-cloud-documentai = {version = "^2.20.1", optional = true}
fireworks-ai = {version = "^0.6.0", optional = true, python = ">=3.9,<4.0"}
javelin-sdk = {version = "^0.1.8", optional = true}
deepsparse-nightly = {version = "^1.6.0.20231120", extras=["llm"], optional = true, python = ">=3.8.1,<3.12"}
hologres-vector = {version = "^0.0.6", optional = true}
praw = {version = "^7.7.1", optional = true}
msal = {version = "^1.25.0", optional = true}
databricks-vectorsearch = {version = "^0.21", optional = true}
couchbase = {version = "^4.1.9", optional = true}
dgml-utils = {version = "^0.3.0", optional = true}
datasets = {version = "^2.14.0", optional = true}
[tool.poetry.group.test.dependencies]
# The only dependencies that should be added are
@@ -389,6 +390,7 @@ extended_testing = [
"rspace_client",
"fireworks-ai",
"javelin-sdk",
"deepsparse-nightly",
"hologres-vector",
"praw",
"databricks-vectorsearch",

View File

@@ -1,17 +0,0 @@
"""Test DeepSparse wrapper."""
from langchain.llms import DeepSparse


def test_deepsparse_call() -> None:
    """Test valid call to DeepSparse."""
    config = {"max_generated_tokens": 5, "use_deepsparse_cache": False}
    llm = DeepSparse(
        model="zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none",
        config=config,
    )
    output = llm("def ")
    assert isinstance(output, str)
    assert len(output) > 1
    assert output == "ids_to_names"

View File

@@ -0,0 +1,85 @@
import pytest
from langchain.llms import DeepSparse

generation_config = {"max_new_tokens": 5}


@pytest.mark.requires("deepsparse")
def test_deepsparse_call() -> None:
    """Test valid call to DeepSparse."""
    llm = DeepSparse(
        model="zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none",
        generation_config=generation_config,
    )
    output = llm("def ")
    assert isinstance(output, str)
    assert len(output) > 1


@pytest.mark.requires("deepsparse")
def test_deepsparse_streaming() -> None:
    """Test valid call to DeepSparse with streaming."""
    llm = DeepSparse(
        model="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
        generation_config=generation_config,
        streaming=True,
    )
    output = " "
    for chunk in llm.stream("Tell me a joke", stop=["'", "\n"]):
        output += chunk
    assert isinstance(output, str)
    assert len(output) > 1


@pytest.mark.requires("deepsparse")
@pytest.mark.scheduled
@pytest.mark.asyncio
async def test_deepsparse_astream() -> None:
    llm = DeepSparse(
        model="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
        generation_config=generation_config,
    )
    async for token in llm.astream("I'm Pickle Rick"):
        assert isinstance(token, str)


@pytest.mark.scheduled
@pytest.mark.asyncio
@pytest.mark.requires("deepsparse")
async def test_deepsparse_abatch() -> None:
    llm = DeepSparse(
        model="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
        generation_config=generation_config,
    )
    result = await llm.abatch(["I'm Pickle Rick", "I'm not Pickle Rick"])
    for token in result:
        assert isinstance(token, str)


@pytest.mark.asyncio
@pytest.mark.requires("deepsparse")
async def test_deepsparse_abatch_tags() -> None:
    llm = DeepSparse(
        model="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
        generation_config=generation_config,
    )
    result = await llm.abatch(
        ["I'm Pickle Rick", "I'm not Pickle Rick"], config={"tags": ["foo"]}
    )
    for token in result:
        assert isinstance(token, str)


@pytest.mark.scheduled
@pytest.mark.asyncio
@pytest.mark.requires("deepsparse")
async def test_deepsparse_ainvoke() -> None:
    llm = DeepSparse(
        model="hf:neuralmagic/mpt-7b-chat-pruned50-quant",
        generation_config=generation_config,
    )
    result = await llm.ainvoke("I'm Pickle Rick", config={"tags": ["foo"]})
    assert isinstance(result, str)