# Converting From A Trained Huggingface Model to GGML Quantized Model
Currently, converting from a Huggingface model to a GGML quantized model is a tedious process that involves several distinct steps. Here we outline the current process.
`convert_llama_hf_to_ggml.py` is from llama.cpp and doesn't rely on Huggingface or PyTorch. The other scripts rely on Huggingface and PyTorch and are adapted from ggml. For the following example, we will use a LLaMA-style model.
- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Convert the model to `ggml` format (a worked example follows this list):

  ```bash
  python converter/convert_llama_hf_to_ggml.py <model_name> <output_dir> --outtype=<output_type>
  ```
- Navigate to the `llama.cpp` directory
- Build `llama.cpp`:

  ```bash
  mkdir build
  cd build
  cmake ..
  cmake --build . --config Release
  ```
- Run the `quantize` binary (a worked invocation follows this list):

  ```bash
  ./quantize <ggmlfp32.bin> <output_model.bin> <quantization_level>
  ```
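For concreteness, a conversion run might look like the sketch below. The model id and output directory are hypothetical, and the `f32` output type is an assumption; check the script's help output for the arguments it actually accepts.

```bash
# Hypothetical model id and output directory; the available --outtype
# choices depend on the converter script (f32 is assumed here).
python converter/convert_llama_hf_to_ggml.py openlm-research/open_llama_3b ./models --outtype=f32
```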
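A quantization run could then look like this. The file names are hypothetical, and how the quantization level is spelled depends on your llama.cpp revision: older builds take a number (e.g. `2` for 4-bit q4_0), newer builds take a name (e.g. `q4_0`).

```bash
# Hypothetical paths; the level may be numeric or named
# depending on the llama.cpp revision you built.
./quantize ./models/ggml-model-f32.bin ./models/ggml-model-q4_0.bin q4_0
```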
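As a quick sanity check, you can try the quantized file with llama.cpp's `main` binary (assuming it built alongside `quantize`; the flags shown are the common ones and may differ by revision):

```bash
# Hypothetical model path; -p sets the prompt, -n the number of tokens to generate.
./main -m ./models/ggml-model-q4_0.bin -p "Hello, my name is" -n 32
```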