Converting From A Trained Huggingface Model to GGML Quantized Model

Currently, converting a trained Huggingface model to a quantized GGML model is a tedious process that involves a few distinct steps: the model is first exported to a GGML file in fp32 or fp16, which is then compressed with llama.cpp's quantize tool. Here we outline the current process.

convert_llama_hf_to_ggml.py is from llama.cpp and doesn't rely on Huggingface or PyTorch.

The other scripts rely on Huggingface and PyTorch and are adapted from ggml.

For the following example, we will use a LLaMA-style model.

  1. Install the dependencies

    pip install -r requirements.txt
    
  2. Convert the model to ggml format

    python converter/convert_llama_hf_to_ggml.py <model_name> <output_dir> --outtype=<output_type>
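
For instance, a conversion run might look like the following. The model name is a hypothetical placeholder, and f16 is assumed to be among the supported output types; check the script's help output for the exact values your copy accepts.

    python converter/convert_llama_hf_to_ggml.py huggyllama/llama-7b ./models --outtype=f16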

  3. Navigate to the llama.cpp directory (clone https://github.com/ggerganov/llama.cpp first if you do not already have a checkout)

  4. Build llama.cpp

    mkdir build
    cd build
    cmake ..
    cmake --build . --config Release
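
Note that depending on the llama.cpp version, the resulting quantize binary may land directly in build/ or under build/bin/; adjust the path in the next step to wherever your build placed it.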
    
  5. Run the quantize binary

    ./quantize <ggml_fp32.bin> <output_model.bin> <quantization_level>
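
As a concrete (hypothetical) run, quantizing an fp32 GGML file down to 4 bits from inside the build directory might look like the following. The filenames are placeholders, and the exact set of quantization levels (e.g. q4_0, q4_1, q5_0, q5_1, q8_0) varies by llama.cpp version, so run the binary without arguments to see what your build supports.

    ./quantize ../models/ggml-model-f32.bin ../models/ggml-model-q4_0.bin q4_0

The quantized file is what you ultimately load at inference time; q4_0 shrinks the model to a small fraction of its fp32 size at some cost in accuracy.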