# Converting From a Trained Huggingface Model to a GGML Quantized Model
Currently, converting a Huggingface model to a GGML quantized model is a tedious process that involves a few different steps. This document outlines the current process.

`convert_llama_hf_to_ggml.py` is from [llama.cpp](https://github.com/ggerganov/llama.cpp/blob/master/convert.py) and doesn't rely on Huggingface or PyTorch.

The other scripts rely on Huggingface and PyTorch and are adapted from [ggml](https://github.com/ggerganov/ggml).

For the following example, we will use a LLaMA-style model.

1. Install the dependencies

```bash
pip install -r requirements.txt
```
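If you'd rather keep these packages out of your global Python environment, a virtual environment works just as well (optional; `.venv` is an arbitrary name of our choosing):

```bash
# Optional: install the dependencies into an isolated virtual environment.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```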
2. Convert the model to `ggml` format

```bash
python converter/convert_llama_hf_to_ggml.py <model_name> <output_dir> --outtype=<output_type>
```
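As a concrete illustration, a run might look like the following. The model id and output directory are placeholders of our own, and we assume `--outtype` accepts `f32` or `f16` as in the upstream llama.cpp converter; `f16` roughly halves the file size ahead of quantization:

```bash
# Illustrative invocation: "my-org/llama-7b-hf" and ./models are example names only.
python converter/convert_llama_hf_to_ggml.py my-org/llama-7b-hf ./models --outtype=f16
```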
3. Navigate to the `llama.cpp` directory

4. Build `llama.cpp`

```bash
mkdir build
cd build
cmake ..
cmake --build . --config Release
```
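Where the binaries land depends on your CMake generator; in many setups `quantize` ends up under `build/bin` (or a per-config folder such as `build/bin/Release`). A quick way to confirm before the next step:

```bash
# Confirm where the build placed the quantize binary (path varies by generator).
find build -name 'quantize*' -type f
```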
5. Run the `quantize` binary

```bash
./quantize <ggmlfp32.bin> <output_model.bin> <quantization_level>
```
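For example, quantizing the fp16 file from step 2 down to 4 bits might look like this; the file names are our own placeholders, and `q4_0` is one of the quantization types the llama.cpp `quantize` tool accepts (older builds expect a numeric id instead):

```bash
# Illustrative: produce a 4-bit q4_0 model from the converted fp16 GGML file.
./quantize ./models/ggml-model-f16.bin ./models/ggml-model-q4_0.bin q4_0
```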