fix: add numpy issue to troubleshooting (#2048)

* docs: add numpy issue to troubleshooting

* fix: troubleshooting link

...
Javier Martinez 2024-08-07 12:16:03 +02:00 committed by GitHub
parent b16abbefe4
commit 4ca6d0cb55
2 changed files with 22 additions and 5 deletions


@@ -307,11 +307,12 @@ If you have all required dependencies properly configured running the
 following powershell command should succeed.
 ```powershell
-$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```
 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If you run into any issues, please refer to the
+[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
 ```console
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
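A quick way to verify that the reinstall picked up both packages is to import them from the project environment. This one-liner is a hedged sketch rather than part of the documented steps; it should behave the same from PowerShell or bash:

```bash
# Sanity check (an assumption, not part of the docs above): both imports should
# succeed, and numpy should report the pinned 1.26.0.
poetry run python -c "import numpy, llama_cpp; print(numpy.__version__, llama_cpp.__version__)"
```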
@@ -339,11 +340,12 @@ Some tips:
 After that running the following command in the repository will install llama.cpp with GPU support:
 ```bash
-CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```
 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If you run into any issues, please refer to the
+[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
 ```
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
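Before rebuilding with `-DLLAMA_CUBLAS=on`, it can also help to confirm that the CUDA toolkit and driver are visible at all. This check is an assumption about a typical setup, not a step required by the docs:

```bash
# Hedged pre-build check: both commands should succeed before rebuilding
# llama-cpp-python with CUBLAS support.
nvcc --version   # compiler from the CUDA toolkit
nvidia-smi       # driver and GPU visibility
```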


@@ -46,4 +46,19 @@ huggingface:
 embedding:
   embed_dim: 384
 ```
 </Callout>
+
+# Building Llama-cpp with NVIDIA GPU support
+
+## Out-of-memory error
+
+If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue:
+1. **Set the following environment variable:**
+```bash
+export TOKENIZERS_PARALLELISM=true
+```
+2. **Run PrivateGPT:**
+```bash
+poetry run python -m private_gpt
+```
+Thanks to [MarioRossiGithub](https://github.com/MarioRossiGithub) for providing this solution.
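For a single run, the two steps above can be collapsed into one line so the variable applies only to that invocation. This is a sketch of the same steps under a POSIX shell, not an additional requirement:

```bash
# One-liner equivalent of steps 1 and 2 above (assumes bash or zsh).
TOKENIZERS_PARALLELISM=true poetry run python -m private_gpt
```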