Mirror of https://github.com/imartinez/privateGPT.git
fix: add numpy issue to troubleshooting (#2048)
* docs: add numpy issue to troubleshooting
* fix: troubleshooting link ...
parent b16abbefe4
commit 4ca6d0cb55
@@ -307,11 +307,12 @@ If you have all required dependencies properly configured running the
 following powershell command should succeed.
 
 ```powershell
-$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```
 
 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If there is some issue, please refer to the
+[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
 
 ```console
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
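A quick way to confirm what the forced reinstall actually produced is to query the Poetry environment directly. This is a minimal editor's sketch, not part of the commit; `llama_supports_gpu_offload()` is assumed to be exposed by recent llama-cpp-python builds.

```powershell
# Confirm the pinned numpy version; the 1.26.0 pin works around numpy 2.x incompatibilities
poetry run python -c "import numpy; print(numpy.__version__)"

# Assumption: recent llama-cpp-python releases expose llama_supports_gpu_offload();
# True suggests the wheel was compiled with GPU (cuBLAS) support
poetry run python -c "import llama_cpp; print(llama_cpp.llama_supports_gpu_offload())"
```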
@@ -339,11 +340,12 @@ Some tips:
 After that running the following command in the repository will install llama.cpp with GPU support:
 
 ```bash
-CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```
 
 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If there is some issue, please refer to the
+[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
 
 ```
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
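On Linux, checking the CUDA toolchain before forcing the rebuild can save a long failed compile. A minimal sketch, assuming the CUDA toolkit and NVIDIA driver are already installed:

```bash
# Verify the CUDA compiler that the cuBLAS build will pick up
nvcc --version

# Verify the driver and GPU are visible to the system
nvidia-smi
```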
@@ -46,4 +46,19 @@ huggingface:
 embedding:
   embed_dim: 384
 ```
 </Callout>
+
+# Building Llama-cpp with NVIDIA GPU support
+
+## Out-of-memory error
+
+If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue:
+1. **Set the following environment variable:**
+   ```bash
+   TOKENIZERS_PARALLELISM=true
+   ```
+2. **Run PrivateGPT:**
+   ```bash
+   poetry run python -m private_gpt
+   ```
+Thanks to [MarioRossiGithub](https://github.com/MarioRossiGithub) for providing this solution.
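The two troubleshooting steps above can also be collapsed into a single invocation. A minimal sketch, assuming it is run from the repository root:

```bash
# Out-of-memory workaround as one command: set the tokenizers flag for this run only,
# then start PrivateGPT
TOKENIZERS_PARALLELISM=true poetry run python -m private_gpt
```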