support the llama.cpp CUDA backend (#2310)

mirror of https://github.com/nomic-ai/gpt4all.git synced 2025-11-03 15:37:43 +00:00

* rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f
* support for CUDA backend (enabled by default)
* partial support for Occam's Vulkan backend (disabled by default)
* partial support for HIP/ROCm backend (disabled by default)
* sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt
* changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA)
* ship CUDA runtime with installed version
* make device selection in the UI on macOS actually do something
* model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

Jared Van Bortel, 2024-05-15 15:27:50 -04:00, committed by GitHub

parent a618ca5699

commit d2a99d9bc6

22 changed files with 1360 additions and 773 deletions

gpt4all-backend/llama.cpp.cmake (1338 lines changed)

File diff suppressed because it is too large
