expose n_gpu_layers parameter of llama.cpp (#1890)

Also dynamically limit the GPU layers and context length fields to the maximums supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Author: Jared Van Bortel
Date:   2024-01-31 14:17:44 -05:00 (committed by GitHub)
Commit: 061d1969f8 (parent f549d5a70a)

31 changed files with 381 additions and 157 deletions
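
As a sketch of the "dynamically limit" behavior described above: the settings fields can be clamped to the maxima the loaded model reports. The C++ below is illustrative only; ModelInfo, clampGpuLayers, and clampContextLength are hypothetical names, not the actual GPT4All API.

#include <algorithm>
#include <cstdint>

// Hypothetical model metadata; in practice these values would come from
// the loaded model (e.g. its layer count and trained context size).
struct ModelInfo {
    int32_t maxGpuLayers;      // total layers the model has
    int32_t maxContextLength;  // context length the model was trained with
};

// Clamp the user-facing GPU layers field to what the model supports.
static int32_t clampGpuLayers(int32_t requested, const ModelInfo &info) {
    // A common convention in llama.cpp bindings is that a negative value
    // means "offload everything", so treat it as the maximum.
    if (requested < 0)
        return info.maxGpuLayers;
    return std::min(requested, info.maxGpuLayers);
}

// Clamp the user-facing context length field to the model's maximum.
static int32_t clampContextLength(int32_t requested, const ModelInfo &info) {
    return std::clamp(requested, 1, info.maxContextLength);
}

Because negative values are conventionally a sentinel rather than an error, the clamp has to special-case them instead of rejecting them outright.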

@@ -63,5 +63,9 @@ int main(int argc, char *argv[])
     }
 #endif
 
+    // Make sure ChatLLM threads are joined before global destructors run.
+    // Otherwise, we can get a heap-use-after-free inside of llama.cpp.
+    ChatListModel::globalInstance()->clearChats();
+
     return app.exec();
 }
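
The two-line comment in the hunk compresses a real lifetime hazard. Here is a minimal, self-contained sketch of it (assumed for illustration, not GPT4All code): a worker thread still running when main() returns can touch a global object after its destructor has already fired.

#include <chrono>
#include <thread>
#include <vector>

// Global with a heap-owning destructor, standing in for state that
// llama.cpp keeps alive for the lifetime of a chat model.
static std::vector<int> g_modelState(1024, 0);

int main() {
    std::thread worker([] {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        g_modelState[0] = 1; // may write into a buffer already freed by ~vector
    });
    worker.detach(); // BUG: never joined, so it can outlive main()

    // The fix is to join (or otherwise stop) the thread before returning,
    // which is what clearChats() accomplishes for the ChatLLM worker threads.
    return 0; // global destructors run after this while `worker` may still be live
}

Under AddressSanitizer this pattern can surface as exactly the heap-use-after-free the comment names; joining the worker before main() returns removes the race.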