Mirror of https://github.com/nomic-ai/gpt4all.git (synced 2025-09-02 17:15:18 +00:00)
vulkan support for typescript bindings, gguf support (#1390)
* adding some native methods to cpp wrapper
* gpu seems to work
* typings and add availibleGpus method
* fix spelling
* fix syntax
* more
* normalize methods to conform to py
* remove extra dynamic linker deps when building with vulkan
* bump python version (library linking fix)
* Don't link against libvulkan.
* vulkan python bindings on windows fixes
* Bring the vulkan backend to the GUI.
* When device is Auto (the default) then we will only consider discrete GPUs, otherwise fall back to CPU.
* Show the device we're currently using.
* Fix up the name and formatting.
* init at most one vulkan device, submodule update fixes issues w/ multiple of the same gpu
* Update the submodule.
* Add version 2.4.15 and bump the version number.
* Fix a bug where we're not properly falling back to CPU.
* Sync to a newer version of llama.cpp with bugfix for vulkan.
* Report the actual device we're using.
* Only show GPU when we're actually using it.
* Bump to new llama with new bugfix.
* Release notes for v2.4.16 and bump the version.
* Fall back to CPU more robustly.
* Release notes for v2.4.17 and bump the version.
* Bump the Python version to python-v1.0.12 to restrict the quants that vulkan recognizes.
* Link against ggml in bin so we can get the available devices without loading a model.
* Send actual and requested device info for those who have opted in.
* Actually bump the version.
* Release notes for v2.4.18 and bump the version.
* Fix for crashes on systems where vulkan is not installed properly.
* Release notes for v2.4.19 and bump the version.
* fix typings and vulkan build works on win
* Add flatpak manifest
* Remove unnecessary stuff from manifest
* Update to 2.4.19
* appdata: update software description
* Latest rebase on llama.cpp with gguf support.
* macos build fixes
* llamamodel: metal supports all quantization types now
* gpt4all.py: GGUF
* pyllmodel: print specific error message
* backend: port BERT to GGUF
* backend: port MPT to GGUF
* backend: port Replit to GGUF
* backend: use gguf branch of llama.cpp-mainline
* backend: use llamamodel.cpp for StarCoder
* conversion scripts: cleanup
* convert scripts: load model as late as possible
* convert_mpt_hf_to_gguf.py: better tokenizer decoding
* backend: use llamamodel.cpp for Falcon
* convert scripts: make them directly executable
* fix references to removed model types
* modellist: fix the system prompt
* backend: port GPT-J to GGUF
* gpt-j: update inference to match latest llama.cpp insights
  - Use F16 KV cache
  - Store transposed V in the cache
  - Avoid unnecessary Q copy
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
  ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
* chatllm: grammar fix
* convert scripts: use bytes_to_unicode from transformers
* convert scripts: make gptj script executable
* convert scripts: add feed-forward length for better compatibility. This GGUF key is used by all llama.cpp models with upstream support.
* gptj: remove unused variables
* Refactor for subgroups on mat * vec kernel.
* Add q6_k kernels for vulkan.
* python binding: print debug message to stderr
* Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf.
* Bump to the latest fixes for vulkan in llama.
* llamamodel: fix static vector in LLamaModel::endTokens
* Switch to new models2.json for new gguf release and bump our version to 2.5.0.
* Bump to latest llama/gguf branch.
* chat: report reason for fallback to CPU
* chat: make sure to clear fallback reason on success
* more accurate fallback descriptions
* differentiate between init failure and unsupported models
* backend: do not use Vulkan with non-LLaMA models
* Add q8_0 kernels to kompute shaders and bump to latest llama/gguf.
* backend: fix build with Visual Studio generator. Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This is needed because Visual Studio is a multi-configuration generator, so we do not know what the build type will be until `cmake --build` is called. Fixes #1470
* remove old llama.cpp submodules
* Reorder and refresh our models2.json.
* rebase on newer llama.cpp
* python/embed4all: use gguf model, allow passing kwargs/overriding model
* Add starcoder, rift and sbert to our models2.json.
* Push a new version number for llmodel backend now that it is based on gguf.
* fix stray comma in models2.json
  Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
* Speculative fix for build on mac.
* chat: clearer CPU fallback messages
* Fix crasher with an empty string for prompt template.
* Update the language here to avoid misunderstanding.
* added EM German Mistral Model
* make codespell happy
* issue template: remove "Related Components" section
* cmake: install the GPT-J plugin (#1487)
* Do not delete saved chats if we fail to serialize properly.
* Restore state from text if necessary.
* Another codespell attempted fix.
* llmodel: do not call magic_match unless build variant is correct (#1488)
* chatllm: do not write uninitialized data to stream (#1486)
* mat*mat for q4_0, q8_0
* do not process prompts on gpu yet
* python: support Path in GPT4All.__init__ (#1462)
* llmodel: print an error if the CPU does not support AVX (#1499)
* python bindings should be quiet by default
* disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is nonempty
* make verbose flag for retrieve_model default false (but also be overridable via gpt4all constructor); should be able to run a basic test:

  ```python
  import gpt4all
  model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
  print(model.generate('def fib(n):'))
  ```

  and see no non-model output when successful
* python: always check status code of HTTP responses (#1502)
* Always save chats to disk, but save them as text by default. This also changes the UI behavior to always open a 'New Chat' and setting it as current instead of setting a restored chat as current. This improves usability by not requiring the user to wait if they want to immediately start chatting.
* Update README.md
  Signed-off-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>
* fix embed4all filename
  https://discordapp.com/channels/1076964370942267462/1093558720690143283/1161778216462192692
  Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
* Improves Java API signatures maintaining back compatibility
* python: replace deprecated pkg_resources with importlib (#1505)
* Updated chat wishlist (#1351)
* q6k, q4_1 mat*mat
* update mini-orca 3b to gguf2, license
  Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
* convert scripts: fix AutoConfig typo (#1512)
* publish config https://docs.npmjs.com/cli/v9/configuring-npm/package-json#publishconfig (#1375); merge into my branch
* fix appendBin
* fix gpu not initializing first
* sync up
* progress, still wip on destructor
* some detection work
* untested dispose method
* add js side of dispose
* Update gpt4all-bindings/typescript/index.cc (Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>; Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>)
* Update gpt4all-bindings/typescript/index.cc (Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>; Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>)
* Update gpt4all-bindings/typescript/index.cc (Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>; Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>)
* Update gpt4all-bindings/typescript/src/gpt4all.d.ts (Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>; Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>)
* Update gpt4all-bindings/typescript/src/gpt4all.js (Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>; Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>)
* Update gpt4all-bindings/typescript/src/util.js (Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>; Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>)
* fix tests
* fix circleci for nodejs
* bump version

---------
Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
Signed-off-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
Co-authored-by: Aaron Miller <apage43@ninjawhale.com>
Co-authored-by: Adam Treat <treat.adam@gmail.com>
Co-authored-by: Akarshan Biswas <akarshan.biswas@gmail.com>
Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
Co-authored-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>
Co-authored-by: Alex Soto <asotobu@gmail.com>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
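For orientation, the headline changes for the TypeScript bindings (GPU device selection at load time and explicit disposal of the native model) can be exercised roughly as follows. This is a minimal sketch based on the spec files touched in this PR; the model filename is only an example, and `loadModel`/`createCompletion` are the package's existing public exports.

```js
import { loadModel, createCompletion } from 'gpt4all'

// Ask for the best available GPU; 'cpu', a vendor name, or a specific GPU name also work.
const model = await loadModel('mistral-7b-openorca.Q4_0.gguf', {
    verbose: true,
    device: 'gpu',
})

const completion = await createCompletion(model, [
    { role: 'user', content: 'What is 1 + 1?' },
])
console.log(completion.choices[0].message)

// New in this release: free the underlying native model when done.
model.dispose()
```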
@@ -856,6 +856,7 @@ jobs:
       - node/install-packages:
           app-dir: gpt4all-bindings/typescript
           pkg-manager: yarn
+          override-ci-command: yarn install
       - run:
           command: |
             cd gpt4all-bindings/typescript
@@ -885,6 +886,7 @@ jobs:
       - node/install-packages:
           app-dir: gpt4all-bindings/typescript
           pkg-manager: yarn
+          override-ci-command: yarn install
       - run:
           command: |
             cd gpt4all-bindings/typescript
@@ -994,7 +996,7 @@ jobs:
           command: |
             cd gpt4all-bindings/typescript
             npm set //registry.npmjs.org/:_authToken=$NPM_TOKEN
-            npm publish --access public --tag alpha
+            npm publish
 
 workflows:
   version: 2
gpt4all-bindings/typescript/.yarnrc.yml (new file, +1)
@@ -0,0 +1 @@
+nodeLinker: node-modules
@@ -75,15 +75,12 @@ cd gpt4all-bindings/typescript
 ```sh
 yarn
 ```
 
 * llama.cpp git submodule for gpt4all can be possibly absent. If this is the case, make sure to run in llama.cpp parent directory
 
 ```sh
 git submodule update --init --depth 1 --recursive
 ```
-
-**AS OF NEW BACKEND** to build the backend,
-
 ```sh
 yarn build:backend
 ```
@@ -1,6 +1,5 @@
 #include "index.h"
 
-Napi::FunctionReference NodeModelWrapper::constructor;
 
 Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
     Napi::Function self = DefineClass(env, "LLModel", {
@@ -13,14 +12,64 @@ Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
        InstanceMethod("embed", &NodeModelWrapper::GenerateEmbedding),
        InstanceMethod("threadCount", &NodeModelWrapper::ThreadCount),
        InstanceMethod("getLibraryPath", &NodeModelWrapper::GetLibraryPath),
+       InstanceMethod("initGpuByString", &NodeModelWrapper::InitGpuByString),
+       InstanceMethod("hasGpuDevice", &NodeModelWrapper::HasGpuDevice),
+       InstanceMethod("listGpu", &NodeModelWrapper::GetGpuDevices),
+       InstanceMethod("memoryNeeded", &NodeModelWrapper::GetRequiredMemory),
+       InstanceMethod("dispose", &NodeModelWrapper::Dispose)
     });
     // Keep a static reference to the constructor
     //
-    constructor = Napi::Persistent(self);
-    constructor.SuppressDestruct();
+    Napi::FunctionReference* constructor = new Napi::FunctionReference();
+    *constructor = Napi::Persistent(self);
+    env.SetInstanceData(constructor);
     return self;
+  }
+
+  Napi::Value NodeModelWrapper::GetRequiredMemory(const Napi::CallbackInfo& info)
+  {
+    auto env = info.Env();
+    return Napi::Number::New(env, static_cast<uint32_t>( llmodel_required_mem(GetInference(), full_model_path.c_str()) ));
+
+  }
+  Napi::Value NodeModelWrapper::GetGpuDevices(const Napi::CallbackInfo& info)
+  {
+    auto env = info.Env();
+    int num_devices = 0;
+    auto mem_size = llmodel_required_mem(GetInference(), full_model_path.c_str());
+    llmodel_gpu_device* all_devices = llmodel_available_gpu_devices(GetInference(), mem_size, &num_devices);
+    if(all_devices == nullptr) {
+      Napi::Error::New(
+        env,
+        "Unable to retrieve list of all GPU devices"
+      ).ThrowAsJavaScriptException();
+      return env.Undefined();
+    }
+    auto js_array = Napi::Array::New(env, num_devices);
+    for(int i = 0; i < num_devices; ++i) {
+      auto gpu_device = all_devices[i];
+      /*
+       *
+       * struct llmodel_gpu_device {
+       *   int index = 0;
+       *   int type = 0;           // same as VkPhysicalDeviceType
+       *   size_t heapSize = 0;
+       *   const char * name;
+       *   const char * vendor;
+       * };
+       *
+       */
+      Napi::Object js_gpu_device = Napi::Object::New(env);
+      js_gpu_device["index"] = uint32_t(gpu_device.index);
+      js_gpu_device["type"] = uint32_t(gpu_device.type);
+      js_gpu_device["heapSize"] = static_cast<uint32_t>( gpu_device.heapSize );
+      js_gpu_device["name"]= gpu_device.name;
+      js_gpu_device["vendor"] = gpu_device.vendor;
+
+      js_array[i] = js_gpu_device;
+    }
+    return js_array;
 }
 
 Napi::Value NodeModelWrapper::getType(const Napi::CallbackInfo& info)
 {
     if(type.empty()) {
@@ -29,15 +78,41 @@ Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
     return Napi::String::New(info.Env(), type);
 }
 
+Napi::Value NodeModelWrapper::InitGpuByString(const Napi::CallbackInfo& info)
+{
+    auto env = info.Env();
+    uint32_t memory_required = info[0].As<Napi::Number>();
+
+    std::string gpu_device_identifier = info[1].As<Napi::String>();
+
+    size_t converted_value;
+    if(memory_required <= std::numeric_limits<size_t>::max()) {
+        converted_value = static_cast<size_t>(memory_required);
+    } else {
+        Napi::Error::New(
+            env,
+            "invalid number for memory size. Exceeded bounds for memory."
+        ).ThrowAsJavaScriptException();
+        return env.Undefined();
+    }
+
+    auto result = llmodel_gpu_init_gpu_device_by_string(GetInference(), converted_value, gpu_device_identifier.c_str());
+    return Napi::Boolean::New(env, result);
+}
+Napi::Value NodeModelWrapper::HasGpuDevice(const Napi::CallbackInfo& info)
+{
+    return Napi::Boolean::New(info.Env(), llmodel_has_gpu_device(GetInference()));
+}
+
 NodeModelWrapper::NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
 {
     auto env = info.Env();
     fs::path model_path;
 
-    std::string full_weight_path;
-    //todo
-    std::string library_path = ".";
-    std::string model_name;
+    std::string full_weight_path,
+                library_path = ".",
+                model_name,
+                device;
     if(info[0].IsString()) {
         model_path = info[0].As<Napi::String>().Utf8Value();
         full_weight_path = model_path.string();
@@ -56,13 +131,14 @@ Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
         } else {
             library_path = ".";
         }
+        device = config_object.Get("device").As<Napi::String>();
     }
     llmodel_set_implementation_search_path(library_path.c_str());
     llmodel_error e = {
         .message="looks good to me",
         .code=0,
     };
-    inference_ = std::make_shared<llmodel_model>(llmodel_model_create2(full_weight_path.c_str(), "auto", &e));
+    inference_ = llmodel_model_create2(full_weight_path.c_str(), "auto", &e);
     if(e.code != 0) {
         Napi::Error::New(env, e.message).ThrowAsJavaScriptException();
         return;
@@ -74,18 +150,45 @@ Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
         Napi::Error::New(env, "Had an issue creating llmodel object, inference is null").ThrowAsJavaScriptException();
         return;
     }
+    if(device != "cpu") {
+        size_t mem = llmodel_required_mem(GetInference(), full_weight_path.c_str());
+        if(mem == 0) {
+            std::cout << "WARNING: no memory needed. does this model support gpu?\n";
+        }
+        std::cout << "Initiating GPU\n";
+        std::cout << "Memory required estimation: " << mem << "\n";
+
+        auto success = llmodel_gpu_init_gpu_device_by_string(GetInference(), mem, device.c_str());
+        if(success) {
+            std::cout << "GPU init successfully\n";
+        } else {
+            std::cout << "WARNING: Failed to init GPU\n";
+        }
+    }
+
     auto success = llmodel_loadModel(GetInference(), full_weight_path.c_str());
     if(!success) {
         Napi::Error::New(env, "Failed to load model at given path").ThrowAsJavaScriptException();
         return;
     }
-    name = model_name.empty() ? model_path.filename().string() : model_name;
-};
-//NodeModelWrapper::~NodeModelWrapper() {
-    //GetInference().reset();
-//}
 
+    name = model_name.empty() ? model_path.filename().string() : model_name;
+    full_model_path = full_weight_path;
+};
+
+// NodeModelWrapper::~NodeModelWrapper() {
+//     if(GetInference() != nullptr) {
+//         std::cout << "Debug: deleting model\n";
+//         llmodel_model_destroy(inference_);
+//         std::cout << (inference_ == nullptr);
+//     }
+// }
+// void NodeModelWrapper::Finalize(Napi::Env env) {
+//     if(inference_ != nullptr) {
+//         std::cout << "Debug: deleting model\n";
+//
+//     }
+// }
 Napi::Value NodeModelWrapper::IsModelLoaded(const Napi::CallbackInfo& info) {
     return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(GetInference()));
 }
@@ -193,8 +296,9 @@ Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
     std::string copiedQuestion = question;
     PromptWorkContext pc = {
         copiedQuestion,
-        std::ref(inference_),
+        inference_,
         copiedPrompt,
+        ""
     };
     auto threadSafeContext = new TsfnContext(env, pc);
     threadSafeContext->tsfn = Napi::ThreadSafeFunction::New(
@@ -210,7 +314,9 @@ Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
     threadSafeContext->nativeThread = std::thread(threadEntry, threadSafeContext);
     return threadSafeContext->deferred_.Promise();
 }
-
+void NodeModelWrapper::Dispose(const Napi::CallbackInfo& info) {
+    llmodel_model_destroy(inference_);
+}
 void NodeModelWrapper::SetThreadCount(const Napi::CallbackInfo& info) {
     if(info[0].IsNumber()) {
         llmodel_setThreadCount(GetInference(), info[0].As<Napi::Number>().Int64Value());
@@ -233,7 +339,7 @@ Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
 }
 
 llmodel_model NodeModelWrapper::GetInference() {
-    return *inference_;
+    return inference_;
 }
 
 //Exports Bindings
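The `InstanceMethod` registrations above are what surface the new native calls to JavaScript. A hedged sketch of poking the raw wrapper directly, mirroring the spec script later in this diff (`model.llm` is the native `LLModel` instance created by this addon; the model filename is only an example):

```js
import { loadModel } from 'gpt4all'

const model = await loadModel('mistral-7b-openorca.Q4_0.gguf', { verbose: true, device: 'gpu' })
const ll = model.llm // native NodeModelWrapper instance

console.log('Required Mem in bytes', ll.memoryNeeded()) // backed by llmodel_required_mem
console.log('Has GPU', ll.hasGpuDevice())               // backed by llmodel_has_gpu_device
console.log('gpu devices', ll.listGpu())                // objects shaped like llmodel_gpu_device

ll.dispose() // backed by llmodel_model_destroy; the wrapper must not be used afterwards
```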
@@ -6,24 +6,33 @@
 #include <atomic>
 #include <memory>
 #include <filesystem>
+#include <set>
 namespace fs = std::filesystem;
 
 
 class NodeModelWrapper: public Napi::ObjectWrap<NodeModelWrapper> {
 public:
   NodeModelWrapper(const Napi::CallbackInfo &);
-  //~NodeModelWrapper();
+  //virtual ~NodeModelWrapper();
   Napi::Value getType(const Napi::CallbackInfo& info);
   Napi::Value IsModelLoaded(const Napi::CallbackInfo& info);
   Napi::Value StateSize(const Napi::CallbackInfo& info);
+  //void Finalize(Napi::Env env) override;
   /**
    * Prompting the model. This entails spawning a new thread and adding the response tokens
    * into a thread local string variable.
    */
   Napi::Value Prompt(const Napi::CallbackInfo& info);
   void SetThreadCount(const Napi::CallbackInfo& info);
+  void Dispose(const Napi::CallbackInfo& info);
   Napi::Value getName(const Napi::CallbackInfo& info);
   Napi::Value ThreadCount(const Napi::CallbackInfo& info);
   Napi::Value GenerateEmbedding(const Napi::CallbackInfo& info);
+  Napi::Value HasGpuDevice(const Napi::CallbackInfo& info);
+  Napi::Value ListGpus(const Napi::CallbackInfo& info);
+  Napi::Value InitGpuByString(const Napi::CallbackInfo& info);
+  Napi::Value GetRequiredMemory(const Napi::CallbackInfo& info);
+  Napi::Value GetGpuDevices(const Napi::CallbackInfo& info);
   /*
    * The path that is used to search for the dynamic libraries
    */
@@ -37,10 +46,10 @@ private:
   /**
    * The underlying inference that interfaces with the C interface
    */
-  std::shared_ptr<llmodel_model> inference_;
+  llmodel_model inference_;
 
   std::string type;
   // corresponds to LLModel::name() in typescript
   std::string name;
-  static Napi::FunctionReference constructor;
+  std::string full_model_path;
 };
@@ -1,6 +1,6 @@
 {
   "name": "gpt4all",
-  "version": "2.2.0",
+  "version": "3.0.0",
   "packageManager": "yarn@3.6.1",
   "main": "src/gpt4all.js",
   "repository": "nomic-ai/gpt4all",
@@ -47,5 +47,10 @@
   },
   "jest": {
     "verbose": true
+  },
+  "publishConfig": {
+    "registry": "https://registry.npmjs.org/",
+    "access": "public",
+    "tag": "latest"
   }
 }
@@ -30,7 +30,7 @@ void threadEntry(TsfnContext* context) {
     context->tsfn.BlockingCall(&context->pc,
         [](Napi::Env env, Napi::Function jsCallback, PromptWorkContext* pc) {
             llmodel_prompt(
-                *pc->inference_,
+                pc->inference_,
                 pc->question.c_str(),
                 &prompt_callback,
                 &response_callback,
@@ -10,7 +10,7 @@
 #include <memory>
 struct PromptWorkContext {
     std::string question;
-    std::shared_ptr<llmodel_model>& inference_;
+    llmodel_model inference_;
     llmodel_prompt_context prompt_params;
     std::string res;
 
@@ -1,8 +1,8 @@
 import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, loadModel } from '../src/gpt4all.js'
 
 const model = await loadModel(
-    'orca-mini-3b-gguf2-q4_0.gguf',
-    { verbose: true }
+    'mistral-7b-openorca.Q4_0.gguf',
+    { verbose: true, device: 'gpu' }
 );
 const ll = model.llm;
 
@@ -26,7 +26,9 @@ console.log("name " + ll.name());
 console.log("type: " + ll.type());
 console.log("Default directory for models", DEFAULT_DIRECTORY);
 console.log("Default directory for libraries", DEFAULT_LIBRARIES_DIRECTORY);
-
+console.log("Has GPU", ll.hasGpuDevice());
+console.log("gpu devices", ll.listGpu())
+console.log("Required Mem in bytes", ll.memoryNeeded())
 const completion1 = await createCompletion(model, [
     { role : 'system', content: 'You are an advanced mathematician.' },
     { role : 'user', content: 'What is 1 + 1?' },
@@ -40,6 +42,8 @@ const completion2 = await createCompletion(model, [
 
 console.log(completion2.choices[0].message)
 
+//CALLING DISPOSE WILL INVALID THE NATIVE MODEL. USE THIS TO CLEANUP
+model.dispose()
 // At the moment, from testing this code, concurrent model prompting is not possible.
 // Behavior: The last prompt gets answered, but the rest are cancelled
 // my experience with threading is not the best, so if anyone who is good is willing to give this a shot,
@@ -47,16 +51,16 @@ console.log(completion2.choices[0].message)
 // INFO: threading with llama.cpp is not the best maybe not even possible, so this will be left here as reference
 
 //const responses = await Promise.all([
-//   createCompletion(ll, [
+//   createCompletion(model, [
 //     { role : 'system', content: 'You are an advanced mathematician.' },
 //     { role : 'user', content: 'What is 1 + 1?' },
 //   ], { verbose: true }),
-//   createCompletion(ll, [
+//   createCompletion(model, [
 //     { role : 'system', content: 'You are an advanced mathematician.' },
 //     { role : 'user', content: 'What is 1 + 1?' },
 //   ], { verbose: true }),
 //
-//createCompletion(ll, [
+//createCompletion(model, [
 //  { role : 'system', content: 'You are an advanced mathematician.' },
 //  { role : 'user', content: 'What is 1 + 1?' },
 //], { verbose: true })
@@ -1,8 +1,6 @@
 import { loadModel, createEmbedding } from '../src/gpt4all.js'
 
-const embedder = await loadModel("ggml-all-MiniLM-L6-v2-f16.bin", { verbose: true })
+const embedder = await loadModel("ggml-all-MiniLM-L6-v2-f16.bin", { verbose: true, type: 'embedding'})
 
-console.log(
-    createEmbedding(embedder, "Accept your current situation")
-)
+console.log(createEmbedding(embedder, "Accept your current situation"))
 
gpt4all-bindings/typescript/src/gpt4all.d.ts (vendored, 68 lines changed)
@@ -61,6 +61,11 @@ declare class InferenceModel {
         prompt: string,
         options?: Partial<LLModelPromptContext>
     ): Promise<string>;
+
+    /**
+     * delete and cleanup the native model
+     */
+    dispose(): void
 }
 
 declare class EmbeddingModel {
@@ -69,6 +74,12 @@ declare class EmbeddingModel {
     config: ModelConfig;
 
     embed(text: string): Float32Array;
+
+    /**
+     * delete and cleanup the native model
+     */
+    dispose(): void
+
 }
 
 /**
@@ -146,6 +157,41 @@ declare class LLModel {
      * Where to get the pluggable backend libraries
      */
     getLibraryPath(): string;
+    /**
+     * Initiate a GPU by a string identifier.
+     * @param {number} memory_required Should be in the range size_t or will throw
+     * @param {string} device_name 'amd' | 'nvidia' | 'intel' | 'gpu' | gpu name.
+     * read LoadModelOptions.device for more information
+     */
+    initGpuByString(memory_required: number, device_name: string): boolean
+    /**
+     * From C documentation
+     * @returns True if a GPU device is successfully initialized, false otherwise.
+     */
+    hasGpuDevice(): boolean
+    /**
+     * GPUs that are usable for this LLModel
+     * @returns
+     */
+    listGpu() : GpuDevice[]
+
+    /**
+     * delete and cleanup the native model
+     */
+    dispose(): void
+}
+/**
+ * an object that contains gpu data on this machine.
+ */
+interface GpuDevice {
+    index: number;
+    /**
+     * same as VkPhysicalDeviceType
+     */
+    type: number;
+    heapSize : number;
+    name: string;
+    vendor: string;
 }
 
 interface LoadModelOptions {
@@ -154,6 +200,21 @@ interface LoadModelOptions {
     modelConfigFile?: string;
     allowDownload?: boolean;
     verbose?: boolean;
+    /* The processing unit on which the model will run. It can be set to
+     * - "cpu": Model will run on the central processing unit.
+     * - "gpu": Model will run on the best available graphics processing unit, irrespective of its vendor.
+     * - "amd", "nvidia", "intel": Model will run on the best available GPU from the specified vendor.
+
+       Alternatively, a specific GPU name can also be provided, and the model will run on the GPU that matches the name
+       if it's available.
+
+       Default is "cpu".
+
+       Note: If a GPU device lacks sufficient RAM to accommodate the model, an error will be thrown, and the GPT4All
+       instance will be rendered invalid. It's advised to ensure the device has enough memory before initiating the
+       model.
+    */
+    device?: string;
 }
 
 interface InferenceModelOptions extends LoadModelOptions {
@@ -184,7 +245,7 @@ declare function loadModel(
 
 declare function loadModel(
     modelName: string,
-    options?: EmbeddingOptions | InferenceOptions
+    options?: EmbeddingModelOptions | InferenceModelOptions
 ): Promise<InferenceModel | EmbeddingModel>;
 
 /**
@@ -401,7 +462,7 @@ declare const DEFAULT_MODEL_CONFIG: ModelConfig;
 /**
  * Default prompt context.
  */
-declare const DEFAULT_PROMT_CONTEXT: LLModelPromptContext;
+declare const DEFAULT_PROMPT_CONTEXT: LLModelPromptContext;
 
 /**
  * Default model list url.
@@ -502,7 +563,7 @@ export {
     DEFAULT_DIRECTORY,
     DEFAULT_LIBRARIES_DIRECTORY,
     DEFAULT_MODEL_CONFIG,
-    DEFAULT_PROMT_CONTEXT,
+    DEFAULT_PROMPT_CONTEXT,
     DEFAULT_MODEL_LIST_URL,
     downloadModel,
     retrieveModel,
@@ -510,4 +571,5 @@ export {
     DownloadController,
     RetrieveModelOptions,
     DownloadModelOptions,
+    GpuDevice
 };
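Putting the new typings together, one plausible pattern (an assumption of this sketch, not something the bindings prescribe) is to enumerate `GpuDevice` entries via `listGpu()` and feed a chosen device name back through `LoadModelOptions.device`:

```js
import { loadModel } from 'gpt4all'

// Probe pass on the CPU purely to enumerate devices (illustrative only).
const probe = await loadModel('mistral-7b-openorca.Q4_0.gguf', { device: 'cpu' })
const gpus = probe.llm.listGpu() // GpuDevice[]: { index, type, heapSize, name, vendor }
probe.dispose()

let device = 'cpu'
if (gpus.length > 0) {
    // Prefer the device with the largest reported heapSize.
    const best = gpus.reduce((a, b) => (a.heapSize >= b.heapSize ? a : b))
    device = best.name
}
const model = await loadModel('mistral-7b-openorca.Q4_0.gguf', { device })
```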
@@ -34,6 +34,7 @@ async function loadModel(modelName, options = {}) {
         type: "inference",
         allowDownload: true,
         verbose: true,
+        device: 'cpu',
         ...options,
     };
 
@@ -61,13 +62,13 @@ async function loadModel(modelName, options = {}) {
         model_name: appendBinSuffixIfMissing(modelName),
         model_path: loadOptions.modelPath,
         library_path: libPath,
+        device: loadOptions.device,
     };
-
     if (loadOptions.verbose) {
         console.debug("Creating LLModel with options:", llmOptions);
     }
     const llmodel = new LLModel(llmOptions);
 
     if (loadOptions.type === "embedding") {
         return new EmbeddingModel(llmodel, modelConfig);
     } else if (loadOptions.type === "inference") {
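The new `device: 'cpu'` default above sits before `...options` in the object literal, so a caller-supplied value wins. A tiny self-contained sketch of that merge semantics (the `defaults`/`options` names here are illustrative; `loadOptions` mirrors the local in `loadModel`):

```js
const defaults = { type: 'inference', allowDownload: true, verbose: true, device: 'cpu' }
const options = { device: 'gpu', verbose: false }

// Later spreads override earlier ones, so the caller's device survives.
const loadOptions = { ...defaults, ...options }
console.log(loadOptions.device)  // 'gpu'
console.log(loadOptions.verbose) // false
```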
@@ -15,6 +15,10 @@ class InferenceModel {
         const result = this.llm.raw_prompt(prompt, normalizedPromptContext, () => {});
         return result;
     }
+
+    dispose() {
+        this.llm.dispose();
+    }
 }
 
 class EmbeddingModel {
@@ -29,6 +33,10 @@ class EmbeddingModel {
     embed(text) {
         return this.llm.embed(text)
     }
+
+    dispose() {
+        this.llm.dispose();
+    }
 }
 
 
@@ -43,8 +43,9 @@ async function listModels(
 }
 
 function appendBinSuffixIfMissing(name) {
-    if (!name.endsWith(".bin")) {
-        return name + ".bin";
+    const ext = path.extname(name);
+    if (![".bin", ".gguf"].includes(ext)) {
+        return name + ".gguf";
     }
     return name;
 }
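For clarity, here is the updated helper as a standalone sketch showing the behavior the change implies: names without a `.bin` or `.gguf` extension now get `.gguf` appended instead of `.bin`.

```js
import path from 'node:path'

// Standalone copy of the updated helper, for illustration only.
function appendBinSuffixIfMissing(name) {
    const ext = path.extname(name)
    if (!['.bin', '.gguf'].includes(ext)) {
        return name + '.gguf'
    }
    return name
}

console.log(appendBinSuffixIfMissing('filename'))             // 'filename.gguf'
console.log(appendBinSuffixIfMissing('filename.bin'))         // 'filename.bin'
console.log(appendBinSuffixIfMissing('mistral-7b.Q4_0.gguf')) // 'mistral-7b.Q4_0.gguf'
```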
@@ -92,7 +92,7 @@ describe("listModels", () => {
 
 describe("appendBinSuffixIfMissing", () => {
     it("should make sure the suffix is there", () => {
-        expect(appendBinSuffixIfMissing("filename")).toBe("filename.bin");
+        expect(appendBinSuffixIfMissing("filename")).toBe("filename.gguf");
         expect(appendBinSuffixIfMissing("filename.bin")).toBe("filename.bin");
     });
 });
@@ -156,11 +156,11 @@ describe("downloadModel", () => {
     test("should successfully download a model file", async () => {
         const downloadController = downloadModel(fakeModelName);
         const modelFilePath = await downloadController.promise;
-        expect(modelFilePath).toBe(path.resolve(DEFAULT_DIRECTORY, `${fakeModelName}.bin`));
+        expect(modelFilePath).toBe(path.resolve(DEFAULT_DIRECTORY, `${fakeModelName}.gguf`));
 
         expect(global.fetch).toHaveBeenCalledTimes(1);
         expect(global.fetch).toHaveBeenCalledWith(
-            "https://gpt4all.io/models/fake-model.bin",
+            "https://gpt4all.io/models/gguf/fake-model.gguf",
             {
                 signal: "signal",
                 headers: {
@@ -189,7 +189,7 @@ describe("downloadModel", () => {
         expect(global.fetch).toHaveBeenCalledTimes(1);
         // the file should be missing
         await expect(
-            fsp.access(path.resolve(DEFAULT_DIRECTORY, `${fakeModelName}.bin`))
+            fsp.access(path.resolve(DEFAULT_DIRECTORY, `${fakeModelName}.gguf`))
         ).rejects.toThrow();
         // partial file should also be missing
         await expect(
@@ -3,8 +3,8 @@
         "order": "a",
         "md5sum": "08d6c05a21512a79a1dfeb9d2a8f262f",
         "name": "Not a real model",
-        "filename": "fake-model.bin",
+        "filename": "fake-model.gguf",
         "filesize": "4",
         "systemPrompt": " "
     }
 ]
File diff suppressed because it is too large