Mirror of https://github.com/nomic-ai/gpt4all.git (synced 2025-08-02 00:00:35 +00:00)
feat(typescript)/dynamic template (#1287)

* remove packaged yarn
* prompt templates update wip
* prompt template update
* system prompt template, update types, remove embed promises, cleanup
* support both snakecased and camelcased prompt context
* fix #1277 libbert, libfalcon and libreplit libs not being moved into the right folder after build
* added support for modelConfigFile param, allowing the user to specify a local file instead of downloading the remote models.json. added a warning message if code fails to load a model config. included prompt context docs by amogus.
* snakecase warning, put logic for loading local models.json into listModels, added constant for the default remote model list url, test improvements, simpler hasOwnProperty call
* add DEFAULT_PROMPT_CONTEXT, export new constants
* add md5sum testcase and fix constants export
* update types
* throw if attempting to list models without a source
* rebuild docs
* fix download logging undefined url, toFixed typo, pass config filesize in for future progress report
* added overload with union types
* bump to 2.2.0, remove alpha
* code speling

Co-authored-by: Andreas Obersteiner <8959303+iimez@users.noreply.github.com>
This commit is contained in:
parent 4d855afe97
commit 4e55940edf
@@ -1,7 +1,7 @@

# GPT4All Node.js API

```sh
yarn add gpt4all@alpha

npm install gpt4all@alpha

pnpm install gpt4all@alpha
```

The original [GPT4All typescript bindings](https://github.com/nomic-ai/gpt4all-ts) are now out of date.

* New bindings created by [jacoobes](https://github.com/jacoobes), [limez](https://github.com/iimez) and the [nomic ai community](https://home.nomic.ai), for all to use.
* The nodejs api has made strides to mirror the python api. It is not 100% mirrored, but many pieces of the api resemble its python counterpart.
* Everything should work out the box.
* See [API Reference](#api-reference)

### Chat Completion (alpha)

```js
import { createCompletion, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });

const response = await createCompletion(model, [
    { role: 'system', content: 'You are meant to be annoying and unhelpful.' },
    { role: 'user', content: 'What is 1 + 1?' }
]);
```

### Embedding (alpha)

```js
import { createEmbedding, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-all-MiniLM-L6-v2-f16', { verbose: true });

const fltArray = createEmbedding(model, "Pain is inevitable, suffering optional");
```

### Build Instructions

* binding.gyp is compile config
* Tested on Ubuntu. Everything seems to work fine
* Tested on Windows. Everything works fine.
* Sparse testing on mac os.
* MingW works as well to build the gpt4all-backend. **HOWEVER**, this package works only with MSVC built dlls.

### Requirements
@@ -48,11 +55,11 @@ const response = await createCompletion(ll, [

* [node-gyp](https://github.com/nodejs/node-gyp)
  * all of its requirements.
* (unix) gcc version 12
* (win) msvc version 143
  * Can be obtained with visual studio 2022 build tools
* python 3

### Build (from source)

```sh
git clone https://github.com/nomic-ai/gpt4all.git
```

@@ -117,22 +124,27 @@ yarn test

* Handling prompting and inference of models in a threadsafe, asynchronous way.

### Known Issues

* why your model may be spewing bull 💩
  * The downloaded model is broken (just reinstall or download from official site)
* That's it so far

### Roadmap

This package is in active development, and breaking changes may happen until the api stabilizes. Here's what's the todo list:

* \[x] prompt models via a threadsafe function in order to have proper non blocking behavior in nodejs
* \[ ] ~~createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete
* \[x] proper unit testing (integrate with circle ci)
* \[x] publish to npm under alpha tag `gpt4all@alpha`
* \[x] have more people test on other platforms (mac tester needed)
* \[x] switch to new pluggable backend
* \[ ] NPM bundle size reduction via optionalDependencies strategy (need help)
  * Should include prebuilds to avoid painful node-gyp errors
* \[ ] createChatSession ( the python equivalent to create\_chat\_session )

### API Reference

<!-- Generated by documentation.js. Update this documentation by updating the source code. -->
@@ -166,13 +178,14 @@ This package is in active development, and breaking changes may happen until the api stabilizes. Here's what's the todo list:

  * [Parameters](#parameters-5)
* [createCompletion](#createcompletion)
  * [Parameters](#parameters-6)
* [createEmbedding](#createembedding)
  * [Parameters](#parameters-7)
* [CompletionOptions](#completionoptions)
  * [verbose](#verbose)
  * [systemPromptTemplate](#systemprompttemplate)
  * [promptTemplate](#prompttemplate)
  * [promptHeader](#promptheader)
  * [promptFooter](#promptfooter)
* [PromptMessage](#promptmessage)
  * [role](#role)
  * [content](#content)

@@ -186,28 +199,31 @@ This package is in active development, and breaking changes may happen until the api stabilizes. Here's what's the todo list:

* [CompletionChoice](#completionchoice)
  * [message](#message)
* [LLModelPromptContext](#llmodelpromptcontext)
  * [logitsSize](#logitssize)
  * [tokensSize](#tokenssize)
  * [nPast](#npast)
  * [nCtx](#nctx)
  * [nPredict](#npredict)
  * [topK](#topk)
  * [topP](#topp)
  * [temp](#temp)
  * [nBatch](#nbatch)
  * [repeatPenalty](#repeatpenalty)
  * [repeatLastN](#repeatlastn)
  * [contextErase](#contexterase)
* [createTokenStream](#createtokenstream)
  * [Parameters](#parameters-8)
* [DEFAULT\_DIRECTORY](#default_directory)
* [DEFAULT\_LIBRARIES\_DIRECTORY](#default_libraries_directory)
* [DEFAULT\_MODEL\_CONFIG](#default_model_config)
* [DEFAULT\_PROMT\_CONTEXT](#default_promt_context)
* [DEFAULT\_MODEL\_LIST\_URL](#default_model_list_url)
* [downloadModel](#downloadmodel)
  * [Parameters](#parameters-9)
  * [Examples](#examples)
* [DownloadModelOptions](#downloadmodeloptions)
  * [modelPath](#modelpath)
  * [verbose](#verbose-1)
  * [url](#url)
  * [md5sum](#md5sum)
* [DownloadController](#downloadcontroller)

@@ -223,6 +239,7 @@ Type: (`"gptj"` | `"llama"` | `"mpt"` | `"replit"`)

#### ModelFile

Full list of models available
@deprecated These model names are outdated and this type will not be maintained, please use a string literal instead

##### gptj

@@ -367,7 +384,7 @@ By default this will download a model from the official GPT4ALL website, if a model is not present at given path.

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The name of the model to load.
* `options` **(LoadModelOptions | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))?** (Optional) Additional options for loading the model.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<(InferenceModel | EmbeddingModel)>** A promise that resolves to an instance of the loaded LLModel.

#### createCompletion
@@ -375,25 +392,10 @@ The nodejs equivalent to python binding's chat\_completion

##### Parameters

* `model` **InferenceModel** The language model object.
* `messages` **[Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[PromptMessage](#promptmessage)>** The array of messages for the conversation.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **[CompletionReturn](#completionreturn)** The completion result.
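The example that used to live here was removed from the generated docs, so here is a minimal hand-written sketch of a call. The model name and prompts are illustrative; the option names come from [CompletionOptions](#completionoptions) below.

```javascript
import { createCompletion, loadModel } from '../src/gpt4all.js'

// Load an inference-capable model first (downloaded automatically if it is not present locally).
const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });

// Messages follow the PromptMessage shape; options extend Partial<LLModelPromptContext>.
const completion = await createCompletion(model, [
    { role: 'system', content: 'You are a concise assistant.' },
    { role: 'user', content: 'Name three primary colors.' }
], { temp: 0.7 });

// The return value follows CompletionReturn (model, usage, choices).
console.log(completion.choices[0].message.content);
console.log(completion.usage.total_tokens);
```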

#### createEmbedding

@@ -403,7 +405,7 @@ meow

##### Parameters

* `model` **EmbeddingModel** The language model object.
* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** text to embed

Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The embedding result.
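A small sketch of one way to use the returned Float32Array, for example comparing two texts with cosine similarity. The similarity helper below is plain JavaScript written for this example and is not part of the package.

```javascript
import { createEmbedding, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-all-MiniLM-L6-v2-f16', { verbose: true });

// Each call returns a Float32Array embedding of the input text.
const a = createEmbedding(model, 'The weather is nice today');
const b = createEmbedding(model, 'It is sunny outside');

// Cosine similarity between two equal-length vectors (helper defined here, not exported by gpt4all).
function cosineSimilarity(x, y) {
    let dot = 0, normX = 0, normY = 0;
    for (let i = 0; i < x.length; i++) {
        dot += x[i] * y[i];
        normX += x[i] * x[i];
        normY += y[i] * y[i];
    }
    return dot / (Math.sqrt(normX) * Math.sqrt(normY));
}

console.log(cosineSimilarity(a, b)); // values closer to 1 mean more similar
```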
@@ -420,17 +422,30 @@ Indicates if verbose logging is enabled.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### systemPromptTemplate

Template for the system message. Will be put before the conversation with %1 being replaced by all system messages.
Note that if this is not defined, system messages will not be included in the prompt.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### promptTemplate

Template for user messages, with %1 being replaced by the message.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### promptHeader

The initial instruction for the model, on top of the prompt

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### promptFooter

The last instruction for the model, appended to the end of the prompt.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
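The template fields above are new in this release. Below is a hedged sketch of how they could be combined in a createCompletion call; the Alpaca-style template strings are illustrative assumptions, not shipped defaults.

```javascript
import { createCompletion, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });

const response = await createCompletion(model, [
    { role: 'system', content: 'You answer in one short sentence.' },
    { role: 'user', content: 'What is the capital of France?' }
], {
    // %1 is replaced by the system/user message text, per the field docs above.
    systemPromptTemplate: '### System:\n%1\n',
    promptTemplate: '### Human:\n%1\n### Assistant:\n',
    promptHeader: 'Below is a conversation between a user and an assistant.',
    promptFooter: 'Keep answers short.'
});

console.log(response.choices[0].message.content);
```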

#### PromptMessage

@@ -472,9 +487,9 @@ The result of the completion, similar to OpenAI's format.

##### model

The model used for the completion.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### usage
@@ -502,73 +517,100 @@ Type: [PromptMessage](#promptmessage)

Model inference arguments for generating completions.

##### logitsSize

The size of the raw logits vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### tokensSize

The size of the raw tokens vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPast

The number of tokens in the past conversation.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nCtx

The number of tokens possible in the context window.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPredict

The number of tokens to predict.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topK

The top-k logits to sample from.
Top-K sampling selects the next token only from the top K most likely tokens predicted by the model.
It helps reduce the risk of generating low-probability or nonsensical tokens, but it may also limit
the diversity of the output. A higher value for top-K (e.g., 100) will consider more tokens and lead
to more diverse text, while a lower value (e.g., 10) will focus on the most probable tokens and generate
more conservative text. 30 - 60 is a good range for most tasks.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topP

The nucleus sampling probability threshold.
Top-P limits the selection of the next token to a subset of tokens with a cumulative probability
above a threshold P. This method, also known as nucleus sampling, finds a balance between diversity
and quality by considering both token probabilities and the number of tokens available for sampling.
When using a higher value for top-P (e.g., 0.95), the generated text becomes more diverse.
On the other hand, a lower value (e.g., 0.1) produces more focused and conservative text.
The default value is 0.4, which is aimed to be the middle ground between focus and diversity, but
for more creative tasks a higher top-p value will be beneficial, about 0.5-0.9 is a good range for that.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### temp

The temperature to adjust the model's output distribution.
Temperature is like a knob that adjusts how creative or focused the output becomes. Higher temperatures
(e.g., 1.2) increase randomness, resulting in more imaginative and diverse text. Lower temperatures (e.g., 0.5)
make the output more focused, predictable, and conservative. When the temperature is set to 0, the output
becomes completely deterministic, always selecting the most probable next token and producing identical results
each time. A safe range would be around 0.6 - 0.85, but you are free to search what value fits best for you.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nBatch

The number of predictions to generate in parallel.
By splitting the prompt every N tokens, prompt-batch-size reduces RAM usage during processing. However,
this can increase the processing time as a trade-off. If the N value is set too low (e.g., 10), long prompts
with 500+ tokens will be most affected, requiring numerous processing runs to complete the prompt processing.
To ensure optimal performance, setting the prompt-batch-size to 2048 allows processing of all tokens in a single run.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatPenalty

The penalty factor for repeated tokens.
Repeat-penalty can help penalize tokens based on how frequently they occur in the text, including the input prompt.
A token that has already appeared five times is penalized more heavily than a token that has appeared only one time.
A value of 1 means that there is no penalty and values larger than 1 discourage repeated tokens.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatLastN

The number of last tokens to penalize.
The repeat-penalty-tokens N option controls the number of tokens in the history to consider for penalizing repetition.
A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only
consider recent tokens.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### contextErase

The percentage of context to erase if the context window is exceeded.
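Because [CompletionOptions](#completionoptions) extends Partial\<LLModelPromptContext>, these fields can be passed directly to createCompletion. A short sketch with illustrative (not default) values:

```javascript
import { createCompletion, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });

const response = await createCompletion(model, [
    { role: 'user', content: 'Write a haiku about autumn.' }
], {
    nPredict: 128,       // cap the number of generated tokens
    temp: 0.7,           // moderate creativity
    topK: 40,            // sample from the 40 most likely tokens
    topP: 0.4,           // nucleus sampling threshold
    repeatPenalty: 1.18, // discourage repetition
    repeatLastN: 64,     // penalize repeats within the last 64 tokens
    nBatch: 8            // prompt batch size
});

console.log(response.choices[0].message.content);
```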
@@ -602,21 +644,39 @@ This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DEFAULT\_MODEL\_CONFIG

Default model configuration.

Type: ModelConfig

#### DEFAULT\_PROMT\_CONTEXT

Default prompt context.

Type: [LLModelPromptContext](#llmodelpromptcontext)

#### DEFAULT\_MODEL\_LIST\_URL

Default model list url.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### downloadModel

Initiates the download of a model file.
By default this downloads without waiting. Use the controller returned to alter this behavior.

##### Parameters

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The model to be downloaded.
* `options` **DownloadOptions** to pass into the downloader. Default is { location: (cwd), verbose: false }.

##### Examples

```javascript
const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin')
download.promise.then(() => console.log('Downloaded!'))
```

* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model already exists in the specified location.
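A sketch of a download with explicit [DownloadModelOptions](#downloadmodeloptions); the md5 value is a placeholder, not the real checksum of this file.

```javascript
import { downloadModel } from '../src/gpt4all.js'

const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin', {
    modelPath: './models',            // store the file here instead of the cwd
    verbose: true,                    // log how long the download took
    md5sum: '<expected md5 checksum>' // placeholder; on mismatch the file is deleted and an error thrown
});

try {
    const config = await download.promise; // resolves to the downloaded model's config
    console.log('Downloaded', config);
} catch (err) {
    console.error('Download failed or checksum did not match:', err);
}
```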
@@ -635,7 +695,7 @@ Default is process.cwd(), or the current working directory

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### verbose

Debug mode -- check how long it took to download in seconds

@@ -643,15 +703,16 @@ Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### url

Remote download url. Defaults to `https://gpt4all.io/models/<modelName>`

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### md5sum

MD5 sum of the model file. If this is provided, the downloaded file will be checked against this sum.
If the sums do not match, an error will be thrown and the file will be deleted.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DownloadController
@@ -659,12 +720,12 @@ Model download controller.

##### cancel

Cancel the request to download if this is called.

Type: function (): void

##### promise

A promise resolving to the downloaded models config once the download is done

Type: [Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<ModelConfig>
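A sketch of using the controller to give up on a slow download. The 10 second timeout is illustrative, and the assumption here is that a cancelled or failed download rejects the promise.

```javascript
import { downloadModel } from '../src/gpt4all.js'

const controller = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin');

// Illustrative: cancel if the download is still running after 10 seconds.
const timeout = setTimeout(() => controller.cancel(), 10_000);

try {
    const config = await controller.promise; // ModelConfig of the downloaded model
    console.log('Finished downloading:', config);
} catch (err) {
    console.error('Download did not complete:', err);
} finally {
    clearTimeout(timeout);
}
```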
File diff suppressed because one or more lines are too long
@@ -1 +0,0 @@
yarnPath: .yarn/releases/yarn-3.6.1.cjs (removed)

@@ -11,36 +11,34 @@ pnpm install gpt4all@alpha

The original [GPT4All typescript bindings](https://github.com/nomic-ai/gpt4all-ts) are now out of date.

* New bindings created by [jacoobes](https://github.com/jacoobes), [limez](https://github.com/iimez) and the [nomic ai community](https://home.nomic.ai), for all to use.
* The nodejs api has made strides to mirror the python api. It is not 100% mirrored, but many pieces of the api resemble its python counterpart.
* Everything should work out the box.
* See [API Reference](#api-reference)

### Chat Completion

```js
import { createCompletion, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });

const response = await createCompletion(model, [
    { role: 'system', content: 'You are meant to be annoying and unhelpful.' },
    { role: 'user', content: 'What is 1 + 1?' }
]);
```

### Embedding

```js
import { createEmbedding, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-all-MiniLM-L6-v2-f16', { verbose: true });

const fltArray = createEmbedding(model, "Pain is inevitable, suffering optional");
```

### Build Instructions

* binding.gyp is compile config

@@ -60,6 +58,7 @@ const fltArray = createEmbedding(ll, "Pain is inevitable, suffering optional");

* (win) msvc version 143
  * Can be obtained with visual studio 2022 build tools
* python 3

### Build (from source)

@@ -125,15 +124,12 @@ yarn test

* Handling prompting and inference of models in a threadsafe, asynchronous way.

### Known Issues

* why your model may be spewing bull 💩
  * The downloaded model is broken (just reinstall or download from official site)
* That's it so far

### Roadmap

This package is in active development, and breaking changes may happen until the api stabilizes. Here's what's the todo list:

@@ -144,7 +140,592 @@ This package is in active development, and breaking changes may happen until the api stabilizes. Here's what's the todo list:

* \[x] publish to npm under alpha tag `gpt4all@alpha`
* \[x] have more people test on other platforms (mac tester needed)
* \[x] switch to new pluggable backend
* \[ ] NPM bundle size reduction via optionalDependencies strategy (need help)
  * Should include prebuilds to avoid painful node-gyp errors
* \[ ] createChatSession ( the python equivalent to create\_chat\_session )

### API Reference

<!-- Generated by documentation.js. Update this documentation by updating the source code. -->

##### Table of Contents

* [ModelType](#modeltype)
* [ModelFile](#modelfile)
  * [gptj](#gptj)
  * [llama](#llama)
  * [mpt](#mpt)
  * [replit](#replit)
* [type](#type)
* [LLModel](#llmodel)
  * [constructor](#constructor)
    * [Parameters](#parameters)
  * [type](#type-1)
  * [name](#name)
  * [stateSize](#statesize)
  * [threadCount](#threadcount)
  * [setThreadCount](#setthreadcount)
    * [Parameters](#parameters-1)
  * [raw\_prompt](#raw_prompt)
    * [Parameters](#parameters-2)
  * [embed](#embed)
    * [Parameters](#parameters-3)
  * [isModelLoaded](#ismodelloaded)
  * [setLibraryPath](#setlibrarypath)
    * [Parameters](#parameters-4)
  * [getLibraryPath](#getlibrarypath)
* [loadModel](#loadmodel)
  * [Parameters](#parameters-5)
* [createCompletion](#createcompletion)
  * [Parameters](#parameters-6)
* [createEmbedding](#createembedding)
  * [Parameters](#parameters-7)
* [CompletionOptions](#completionoptions)
  * [verbose](#verbose)
  * [systemPromptTemplate](#systemprompttemplate)
  * [promptTemplate](#prompttemplate)
  * [promptHeader](#promptheader)
  * [promptFooter](#promptfooter)
* [PromptMessage](#promptmessage)
  * [role](#role)
  * [content](#content)
* [prompt\_tokens](#prompt_tokens)
* [completion\_tokens](#completion_tokens)
* [total\_tokens](#total_tokens)
* [CompletionReturn](#completionreturn)
  * [model](#model)
  * [usage](#usage)
  * [choices](#choices)
* [CompletionChoice](#completionchoice)
  * [message](#message)
* [LLModelPromptContext](#llmodelpromptcontext)
  * [logitsSize](#logitssize)
  * [tokensSize](#tokenssize)
  * [nPast](#npast)
  * [nCtx](#nctx)
  * [nPredict](#npredict)
  * [topK](#topk)
  * [topP](#topp)
  * [temp](#temp)
  * [nBatch](#nbatch)
  * [repeatPenalty](#repeatpenalty)
  * [repeatLastN](#repeatlastn)
  * [contextErase](#contexterase)
* [createTokenStream](#createtokenstream)
  * [Parameters](#parameters-8)
* [DEFAULT\_DIRECTORY](#default_directory)
* [DEFAULT\_LIBRARIES\_DIRECTORY](#default_libraries_directory)
* [DEFAULT\_MODEL\_CONFIG](#default_model_config)
* [DEFAULT\_PROMT\_CONTEXT](#default_promt_context)
* [DEFAULT\_MODEL\_LIST\_URL](#default_model_list_url)
* [downloadModel](#downloadmodel)
  * [Parameters](#parameters-9)
  * [Examples](#examples)
* [DownloadModelOptions](#downloadmodeloptions)
  * [modelPath](#modelpath)
  * [verbose](#verbose-1)
  * [url](#url)
  * [md5sum](#md5sum)
* [DownloadController](#downloadcontroller)
  * [cancel](#cancel)
  * [promise](#promise)

#### ModelType

Type of the model

Type: (`"gptj"` | `"llama"` | `"mpt"` | `"replit"`)

#### ModelFile

Full list of models available
@deprecated These model names are outdated and this type will not be maintained, please use a string literal instead

##### gptj

List of GPT-J Models

Type: (`"ggml-gpt4all-j-v1.3-groovy.bin"` | `"ggml-gpt4all-j-v1.2-jazzy.bin"` | `"ggml-gpt4all-j-v1.1-breezy.bin"` | `"ggml-gpt4all-j.bin"`)

##### llama

List Llama Models

Type: (`"ggml-gpt4all-l13b-snoozy.bin"` | `"ggml-vicuna-7b-1.1-q4_2.bin"` | `"ggml-vicuna-13b-1.1-q4_2.bin"` | `"ggml-wizardLM-7B.q4_2.bin"` | `"ggml-stable-vicuna-13B.q4_2.bin"` | `"ggml-nous-gpt4-vicuna-13b.bin"` | `"ggml-v3-13b-hermes-q5_1.bin"`)

##### mpt

List of MPT Models

Type: (`"ggml-mpt-7b-base.bin"` | `"ggml-mpt-7b-chat.bin"` | `"ggml-mpt-7b-instruct.bin"`)

##### replit

List of Replit Models

Type: `"ggml-replit-code-v1-3b.bin"`

#### type

Model architecture. This argument currently does not have any functionality and is just used as a descriptive identifier for the user.

Type: [ModelType](#modeltype)

#### LLModel

LLModel class representing a language model.
This is a base class that provides common functionality for different types of language models.

##### constructor

Initialize a new LLModel.

###### Parameters

* `path` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** Absolute path to the model file.

<!---->

* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model file does not exist.

##### type

either 'gptj', 'mpt' or 'llama', or undefined

Returns **([ModelType](#modeltype) | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))**

##### name

The name of the model.

Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

##### stateSize

Get the size of the internal state of the model.
NOTE: This state data is specific to the type of model you have created.

Returns **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** the size in bytes of the internal state of the model

##### threadCount

Get the number of threads used for model inference.
The default is the number of physical cores your computer has.

Returns **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** The number of threads used for model inference.

##### setThreadCount

Set the number of threads used for model inference.

###### Parameters

* `newNumber` **[number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)** The new number of threads.

Returns **void**

##### raw\_prompt

Prompt the model with a given input and optional parameters.
This is the raw output from model.
Use the prompt function exported for a value

###### Parameters

* `q` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The prompt input.
* `params` **Partial<[LLModelPromptContext](#llmodelpromptcontext)>** Optional parameters for the prompt context.
* `callback` **function (res: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)): void**

Returns **void** The result of the model prompt.

##### embed

Embed text with the model. Keep in mind that
not all models can embed text (only bert can embed as of 07/16/2023 (mm/dd/yyyy)).
Use the prompt function exported for a value

###### Parameters

* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**
* `q` The prompt input.
* `params` Optional parameters for the prompt context.

Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The result of the model prompt.

##### isModelLoaded

Whether the model is loaded or not.

Returns **[boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)**

##### setLibraryPath

Where to search for the pluggable backend libraries

###### Parameters

* `s` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

Returns **void**

##### getLibraryPath

Where to get the pluggable backend libraries

Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)**

#### loadModel

Loads a machine learning model with the specified name. The de facto way to create a model.
By default this will download a model from the official GPT4ALL website, if a model is not present at given path.

##### Parameters

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The name of the model to load.
* `options` **(LoadModelOptions | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))?** (Optional) Additional options for loading the model.

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<(InferenceModel | EmbeddingModel)>** A promise that resolves to an instance of the loaded LLModel.

#### createCompletion

The nodejs equivalent to python binding's chat\_completion

##### Parameters

* `model` **InferenceModel** The language model object.
* `messages` **[Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[PromptMessage](#promptmessage)>** The array of messages for the conversation.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.

Returns **[CompletionReturn](#completionreturn)** The completion result.

#### createEmbedding

The nodejs moral equivalent to python binding's Embed4All().embed()
meow

##### Parameters

* `model` **EmbeddingModel** The language model object.
* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** text to embed

Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The embedding result.

#### CompletionOptions

**Extends Partial\<LLModelPromptContext>**

The options for creating the completion.

##### verbose

Indicates if verbose logging is enabled.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

##### systemPromptTemplate

Template for the system message. Will be put before the conversation with %1 being replaced by all system messages.
Note that if this is not defined, system messages will not be included in the prompt.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### promptTemplate

Template for user messages, with %1 being replaced by the message.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### promptHeader

The initial instruction for the model, on top of the prompt

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### promptFooter

The last instruction for the model, appended to the end of the prompt.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### PromptMessage

A message in the conversation, identical to OpenAI's chat message.

##### role

The role of the message.

Type: (`"system"` | `"assistant"` | `"user"`)

##### content

The message content.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### prompt\_tokens

The number of tokens used in the prompt.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### completion\_tokens

The number of tokens used in the completion.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### total\_tokens

The total number of tokens used.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### CompletionReturn

The result of the completion, similar to OpenAI's format.

##### model

The model used for the completion.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

##### usage

Token usage report.

Type: {prompt\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), completion\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number), total\_tokens: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)}

##### choices

The generated completions.

Type: [Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[CompletionChoice](#completionchoice)>

#### CompletionChoice

A completion choice, similar to OpenAI's format.

##### message

Response message

Type: [PromptMessage](#promptmessage)

#### LLModelPromptContext

Model inference arguments for generating completions.

##### logitsSize

The size of the raw logits vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### tokensSize

The size of the raw tokens vector.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPast

The number of tokens in the past conversation.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nCtx

The number of tokens possible in the context window.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nPredict

The number of tokens to predict.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topK

The top-k logits to sample from.
Top-K sampling selects the next token only from the top K most likely tokens predicted by the model.
It helps reduce the risk of generating low-probability or nonsensical tokens, but it may also limit
the diversity of the output. A higher value for top-K (e.g., 100) will consider more tokens and lead
to more diverse text, while a lower value (e.g., 10) will focus on the most probable tokens and generate
more conservative text. 30 - 60 is a good range for most tasks.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### topP

The nucleus sampling probability threshold.
Top-P limits the selection of the next token to a subset of tokens with a cumulative probability
above a threshold P. This method, also known as nucleus sampling, finds a balance between diversity
and quality by considering both token probabilities and the number of tokens available for sampling.
When using a higher value for top-P (e.g., 0.95), the generated text becomes more diverse.
On the other hand, a lower value (e.g., 0.1) produces more focused and conservative text.
The default value is 0.4, which is aimed to be the middle ground between focus and diversity, but
for more creative tasks a higher top-p value will be beneficial, about 0.5-0.9 is a good range for that.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### temp

The temperature to adjust the model's output distribution.
Temperature is like a knob that adjusts how creative or focused the output becomes. Higher temperatures
(e.g., 1.2) increase randomness, resulting in more imaginative and diverse text. Lower temperatures (e.g., 0.5)
make the output more focused, predictable, and conservative. When the temperature is set to 0, the output
becomes completely deterministic, always selecting the most probable next token and producing identical results
each time. A safe range would be around 0.6 - 0.85, but you are free to search what value fits best for you.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### nBatch

The number of predictions to generate in parallel.
By splitting the prompt every N tokens, prompt-batch-size reduces RAM usage during processing. However,
this can increase the processing time as a trade-off. If the N value is set too low (e.g., 10), long prompts
with 500+ tokens will be most affected, requiring numerous processing runs to complete the prompt processing.
To ensure optimal performance, setting the prompt-batch-size to 2048 allows processing of all tokens in a single run.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatPenalty

The penalty factor for repeated tokens.
Repeat-penalty can help penalize tokens based on how frequently they occur in the text, including the input prompt.
A token that has already appeared five times is penalized more heavily than a token that has appeared only one time.
A value of 1 means that there is no penalty and values larger than 1 discourage repeated tokens.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### repeatLastN

The number of last tokens to penalize.
The repeat-penalty-tokens N option controls the number of tokens in the history to consider for penalizing repetition.
A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only
consider recent tokens.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

##### contextErase

The percentage of context to erase if the context window is exceeded.

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

#### createTokenStream

TODO: Help wanted to implement this

##### Parameters

* `llmodel` **[LLModel](#llmodel)**
* `messages` **[Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[PromptMessage](#promptmessage)>**
* `options` **[CompletionOptions](#completionoptions)**

Returns **function (ll: [LLModel](#llmodel)): AsyncGenerator<[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)>**

#### DEFAULT\_DIRECTORY

From python api:
models will be stored in (homedir)/.cache/gpt4all/

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DEFAULT\_LIBRARIES\_DIRECTORY

From python api:
The default path for dynamic libraries to be stored.
You may separate paths by a semicolon to search in multiple areas.
This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### DEFAULT\_MODEL\_CONFIG

Default model configuration.

Type: ModelConfig

#### DEFAULT\_PROMT\_CONTEXT

Default prompt context.

Type: [LLModelPromptContext](#llmodelpromptcontext)

#### DEFAULT\_MODEL\_LIST\_URL

Default model list url.

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)

#### downloadModel

Initiates the download of a model file.
By default this downloads without waiting. Use the controller returned to alter this behavior.

##### Parameters

* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The model to be downloaded.
* `options` **DownloadOptions** to pass into the downloader. Default is { location: (cwd), verbose: false }.

##### Examples

```javascript
const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin')
download.promise.then(() => console.log('Downloaded!'))
```

* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model already exists in the specified location.
* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model cannot be found at the specified url.

Returns **[DownloadController](#downloadcontroller)** object that allows controlling the download process.

#### DownloadModelOptions

Options for the model download process.

##### modelPath

location to download the model.
Default is process.cwd(), or the current working directory

Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
|
||||||
|
|
||||||
|
##### verbose
|
||||||
|
|
||||||
|
Debug mode -- check how long it took to download in seconds
|
||||||
|
|
||||||
|
Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)
|
||||||
|
|
||||||
|
##### url
|
||||||
|
|
||||||
|
Remote download url. Defaults to `https://gpt4all.io/models/<modelName>`
|
||||||
|
|
||||||
|
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
|
||||||
|
|
||||||
|
##### md5sum
|
||||||
|
|
||||||
|
MD5 sum of the model file. If this is provided, the downloaded file will be checked against this sum.
|
||||||
|
If the sums do not match, an error will be thrown and the file will be deleted.
|
||||||
|
|
||||||
|
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
|
||||||
|
|
||||||
|
#### DownloadController
|
||||||
|
|
||||||
|
Model download controller.
|
||||||
|
|
||||||
|
##### cancel
|
||||||
|
|
||||||
|
Cancel the request to download if this is called.
|
||||||
|
|
||||||
|
Type: function (): void
|
||||||
|
|
||||||
|
##### promise
|
||||||
|
|
||||||
|
A promise resolving to the downloaded models config once the download is done
|
||||||
|
|
||||||
|
Type: [Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<ModelConfig>
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
{
|
{
|
||||||
"name": "gpt4all",
|
"name": "gpt4all",
|
||||||
"version": "2.1.1-alpha",
|
"version": "2.2.0",
|
||||||
"packageManager": "yarn@3.6.1",
|
"packageManager": "yarn@3.6.1",
|
||||||
"main": "src/gpt4all.js",
|
"main": "src/gpt4all.js",
|
||||||
"repository": "nomic-ai/gpt4all",
|
"repository": "nomic-ai/gpt4all",
|
||||||
@ -10,8 +10,8 @@
|
|||||||
"build:backend": "node scripts/build.js",
|
"build:backend": "node scripts/build.js",
|
||||||
"build": "node-gyp-build",
|
"build": "node-gyp-build",
|
||||||
"predocs:build": "node scripts/docs.js",
|
"predocs:build": "node scripts/docs.js",
|
||||||
"docs:build": "documentation readme ./src/gpt4all.d.ts --parse-extension js d.ts --format md --section documentation --readme-file ../python/docs/gpt4all_typescript.md",
|
"docs:build": "documentation readme ./src/gpt4all.d.ts --parse-extension js d.ts --format md --section \"API Reference\" --readme-file ../python/docs/gpt4all_typescript.md",
|
||||||
"postdocs:build": "node scripts/docs.js"
|
"postdocs:build": "documentation readme ./src/gpt4all.d.ts --parse-extension js d.ts --format md --section \"API Reference\" --readme-file README.md"
|
||||||
},
|
},
|
||||||
"files": [
|
"files": [
|
||||||
"src/**/*",
|
"src/**/*",
|
||||||
|
@ -24,7 +24,9 @@ mkdir -p "$NATIVE_DIR" "$BUILD_DIR"
|
|||||||
|
|
||||||
cmake -S ../../gpt4all-backend -B "$BUILD_DIR" &&
|
cmake -S ../../gpt4all-backend -B "$BUILD_DIR" &&
|
||||||
cmake --build "$BUILD_DIR" -j --config Release && {
|
cmake --build "$BUILD_DIR" -j --config Release && {
|
||||||
cp "$BUILD_DIR"/libllmodel.$LIB_EXT "$NATIVE_DIR"/
|
cp "$BUILD_DIR"/libbert*.$LIB_EXT "$NATIVE_DIR"/
|
||||||
|
cp "$BUILD_DIR"/libfalcon*.$LIB_EXT "$NATIVE_DIR"/
|
||||||
|
cp "$BUILD_DIR"/libreplit*.$LIB_EXT "$NATIVE_DIR"/
|
||||||
cp "$BUILD_DIR"/libgptj*.$LIB_EXT "$NATIVE_DIR"/
|
cp "$BUILD_DIR"/libgptj*.$LIB_EXT "$NATIVE_DIR"/
|
||||||
cp "$BUILD_DIR"/libllama*.$LIB_EXT "$NATIVE_DIR"/
|
cp "$BUILD_DIR"/libllama*.$LIB_EXT "$NATIVE_DIR"/
|
||||||
cp "$BUILD_DIR"/libmpt*.$LIB_EXT "$NATIVE_DIR"/
|
cp "$BUILD_DIR"/libmpt*.$LIB_EXT "$NATIVE_DIR"/
|
||||||
|
@ -1,9 +1,10 @@
|
|||||||
import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, loadModel } from '../src/gpt4all.js'
|
import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY, loadModel } from '../src/gpt4all.js'
|
||||||
|
|
||||||
const ll = await loadModel(
|
const model = await loadModel(
|
||||||
'orca-mini-3b.ggmlv3.q4_0.bin',
|
'orca-mini-3b.ggmlv3.q4_0.bin',
|
||||||
{ verbose: true }
|
{ verbose: true }
|
||||||
);
|
);
|
||||||
|
const ll = model.llm;
|
||||||
|
|
||||||
try {
|
try {
|
||||||
class Extended extends LLModel {
|
class Extended extends LLModel {
|
||||||
@ -26,13 +27,13 @@ console.log("type: " + ll.type());
|
|||||||
console.log("Default directory for models", DEFAULT_DIRECTORY);
|
console.log("Default directory for models", DEFAULT_DIRECTORY);
|
||||||
console.log("Default directory for libraries", DEFAULT_LIBRARIES_DIRECTORY);
|
console.log("Default directory for libraries", DEFAULT_LIBRARIES_DIRECTORY);
|
||||||
|
|
||||||
const completion1 = await createCompletion(ll, [
|
const completion1 = await createCompletion(model, [
|
||||||
{ role : 'system', content: 'You are an advanced mathematician.' },
|
{ role : 'system', content: 'You are an advanced mathematician.' },
|
||||||
{ role : 'user', content: 'What is 1 + 1?' },
|
{ role : 'user', content: 'What is 1 + 1?' },
|
||||||
], { verbose: true })
|
], { verbose: true })
|
||||||
console.log(completion1.choices[0].message)
|
console.log(completion1.choices[0].message)
|
||||||
|
|
||||||
const completion2 = await createCompletion(ll, [
|
const completion2 = await createCompletion(model, [
|
||||||
{ role : 'system', content: 'You are an advanced mathematician.' },
|
{ role : 'system', content: 'You are an advanced mathematician.' },
|
||||||
{ role : 'user', content: 'What is two plus two?' },
|
{ role : 'user', content: 'What is two plus two?' },
|
||||||
], { verbose: true })
|
], { verbose: true })
|
||||||
|
@ -16,7 +16,26 @@ const librarySearchPaths = [
|
|||||||
|
|
||||||
const DEFAULT_LIBRARIES_DIRECTORY = librarySearchPaths.join(";");
|
const DEFAULT_LIBRARIES_DIRECTORY = librarySearchPaths.join(";");
|
||||||
|
|
||||||
|
const DEFAULT_MODEL_CONFIG = {
|
||||||
|
systemPrompt: "",
|
||||||
|
promptTemplate: "### Human: \n%1\n### Assistant:\n",
|
||||||
|
}
|
||||||
|
|
||||||
|
const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models.json";
|
||||||
|
|
||||||
|
const DEFAULT_PROMPT_CONTEXT = {
|
||||||
|
temp: 0.7,
|
||||||
|
topK: 40,
|
||||||
|
topP: 0.4,
|
||||||
|
repeatPenalty: 1.18,
|
||||||
|
repeatLastN: 64,
|
||||||
|
nBatch: 8,
|
||||||
|
}
|
||||||
|
|
||||||
module.exports = {
|
module.exports = {
|
||||||
DEFAULT_DIRECTORY,
|
DEFAULT_DIRECTORY,
|
||||||
DEFAULT_LIBRARIES_DIRECTORY,
|
DEFAULT_LIBRARIES_DIRECTORY,
|
||||||
|
DEFAULT_MODEL_CONFIG,
|
||||||
|
DEFAULT_MODEL_LIST_URL,
|
||||||
|
DEFAULT_PROMPT_CONTEXT,
|
||||||
};
|
};
|
||||||
|
241
gpt4all-bindings/typescript/src/gpt4all.d.ts
vendored
241
gpt4all-bindings/typescript/src/gpt4all.d.ts
vendored
@ -1,12 +1,13 @@
|
|||||||
/// <reference types="node" />
|
/// <reference types="node" />
|
||||||
declare module "gpt4all";
|
declare module "gpt4all";
|
||||||
|
|
||||||
|
|
||||||
/** Type of the model */
|
/** Type of the model */
|
||||||
type ModelType = "gptj" | "llama" | "mpt" | "replit";
|
type ModelType = "gptj" | "llama" | "mpt" | "replit";
|
||||||
|
|
||||||
|
// NOTE: "deprecated" tag in below comment breaks the doc generator https://github.com/documentationjs/documentation/issues/1596
|
||||||
/**
|
/**
|
||||||
* Full list of models available
|
* Full list of models available
|
||||||
|
* @deprecated These model names are outdated and this type will not be maintained, please use a string literal instead
|
||||||
*/
|
*/
|
||||||
interface ModelFile {
|
interface ModelFile {
|
||||||
/** List of GPT-J Models */
|
/** List of GPT-J Models */
|
||||||
@ -39,10 +40,37 @@ interface LLModelOptions {
|
|||||||
* Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user.
|
* Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user.
|
||||||
*/
|
*/
|
||||||
type?: ModelType;
|
type?: ModelType;
|
||||||
model_name: ModelFile[ModelType];
|
model_name: string;
|
||||||
model_path: string;
|
model_path: string;
|
||||||
library_path?: string;
|
library_path?: string;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
interface ModelConfig {
|
||||||
|
systemPrompt: string;
|
||||||
|
promptTemplate: string;
|
||||||
|
path: string;
|
||||||
|
url?: string;
|
||||||
|
}
|
||||||
|
|
||||||
|
declare class InferenceModel {
|
||||||
|
constructor(llm: LLModel, config: ModelConfig);
|
||||||
|
llm: LLModel;
|
||||||
|
config: ModelConfig;
|
||||||
|
|
||||||
|
generate(
|
||||||
|
prompt: string,
|
||||||
|
options?: Partial<LLModelPromptContext>
|
||||||
|
): Promise<string>;
|
||||||
|
}
|
||||||
|
|
||||||
|
declare class EmbeddingModel {
|
||||||
|
constructor(llm: LLModel, config: ModelConfig);
|
||||||
|
llm: LLModel;
|
||||||
|
config: ModelConfig;
|
||||||
|
|
||||||
|
embed(text: string): Float32Array;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* LLModel class representing a language model.
|
* LLModel class representing a language model.
|
||||||
* This is a base class that provides common functionality for different types of language models.
|
* This is a base class that provides common functionality for different types of language models.
|
||||||
@ -90,17 +118,21 @@ declare class LLModel {
|
|||||||
* @param params Optional parameters for the prompt context.
|
* @param params Optional parameters for the prompt context.
|
||||||
* @returns The result of the model prompt.
|
* @returns The result of the model prompt.
|
||||||
*/
|
*/
|
||||||
raw_prompt(q: string, params: Partial<LLModelPromptContext>, callback: (res: string) => void): void; // TODO work on return type
|
raw_prompt(
|
||||||
|
q: string,
|
||||||
|
params: Partial<LLModelPromptContext>,
|
||||||
|
callback: (res: string) => void
|
||||||
|
): void; // TODO work on return type
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Embed text with the model. Keep in mind that
|
* Embed text with the model. Keep in mind that
|
||||||
* not all models can embed text, (only bert can embed as of 07/16/2023 (mm/dd/yyyy))
|
* not all models can embed text, (only bert can embed as of 07/16/2023 (mm/dd/yyyy))
|
||||||
* Use the prompt function exported for a value
|
* Use the prompt function exported for a value
|
||||||
* @param q The prompt input.
|
* @param q The prompt input.
|
||||||
* @param params Optional parameters for the prompt context.
|
* @param params Optional parameters for the prompt context.
|
||||||
* @returns The result of the model prompt.
|
* @returns The result of the model prompt.
|
||||||
*/
|
*/
|
||||||
embed(text: string) : Float32Array
|
embed(text: string): Float32Array;
|
||||||
/**
|
/**
|
||||||
* Whether the model is loaded or not.
|
* Whether the model is loaded or not.
|
||||||
*/
|
*/
|
||||||
@ -119,60 +151,66 @@ declare class LLModel {
|
|||||||
interface LoadModelOptions {
|
interface LoadModelOptions {
|
||||||
modelPath?: string;
|
modelPath?: string;
|
||||||
librariesPath?: string;
|
librariesPath?: string;
|
||||||
|
modelConfigFile?: string;
|
||||||
allowDownload?: boolean;
|
allowDownload?: boolean;
|
||||||
verbose?: boolean;
|
verbose?: boolean;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
interface InferenceModelOptions extends LoadModelOptions {
|
||||||
|
type?: "inference";
|
||||||
|
}
|
||||||
|
|
||||||
|
interface EmbeddingModelOptions extends LoadModelOptions {
|
||||||
|
type: "embedding";
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Loads a machine learning model with the specified name. The defacto way to create a model.
|
* Loads a machine learning model with the specified name. The defacto way to create a model.
|
||||||
* By default this will download a model from the official GPT4ALL website, if a model is not present at given path.
|
* By default this will download a model from the official GPT4ALL website, if a model is not present at given path.
|
||||||
*
|
*
|
||||||
* @param {string} modelName - The name of the model to load.
|
* @param {string} modelName - The name of the model to load.
|
||||||
* @param {LoadModelOptions|undefined} [options] - (Optional) Additional options for loading the model.
|
* @param {LoadModelOptions|undefined} [options] - (Optional) Additional options for loading the model.
|
||||||
* @returns {Promise<LLModel>} A promise that resolves to an instance of the loaded LLModel.
|
* @returns {Promise<InferenceModel | EmbeddingModel>} A promise that resolves to an instance of the loaded LLModel.
|
||||||
*/
|
*/
|
||||||
declare function loadModel(
|
declare function loadModel(
|
||||||
modelName: string,
|
modelName: string,
|
||||||
options?: LoadModelOptions
|
options?: InferenceModelOptions
|
||||||
): Promise<LLModel>;
|
): Promise<InferenceModel>;
|
||||||
|
|
||||||
|
declare function loadModel(
|
||||||
|
modelName: string,
|
||||||
|
options?: EmbeddingModelOptions
|
||||||
|
): Promise<EmbeddingModel>;
|
||||||
|
|
||||||
|
declare function loadModel(
|
||||||
|
modelName: string,
|
||||||
|
options?: EmbeddingOptions | InferenceOptions
|
||||||
|
): Promise<InferenceModel | EmbeddingModel>;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* The nodejs equivalent to python binding's chat_completion
|
* The nodejs equivalent to python binding's chat_completion
|
||||||
* @param {LLModel} llmodel - The language model object.
|
* @param {InferenceModel} model - The language model object.
|
||||||
* @param {PromptMessage[]} messages - The array of messages for the conversation.
|
* @param {PromptMessage[]} messages - The array of messages for the conversation.
|
||||||
* @param {CompletionOptions} options - The options for creating the completion.
|
* @param {CompletionOptions} options - The options for creating the completion.
|
||||||
* @returns {CompletionReturn} The completion result.
|
* @returns {CompletionReturn} The completion result.
|
||||||
* @example
|
|
||||||
* const llmodel = new LLModel(model)
|
|
||||||
* const messages = [
|
|
||||||
* { role: 'system', message: 'You are a weather forecaster.' },
|
|
||||||
* { role: 'user', message: 'should i go out today?' } ]
|
|
||||||
* const completion = await createCompletion(llmodel, messages, {
|
|
||||||
* verbose: true,
|
|
||||||
* temp: 0.9,
|
|
||||||
* })
|
|
||||||
* console.log(completion.choices[0].message.content)
|
|
||||||
* // No, it's going to be cold and rainy.
|
|
||||||
*/
|
*/
|
||||||
declare function createCompletion(
|
declare function createCompletion(
|
||||||
llmodel: LLModel,
|
model: InferenceModel,
|
||||||
messages: PromptMessage[],
|
messages: PromptMessage[],
|
||||||
options?: CompletionOptions
|
options?: CompletionOptions
|
||||||
): Promise<CompletionReturn>;
|
): Promise<CompletionReturn>;
|
||||||
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* The nodejs moral equivalent to python binding's Embed4All().embed()
|
* The nodejs moral equivalent to python binding's Embed4All().embed()
|
||||||
* meow
|
* meow
|
||||||
* @param {LLModel} llmodel - The language model object.
|
* @param {EmbeddingModel} model - The language model object.
|
||||||
* @param {string} text - text to embed
|
* @param {string} text - text to embed
|
||||||
* @returns {Float32Array} The completion result.
|
* @returns {Float32Array} The completion result.
|
||||||
*/
|
*/
|
||||||
declare function createEmbedding(
|
declare function createEmbedding(
|
||||||
llmodel: LLModel,
|
model: EmbeddingModel,
|
||||||
text: string,
|
text: string
|
||||||
): Float32Array
|
): Float32Array;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* The options for creating the completion.
|
* The options for creating the completion.
|
||||||
@ -185,16 +223,25 @@ interface CompletionOptions extends Partial<LLModelPromptContext> {
|
|||||||
verbose?: boolean;
|
verbose?: boolean;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Indicates if the default header is included in the prompt.
|
* Template for the system message. Will be put before the conversation with %1 being replaced by all system messages.
|
||||||
* @default true
|
* Note that if this is not defined, system messages will not be included in the prompt.
|
||||||
*/
|
*/
|
||||||
hasDefaultHeader?: boolean;
|
systemPromptTemplate?: string;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Indicates if the default footer is included in the prompt.
|
* Template for user messages, with %1 being replaced by the message.
|
||||||
* @default true
|
|
||||||
*/
|
*/
|
||||||
hasDefaultFooter?: boolean;
|
promptTemplate?: boolean;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* The initial instruction for the model, on top of the prompt
|
||||||
|
*/
|
||||||
|
promptHeader?: string;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* The last instruction for the model, appended to the end of the prompt.
|
||||||
|
*/
|
||||||
|
promptFooter?: string;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@ -212,10 +259,8 @@ interface PromptMessage {
|
|||||||
* The result of the completion, similar to OpenAI's format.
|
* The result of the completion, similar to OpenAI's format.
|
||||||
*/
|
*/
|
||||||
interface CompletionReturn {
|
interface CompletionReturn {
|
||||||
/** The model name.
|
/** The model used for the completion. */
|
||||||
* @type {ModelFile}
|
model: string;
|
||||||
*/
|
|
||||||
model: ModelFile[ModelType];
|
|
||||||
|
|
||||||
/** Token usage report. */
|
/** Token usage report. */
|
||||||
usage: {
|
usage: {
|
||||||
@ -246,58 +291,85 @@ interface CompletionChoice {
|
|||||||
*/
|
*/
|
||||||
interface LLModelPromptContext {
|
interface LLModelPromptContext {
|
||||||
/** The size of the raw logits vector. */
|
/** The size of the raw logits vector. */
|
||||||
logits_size: number;
|
logitsSize: number;
|
||||||
|
|
||||||
/** The size of the raw tokens vector. */
|
/** The size of the raw tokens vector. */
|
||||||
tokens_size: number;
|
tokensSize: number;
|
||||||
|
|
||||||
/** The number of tokens in the past conversation. */
|
/** The number of tokens in the past conversation. */
|
||||||
n_past: number;
|
nPast: number;
|
||||||
|
|
||||||
/** The number of tokens possible in the context window.
|
/** The number of tokens possible in the context window.
|
||||||
* @default 1024
|
* @default 1024
|
||||||
*/
|
*/
|
||||||
n_ctx: number;
|
nCtx: number;
|
||||||
|
|
||||||
/** The number of tokens to predict.
|
/** The number of tokens to predict.
|
||||||
* @default 128
|
* @default 128
|
||||||
* */
|
* */
|
||||||
n_predict: number;
|
nPredict: number;
|
||||||
|
|
||||||
/** The top-k logits to sample from.
|
/** The top-k logits to sample from.
|
||||||
|
* Top-K sampling selects the next token only from the top K most likely tokens predicted by the model.
|
||||||
|
* It helps reduce the risk of generating low-probability or nonsensical tokens, but it may also limit
|
||||||
|
* the diversity of the output. A higher value for top-K (eg., 100) will consider more tokens and lead
|
||||||
|
* to more diverse text, while a lower value (eg., 10) will focus on the most probable tokens and generate
|
||||||
|
* more conservative text. 30 - 60 is a good range for most tasks.
|
||||||
* @default 40
|
* @default 40
|
||||||
* */
|
* */
|
||||||
top_k: number;
|
topK: number;
|
||||||
|
|
||||||
/** The nucleus sampling probability threshold.
|
/** The nucleus sampling probability threshold.
|
||||||
* @default 0.9
|
* Top-P limits the selection of the next token to a subset of tokens with a cumulative probability
|
||||||
|
* above a threshold P. This method, also known as nucleus sampling, finds a balance between diversity
|
||||||
|
* and quality by considering both token probabilities and the number of tokens available for sampling.
|
||||||
|
* When using a higher value for top-P (eg., 0.95), the generated text becomes more diverse.
|
||||||
|
* On the other hand, a lower value (eg., 0.1) produces more focused and conservative text.
|
||||||
|
* The default value is 0.4, which is aimed to be the middle ground between focus and diversity, but
|
||||||
|
* for more creative tasks a higher top-p value will be beneficial, about 0.5-0.9 is a good range for that.
|
||||||
|
* @default 0.4
|
||||||
* */
|
* */
|
||||||
top_p: number;
|
topP: number;
|
||||||
|
|
||||||
/** The temperature to adjust the model's output distribution.
|
/** The temperature to adjust the model's output distribution.
|
||||||
* @default 0.72
|
* Temperature is like a knob that adjusts how creative or focused the output becomes. Higher temperatures
|
||||||
|
* (eg., 1.2) increase randomness, resulting in more imaginative and diverse text. Lower temperatures (eg., 0.5)
|
||||||
|
* make the output more focused, predictable, and conservative. When the temperature is set to 0, the output
|
||||||
|
* becomes completely deterministic, always selecting the most probable next token and producing identical results
|
||||||
|
* each time. A safe range would be around 0.6 - 0.85, but you are free to search what value fits best for you.
|
||||||
|
* @default 0.7
|
||||||
* */
|
* */
|
||||||
temp: number;
|
temp: number;
|
||||||
|
|
||||||
/** The number of predictions to generate in parallel.
|
/** The number of predictions to generate in parallel.
|
||||||
|
* By splitting the prompt every N tokens, prompt-batch-size reduces RAM usage during processing. However,
|
||||||
|
* this can increase the processing time as a trade-off. If the N value is set too low (e.g., 10), long prompts
|
||||||
|
* with 500+ tokens will be most affected, requiring numerous processing runs to complete the prompt processing.
|
||||||
|
* To ensure optimal performance, setting the prompt-batch-size to 2048 allows processing of all tokens in a single run.
|
||||||
* @default 8
|
* @default 8
|
||||||
* */
|
* */
|
||||||
n_batch: number;
|
nBatch: number;
|
||||||
|
|
||||||
/** The penalty factor for repeated tokens.
|
/** The penalty factor for repeated tokens.
|
||||||
* @default 1
|
* Repeat-penalty can help penalize tokens based on how frequently they occur in the text, including the input prompt.
|
||||||
|
* A token that has already appeared five times is penalized more heavily than a token that has appeared only one time.
|
||||||
|
* A value of 1 means that there is no penalty and values larger than 1 discourage repeated tokens.
|
||||||
|
* @default 1.18
|
||||||
* */
|
* */
|
||||||
repeat_penalty: number;
|
repeatPenalty: number;
|
||||||
|
|
||||||
/** The number of last tokens to penalize.
|
/** The number of last tokens to penalize.
|
||||||
* @default 10
|
* The repeat-penalty-tokens N option controls the number of tokens in the history to consider for penalizing repetition.
|
||||||
|
* A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only
|
||||||
|
* consider recent tokens.
|
||||||
|
* @default 64
|
||||||
* */
|
* */
|
||||||
repeat_last_n: number;
|
repeatLastN: number;
|
||||||
|
|
||||||
/** The percentage of context to erase if the context window is exceeded.
|
/** The percentage of context to erase if the context window is exceeded.
|
||||||
* @default 0.5
|
* @default 0.5
|
||||||
* */
|
* */
|
||||||
context_erase: number;
|
contextErase: number;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@ -320,24 +392,35 @@ declare const DEFAULT_DIRECTORY: string;
|
|||||||
* This searches DEFAULT_DIRECTORY/libraries, cwd/libraries, and finally cwd.
|
* This searches DEFAULT_DIRECTORY/libraries, cwd/libraries, and finally cwd.
|
||||||
*/
|
*/
|
||||||
declare const DEFAULT_LIBRARIES_DIRECTORY: string;
|
declare const DEFAULT_LIBRARIES_DIRECTORY: string;
|
||||||
interface PromptMessage {
|
|
||||||
role: "system" | "assistant" | "user";
|
|
||||||
content: string;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Initiates the download of a model file of a specific model type.
|
* Default model configuration.
|
||||||
|
*/
|
||||||
|
declare const DEFAULT_MODEL_CONFIG: ModelConfig;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Default prompt context.
|
||||||
|
*/
|
||||||
|
declare const DEFAULT_PROMT_CONTEXT: LLModelPromptContext;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Default model list url.
|
||||||
|
*/
|
||||||
|
declare const DEFAULT_MODEL_LIST_URL: string;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Initiates the download of a model file.
|
||||||
* By default this downloads without waiting. use the controller returned to alter this behavior.
|
* By default this downloads without waiting. use the controller returned to alter this behavior.
|
||||||
* @param {ModelFile} modelName - The model file to be downloaded.
|
* @param {string} modelName - The model to be downloaded.
|
||||||
* @param {DownloadOptions} options - to pass into the downloader. Default is { location: (cwd), debug: false }.
|
* @param {DownloadOptions} options - to pass into the downloader. Default is { location: (cwd), verbose: false }.
|
||||||
* @returns {DownloadController} object that allows controlling the download process.
|
* @returns {DownloadController} object that allows controlling the download process.
|
||||||
*
|
*
|
||||||
* @throws {Error} If the model already exists in the specified location.
|
* @throws {Error} If the model already exists in the specified location.
|
||||||
* @throws {Error} If the model cannot be found at the specified url.
|
* @throws {Error} If the model cannot be found at the specified url.
|
||||||
*
|
*
|
||||||
* @example
|
* @example
|
||||||
* const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
|
* const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin')
|
||||||
* controller.promise().then(() => console.log('Downloaded!'))
|
* download.promise.then(() => console.log('Downloaded!'))
|
||||||
*/
|
*/
|
||||||
declare function downloadModel(
|
declare function downloadModel(
|
||||||
modelName: string,
|
modelName: string,
|
||||||
@ -358,46 +441,55 @@ interface DownloadModelOptions {
|
|||||||
* Debug mode -- check how long it took to download in seconds
|
* Debug mode -- check how long it took to download in seconds
|
||||||
* @default false
|
* @default false
|
||||||
*/
|
*/
|
||||||
debug?: boolean;
|
verbose?: boolean;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Remote download url. Defaults to `https://gpt4all.io/models`
|
* Remote download url. Defaults to `https://gpt4all.io/models/<modelName>`
|
||||||
* @default https://gpt4all.io/models
|
* @default https://gpt4all.io/models/<modelName>
|
||||||
*/
|
*/
|
||||||
url?: string;
|
url?: string;
|
||||||
/**
|
/**
|
||||||
* Whether to verify the hash of the download to ensure a proper download occurred.
|
* MD5 sum of the model file. If this is provided, the downloaded file will be checked against this sum.
|
||||||
* @default true
|
* If the sums do not match, an error will be thrown and the file will be deleted.
|
||||||
*/
|
*/
|
||||||
md5sum?: boolean;
|
md5sum?: string;
|
||||||
}
|
}
|
||||||
|
|
||||||
declare function listModels(): Promise<Record<string, string>[]>;
|
interface ListModelsOptions {
|
||||||
|
url?: string;
|
||||||
|
file?: string;
|
||||||
|
}
|
||||||
|
|
||||||
|
declare function listModels(options?: ListModelsOptions): Promise<ModelConfig[]>;
|
||||||
|
|
||||||
interface RetrieveModelOptions {
|
interface RetrieveModelOptions {
|
||||||
allowDownload?: boolean;
|
allowDownload?: boolean;
|
||||||
verbose?: boolean;
|
verbose?: boolean;
|
||||||
modelPath?: string;
|
modelPath?: string;
|
||||||
|
modelConfigFile?: string;
|
||||||
}
|
}
|
||||||
|
|
||||||
declare function retrieveModel(
|
declare function retrieveModel(
|
||||||
model: string,
|
modelName: string,
|
||||||
options?: RetrieveModelOptions
|
options?: RetrieveModelOptions
|
||||||
): Promise<string>;
|
): Promise<ModelConfig>;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Model download controller.
|
* Model download controller.
|
||||||
*/
|
*/
|
||||||
interface DownloadController {
|
interface DownloadController {
|
||||||
/** Cancel the request to download from gpt4all website if this is called. */
|
/** Cancel the request to download if this is called. */
|
||||||
cancel: () => void;
|
cancel: () => void;
|
||||||
/** Convert the downloader into a promise, allowing people to await and manage its lifetime */
|
/** A promise resolving to the downloaded models config once the download is done */
|
||||||
promise: () => Promise<void>;
|
promise: Promise<ModelConfig>;
|
||||||
}
|
}
|
||||||
|
|
||||||
export {
|
export {
|
||||||
ModelType,
|
ModelType,
|
||||||
ModelFile,
|
ModelFile,
|
||||||
|
ModelConfig,
|
||||||
|
InferenceModel,
|
||||||
|
EmbeddingModel,
|
||||||
LLModel,
|
LLModel,
|
||||||
LLModelPromptContext,
|
LLModelPromptContext,
|
||||||
PromptMessage,
|
PromptMessage,
|
||||||
@ -409,10 +501,13 @@ export {
|
|||||||
createTokenStream,
|
createTokenStream,
|
||||||
DEFAULT_DIRECTORY,
|
DEFAULT_DIRECTORY,
|
||||||
DEFAULT_LIBRARIES_DIRECTORY,
|
DEFAULT_LIBRARIES_DIRECTORY,
|
||||||
|
DEFAULT_MODEL_CONFIG,
|
||||||
|
DEFAULT_PROMT_CONTEXT,
|
||||||
|
DEFAULT_MODEL_LIST_URL,
|
||||||
downloadModel,
|
downloadModel,
|
||||||
retrieveModel,
|
retrieveModel,
|
||||||
listModels,
|
listModels,
|
||||||
DownloadController,
|
DownloadController,
|
||||||
RetrieveModelOptions,
|
RetrieveModelOptions,
|
||||||
DownloadModelOptions
|
DownloadModelOptions,
|
||||||
};
|
};
|
||||||
|
@ -10,19 +10,36 @@ const {
|
|||||||
downloadModel,
|
downloadModel,
|
||||||
appendBinSuffixIfMissing,
|
appendBinSuffixIfMissing,
|
||||||
} = require("./util.js");
|
} = require("./util.js");
|
||||||
const { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } = require("./config.js");
|
const {
|
||||||
|
DEFAULT_DIRECTORY,
|
||||||
|
DEFAULT_LIBRARIES_DIRECTORY,
|
||||||
|
DEFAULT_PROMPT_CONTEXT,
|
||||||
|
DEFAULT_MODEL_CONFIG,
|
||||||
|
DEFAULT_MODEL_LIST_URL,
|
||||||
|
} = require("./config.js");
|
||||||
|
const { InferenceModel, EmbeddingModel } = require("./models.js");
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Loads a machine learning model with the specified name. The defacto way to create a model.
|
||||||
|
* By default this will download a model from the official GPT4ALL website, if a model is not present at given path.
|
||||||
|
*
|
||||||
|
* @param {string} modelName - The name of the model to load.
|
||||||
|
* @param {LoadModelOptions|undefined} [options] - (Optional) Additional options for loading the model.
|
||||||
|
* @returns {Promise<InferenceModel | EmbeddingModel>} A promise that resolves to an instance of the loaded LLModel.
|
||||||
|
*/
|
||||||
async function loadModel(modelName, options = {}) {
|
async function loadModel(modelName, options = {}) {
|
||||||
const loadOptions = {
|
const loadOptions = {
|
||||||
modelPath: DEFAULT_DIRECTORY,
|
modelPath: DEFAULT_DIRECTORY,
|
||||||
librariesPath: DEFAULT_LIBRARIES_DIRECTORY,
|
librariesPath: DEFAULT_LIBRARIES_DIRECTORY,
|
||||||
|
type: "inference",
|
||||||
allowDownload: true,
|
allowDownload: true,
|
||||||
verbose: true,
|
verbose: true,
|
||||||
...options,
|
...options,
|
||||||
};
|
};
|
||||||
|
|
||||||
await retrieveModel(modelName, {
|
const modelConfig = await retrieveModel(modelName, {
|
||||||
modelPath: loadOptions.modelPath,
|
modelPath: loadOptions.modelPath,
|
||||||
|
modelConfigFile: loadOptions.modelConfigFile,
|
||||||
allowDownload: loadOptions.allowDownload,
|
allowDownload: loadOptions.allowDownload,
|
||||||
verbose: loadOptions.verbose,
|
verbose: loadOptions.verbose,
|
||||||
});
|
});
|
||||||
@ -37,7 +54,7 @@ async function loadModel(modelName, options = {}) {
|
|||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if(!libPath) {
|
if (!libPath) {
|
||||||
throw Error("Could not find a valid path from " + libSearchPaths);
|
throw Error("Could not find a valid path from " + libSearchPaths);
|
||||||
}
|
}
|
||||||
const llmOptions = {
|
const llmOptions = {
|
||||||
@ -47,99 +64,183 @@ async function loadModel(modelName, options = {}) {
|
|||||||
};
|
};
|
||||||
|
|
||||||
if (loadOptions.verbose) {
|
if (loadOptions.verbose) {
|
||||||
console.log("Creating LLModel with options:", llmOptions);
|
console.debug("Creating LLModel with options:", llmOptions);
|
||||||
}
|
}
|
||||||
const llmodel = new LLModel(llmOptions);
|
const llmodel = new LLModel(llmOptions);
|
||||||
|
|
||||||
return llmodel;
|
if (loadOptions.type === "embedding") {
|
||||||
|
return new EmbeddingModel(llmodel, modelConfig);
|
||||||
|
} else if (loadOptions.type === "inference") {
|
||||||
|
return new InferenceModel(llmodel, modelConfig);
|
||||||
|
} else {
|
||||||
|
throw Error("Invalid model type: " + loadOptions.type);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
function createPrompt(messages, hasDefaultHeader, hasDefaultFooter) {
|
/**
|
||||||
let fullPrompt = [];
|
* Formats a list of messages into a single prompt string.
|
||||||
|
*/
|
||||||
for (const message of messages) {
|
function formatChatPrompt(
|
||||||
if (message.role === "system") {
|
|
||||||
const systemMessage = message.content;
|
|
||||||
fullPrompt.push(systemMessage);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (hasDefaultHeader) {
|
|
||||||
fullPrompt.push(`### Instruction: The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.`);
|
|
||||||
}
|
|
||||||
let prompt = "### Prompt:";
|
|
||||||
for (const message of messages) {
|
|
||||||
if (message.role === "user") {
|
|
||||||
const user_message = message["content"];
|
|
||||||
prompt += user_message;
|
|
||||||
}
|
|
||||||
if (message["role"] == "assistant") {
|
|
||||||
const assistant_message = "Response:" + message["content"];
|
|
||||||
prompt += assistant_message;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
fullPrompt.push(prompt);
|
|
||||||
if (hasDefaultFooter) {
|
|
||||||
fullPrompt.push("### Response:");
|
|
||||||
}
|
|
||||||
|
|
||||||
return fullPrompt.join('\n');
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
function createEmbedding(llmodel, text) {
|
|
||||||
return llmodel.embed(text)
|
|
||||||
}
|
|
||||||
async function createCompletion(
|
|
||||||
llmodel,
|
|
||||||
messages,
|
messages,
|
||||||
options = {
|
{
|
||||||
hasDefaultHeader: true,
|
systemPromptTemplate,
|
||||||
hasDefaultFooter: false,
|
defaultSystemPrompt,
|
||||||
verbose: true,
|
promptTemplate,
|
||||||
|
promptFooter,
|
||||||
|
promptHeader,
|
||||||
}
|
}
|
||||||
) {
|
) {
|
||||||
//creating the keys to insert into promptMaker.
|
const systemMessages = messages
|
||||||
const fullPrompt = createPrompt(
|
.filter((message) => message.role === "system")
|
||||||
messages,
|
.map((message) => message.content);
|
||||||
options.hasDefaultHeader ?? true,
|
|
||||||
options.hasDefaultFooter ?? true
|
let fullPrompt = "";
|
||||||
);
|
|
||||||
if (options.verbose) {
|
if (promptHeader) {
|
||||||
console.log("Sent: " + fullPrompt);
|
fullPrompt += promptHeader + "\n\n";
|
||||||
}
|
}
|
||||||
const promisifiedRawPrompt = llmodel.raw_prompt(fullPrompt, options, (s) => {});
|
|
||||||
return promisifiedRawPrompt.then((response) => {
|
if (systemPromptTemplate) {
|
||||||
return {
|
// if user specified a template for the system prompt, put all system messages in the template
|
||||||
llmodel: llmodel.name(),
|
let systemPrompt = "";
|
||||||
usage: {
|
|
||||||
prompt_tokens: fullPrompt.length,
|
if (systemMessages.length > 0) {
|
||||||
completion_tokens: response.length, //TODO
|
systemPrompt += systemMessages.join("\n");
|
||||||
total_tokens: fullPrompt.length + response.length, //TODO
|
}
|
||||||
},
|
|
||||||
choices: [
|
if (systemPrompt) {
|
||||||
{
|
fullPrompt +=
|
||||||
message: {
|
systemPromptTemplate.replace("%1", systemPrompt) + "\n";
|
||||||
role: "assistant",
|
}
|
||||||
content: response,
|
} else if (defaultSystemPrompt) {
|
||||||
},
|
// otherwise, use the system prompt from the model config and ignore system messages
|
||||||
},
|
fullPrompt += defaultSystemPrompt + "\n\n";
|
||||||
],
|
}
|
||||||
};
|
|
||||||
|
if (systemMessages.length > 0 && !systemPromptTemplate) {
|
||||||
|
console.warn(
|
||||||
|
"System messages were provided, but no systemPromptTemplate was specified. System messages will be ignored."
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
for (const message of messages) {
|
||||||
|
if (message.role === "user") {
|
||||||
|
const userMessage = promptTemplate.replace(
|
||||||
|
"%1",
|
||||||
|
message["content"]
|
||||||
|
);
|
||||||
|
fullPrompt += userMessage;
|
||||||
|
}
|
||||||
|
if (message["role"] == "assistant") {
|
||||||
|
const assistantMessage = message["content"] + "\n";
|
||||||
|
fullPrompt += assistantMessage;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (promptFooter) {
|
||||||
|
fullPrompt += "\n\n" + promptFooter;
|
||||||
|
}
|
||||||
|
|
||||||
|
return fullPrompt;
|
||||||
|
}
|
||||||
|
|
||||||
|
function createEmbedding(model, text) {
|
||||||
|
return model.embed(text);
|
||||||
|
}
|
||||||
|
|
||||||
|
const defaultCompletionOptions = {
|
||||||
|
verbose: false,
|
||||||
|
...DEFAULT_PROMPT_CONTEXT,
|
||||||
|
};
|
||||||
|
|
||||||
|
async function createCompletion(
|
||||||
|
model,
|
||||||
|
messages,
|
||||||
|
options = defaultCompletionOptions
|
||||||
|
) {
|
||||||
|
if (options.hasDefaultHeader !== undefined) {
|
||||||
|
console.warn(
|
||||||
|
"hasDefaultHeader (bool) is deprecated and has no effect, use promptHeader (string) instead"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (options.hasDefaultFooter !== undefined) {
|
||||||
|
console.warn(
|
||||||
|
"hasDefaultFooter (bool) is deprecated and has no effect, use promptFooter (string) instead"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
const optionsWithDefaults = {
|
||||||
|
...defaultCompletionOptions,
|
||||||
|
...options,
|
||||||
|
};
|
||||||
|
|
||||||
|
const {
|
||||||
|
verbose,
|
||||||
|
systemPromptTemplate,
|
||||||
|
promptTemplate,
|
||||||
|
promptHeader,
|
||||||
|
promptFooter,
|
||||||
|
...promptContext
|
||||||
|
} = optionsWithDefaults;
|
||||||
|
|
||||||
|
const prompt = formatChatPrompt(messages, {
|
||||||
|
systemPromptTemplate,
|
||||||
|
defaultSystemPrompt: model.config.systemPrompt,
|
||||||
|
promptTemplate: promptTemplate || model.config.promptTemplate || "%1",
|
||||||
|
promptHeader: promptHeader || "",
|
||||||
|
promptFooter: promptFooter || "",
|
||||||
|
// These were the default header/footer prompts used for non-chat single turn completions.
|
||||||
|
// both seem to be working well still with some models, so keeping them here for reference.
|
||||||
|
// promptHeader: '### Instruction: The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.',
|
||||||
|
// promptFooter: '### Response:',
|
||||||
});
|
});
|
||||||
|
|
||||||
|
if (verbose) {
|
||||||
|
console.debug("Sending Prompt:\n" + prompt);
|
||||||
|
}
|
||||||
|
|
||||||
|
const response = await model.generate(prompt, promptContext);
|
||||||
|
|
||||||
|
if (verbose) {
|
||||||
|
console.debug("Received Response:\n" + response);
|
||||||
|
}
|
||||||
|
|
||||||
|
return {
|
||||||
|
llmodel: model.llm.name(),
|
||||||
|
usage: {
|
||||||
|
prompt_tokens: prompt.length,
|
||||||
|
completion_tokens: response.length, //TODO
|
||||||
|
total_tokens: prompt.length + response.length, //TODO
|
||||||
|
},
|
||||||
|
choices: [
|
||||||
|
{
|
||||||
|
message: {
|
||||||
|
role: "assistant",
|
||||||
|
content: response,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
],
|
||||||
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
function createTokenStream() {
|
function createTokenStream() {
|
||||||
throw Error("This API has not been completed yet!")
|
throw Error("This API has not been completed yet!");
|
||||||
}
|
}
|
||||||
|
|
||||||
module.exports = {
|
module.exports = {
|
||||||
DEFAULT_LIBRARIES_DIRECTORY,
|
DEFAULT_LIBRARIES_DIRECTORY,
|
||||||
DEFAULT_DIRECTORY,
|
DEFAULT_DIRECTORY,
|
||||||
|
DEFAULT_PROMPT_CONTEXT,
|
||||||
|
DEFAULT_MODEL_CONFIG,
|
||||||
|
DEFAULT_MODEL_LIST_URL,
|
||||||
LLModel,
|
LLModel,
|
||||||
|
InferenceModel,
|
||||||
|
EmbeddingModel,
|
||||||
createCompletion,
|
createCompletion,
|
||||||
createEmbedding,
|
createEmbedding,
|
||||||
downloadModel,
|
downloadModel,
|
||||||
retrieveModel,
|
retrieveModel,
|
||||||
loadModel,
|
loadModel,
|
||||||
createTokenStream
|
createTokenStream,
|
||||||
};
|
};
|
||||||
|
38
gpt4all-bindings/typescript/src/models.js
Normal file
38
gpt4all-bindings/typescript/src/models.js
Normal file
@ -0,0 +1,38 @@
|
|||||||
|
const { normalizePromptContext, warnOnSnakeCaseKeys } = require('./util');
|
||||||
|
|
||||||
|
class InferenceModel {
|
||||||
|
llm;
|
||||||
|
config;
|
||||||
|
|
||||||
|
constructor(llmodel, config) {
|
||||||
|
this.llm = llmodel;
|
||||||
|
this.config = config;
|
||||||
|
}
|
||||||
|
|
||||||
|
async generate(prompt, promptContext) {
|
||||||
|
warnOnSnakeCaseKeys(promptContext);
|
||||||
|
const normalizedPromptContext = normalizePromptContext(promptContext);
|
||||||
|
const result = this.llm.raw_prompt(prompt, normalizedPromptContext, () => {});
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
class EmbeddingModel {
|
||||||
|
llm;
|
||||||
|
config;
|
||||||
|
|
||||||
|
constructor(llmodel, config) {
|
||||||
|
this.llm = llmodel;
|
||||||
|
this.config = config;
|
||||||
|
}
|
||||||
|
|
||||||
|
embed(text) {
|
||||||
|
return this.llm.embed(text)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
module.exports = {
|
||||||
|
InferenceModel,
|
||||||
|
EmbeddingModel,
|
||||||
|
};
|
@ -1,14 +1,45 @@
|
|||||||
const { createWriteStream, existsSync, statSync } = require("node:fs");
|
const { createWriteStream, existsSync, statSync } = require("node:fs");
|
||||||
const fsp = require('node:fs/promises')
|
const fsp = require("node:fs/promises");
|
||||||
const { performance } = require("node:perf_hooks");
|
const { performance } = require("node:perf_hooks");
|
||||||
const path = require("node:path");
|
const path = require("node:path");
|
||||||
const {mkdirp} = require("mkdirp");
|
const { mkdirp } = require("mkdirp");
|
||||||
const { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } = require("./config.js");
|
const md5File = require("md5-file");
|
||||||
const md5File = require('md5-file');
|
const {
|
||||||
async function listModels() {
|
DEFAULT_DIRECTORY,
|
||||||
const res = await fetch("https://gpt4all.io/models/models.json");
|
DEFAULT_MODEL_CONFIG,
|
||||||
const modelList = await res.json();
|
DEFAULT_MODEL_LIST_URL,
|
||||||
return modelList;
|
} = require("./config.js");
|
||||||
|
|
||||||
|
async function listModels(
|
||||||
|
options = {
|
||||||
|
url: DEFAULT_MODEL_LIST_URL,
|
||||||
|
}
|
||||||
|
) {
|
||||||
|
if (!options || (!options.url && !options.file)) {
|
||||||
|
throw new Error(
|
||||||
|
`No model list source specified. Please specify either a url or a file.`
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (options.file) {
|
||||||
|
if (!existsSync(options.file)) {
|
||||||
|
throw new Error(`Model list file ${options.file} does not exist.`);
|
||||||
|
}
|
||||||
|
|
||||||
|
const fileContents = await fsp.readFile(options.file, "utf-8");
|
||||||
|
const modelList = JSON.parse(fileContents);
|
||||||
|
return modelList;
|
||||||
|
} else if (options.url) {
|
||||||
|
const res = await fetch(options.url);
|
||||||
|
|
||||||
|
if (!res.ok) {
|
||||||
|
throw Error(
|
||||||
|
`Failed to retrieve model list from ${url} - ${res.status} ${res.statusText}`
|
||||||
|
);
|
||||||
|
}
|
||||||
|
const modelList = await res.json();
|
||||||
|
return modelList;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
function appendBinSuffixIfMissing(name) {
|
function appendBinSuffixIfMissing(name) {
|
||||||
@ -32,11 +63,46 @@ function readChunks(reader) {
|
|||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Prints a warning if any keys in the prompt context are snake_case.
|
||||||
|
*/
|
||||||
|
function warnOnSnakeCaseKeys(promptContext) {
|
||||||
|
const snakeCaseKeys = Object.keys(promptContext).filter((key) =>
|
||||||
|
key.includes("_")
|
||||||
|
);
|
||||||
|
|
||||||
|
if (snakeCaseKeys.length > 0) {
|
||||||
|
console.warn(
|
||||||
|
"Prompt context keys should be camelCase. Support for snake_case might be removed in the future. Found keys: " +
|
||||||
|
snakeCaseKeys.join(", ")
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Converts all keys in the prompt context to snake_case
|
||||||
|
* For duplicate definitions, the value of the last occurrence will be used.
|
||||||
|
*/
|
||||||
|
function normalizePromptContext(promptContext) {
|
||||||
|
const normalizedPromptContext = {};
|
||||||
|
|
||||||
|
for (const key in promptContext) {
|
||||||
|
if (promptContext.hasOwnProperty(key)) {
|
||||||
|
const snakeKey = key.replace(
|
||||||
|
/[A-Z]/g,
|
||||||
|
(match) => `_${match.toLowerCase()}`
|
||||||
|
);
|
||||||
|
normalizedPromptContext[snakeKey] = promptContext[key];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return normalizedPromptContext;
|
||||||
|
}
|
||||||
|
|
||||||
function downloadModel(modelName, options = {}) {
|
function downloadModel(modelName, options = {}) {
|
||||||
const downloadOptions = {
|
const downloadOptions = {
|
||||||
modelPath: DEFAULT_DIRECTORY,
|
modelPath: DEFAULT_DIRECTORY,
|
||||||
debug: false,
|
verbose: false,
|
||||||
md5sum: true,
|
|
||||||
...options,
|
...options,
|
||||||
};
|
};
|
||||||
|
|
||||||
@ -46,11 +112,16 @@ function downloadModel(modelName, options = {}) {
|
|||||||
modelName + ".part"
|
modelName + ".part"
|
||||||
);
|
);
|
||||||
const finalModelPath = path.join(downloadOptions.modelPath, modelFileName);
|
const finalModelPath = path.join(downloadOptions.modelPath, modelFileName);
|
||||||
const modelUrl = downloadOptions.url ?? `https://gpt4all.io/models/${modelFileName}`;
|
const modelUrl =
|
||||||
|
downloadOptions.url ?? `https://gpt4all.io/models/${modelFileName}`;
|
||||||
|
|
||||||
if (existsSync(finalModelPath)) {
|
if (existsSync(finalModelPath)) {
|
||||||
throw Error(`Model already exists at ${finalModelPath}`);
|
throw Error(`Model already exists at ${finalModelPath}`);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (downloadOptions.verbose) {
|
||||||
|
console.log(`Downloading ${modelName} from ${modelUrl}`);
|
||||||
|
}
|
||||||
|
|
||||||
const headers = {
|
const headers = {
|
||||||
"Accept-Ranges": "arraybuffer",
|
"Accept-Ranges": "arraybuffer",
|
||||||
@ -69,85 +140,81 @@ function downloadModel(modelName, options = {}) {
|
|||||||
const abortController = new AbortController();
|
const abortController = new AbortController();
|
||||||
const signal = abortController.signal;
|
const signal = abortController.signal;
|
||||||
|
|
||||||
// wrapper function to get the readable stream from request
|
const finalizeDownload = async () => {
|
||||||
const fetchModel = (fetchOpts = {}) =>
|
if (options.md5sum) {
|
||||||
fetch(modelUrl, {
|
const fileHash = await md5File(partialModelPath);
|
||||||
signal,
|
if (fileHash !== options.md5sum) {
|
||||||
...fetchOpts,
|
await fsp.unlink(partialModelPath);
|
||||||
}).then((res) => {
|
const message = `Model "${modelName}" failed verification: Hashes mismatch. Expected ${options.md5sum}, got ${fileHash}`;
|
||||||
if (!res.ok) {
|
throw Error(message);
|
||||||
throw Error(
|
|
||||||
`Failed to download model from ${modelUrl} - ${res.statusText}`
|
|
||||||
);
|
|
||||||
}
|
}
|
||||||
return res.body.getReader();
|
if (options.verbose) {
|
||||||
|
console.log(`MD5 hash verified: ${fileHash}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
await fsp.rename(partialModelPath, finalModelPath);
|
||||||
|
};
|
||||||
|
|
||||||
|
// a promise that executes and writes to a stream. Resolves to the path the model was downloaded to when done writing.
|
||||||
|
const downloadPromise = new Promise((resolve, reject) => {
|
||||||
|
let timestampStart;
|
||||||
|
|
||||||
|
if (options.verbose) {
|
||||||
|
console.log(`Downloading @ ${partialModelPath} ...`);
|
||||||
|
timestampStart = performance.now();
|
||||||
|
}
|
||||||
|
|
||||||
|
const writeStream = createWriteStream(
|
||||||
|
partialModelPath,
|
||||||
|
writeStreamOpts
|
||||||
|
);
|
||||||
|
|
||||||
|
writeStream.on("error", (e) => {
|
||||||
|
writeStream.close();
|
||||||
|
reject(e);
|
||||||
});
|
});
|
||||||
|
|
||||||
// a promise that executes and writes to a stream. Resolves when done writing.
|
writeStream.on("finish", () => {
|
||||||
const res = new Promise((resolve, reject) => {
|
if (options.verbose) {
|
||||||
fetchModel({ headers })
|
const elapsed = performance.now() - timestampStart;
|
||||||
// Resolves an array of a reader and writestream.
|
console.log(`Finished. Download took ${elapsed.toFixed(2)} ms`);
|
||||||
.then((reader) => [
|
}
|
||||||
reader,
|
|
||||||
createWriteStream(partialModelPath, writeStreamOpts),
|
|
||||||
])
|
|
||||||
.then(async ([readable, wstream]) => {
|
|
||||||
console.log("Downloading @ ", partialModelPath);
|
|
||||||
let perf;
|
|
||||||
|
|
||||||
if (options.debug) {
|
finalizeDownload()
|
||||||
perf = performance.now();
|
.then(() => {
|
||||||
|
resolve(finalModelPath);
|
||||||
|
})
|
||||||
|
.catch(reject);
|
||||||
|
});
|
||||||
|
|
||||||
|
fetch(modelUrl, {
|
||||||
|
signal,
|
||||||
|
headers,
|
||||||
|
})
|
||||||
|
.then((res) => {
|
||||||
|
if (!res.ok) {
|
||||||
|
const message = `Failed to download model from ${modelUrl} - ${res.status} ${res.statusText}`;
|
||||||
|
reject(Error(message));
|
||||||
}
|
}
|
||||||
|
return res.body.getReader();
|
||||||
wstream.on("finish", () => {
|
})
|
||||||
if (options.debug) {
|
.then(async (reader) => {
|
||||||
console.log(
|
for await (const chunk of readChunks(reader)) {
|
||||||
"Time taken: ",
|
writeStream.write(chunk);
|
||||||
(performance.now() - perf).toFixed(2),
|
|
||||||
" ms"
|
|
||||||
);
|
|
||||||
}
|
|
||||||
wstream.close();
|
|
||||||
});
|
|
||||||
|
|
||||||
wstream.on("error", (e) => {
|
|
||||||
wstream.close();
|
|
||||||
reject(e);
|
|
||||||
});
|
|
||||||
|
|
||||||
for await (const chunk of readChunks(readable)) {
|
|
||||||
wstream.write(chunk);
|
|
||||||
}
|
}
|
||||||
|
writeStream.end();
|
||||||
if (options.md5sum) {
|
|
||||||
const fileHash = await md5File(partialModelPath);
|
|
||||||
if (fileHash !== options.md5sum) {
|
|
||||||
await fsp.unlink(partialModelPath);
|
|
||||||
return reject(
|
|
||||||
Error(`Model "${modelName}" failed verification: Hashes mismatch`)
|
|
||||||
);
|
|
||||||
}
|
|
||||||
if (options.debug) {
|
|
||||||
console.log("MD5 hash verified: ", fileHash);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
await fsp.rename(partialModelPath, finalModelPath);
|
|
||||||
resolve(finalModelPath);
|
|
||||||
})
|
})
|
||||||
.catch(reject);
|
.catch(reject);
|
||||||
});
|
});
|
||||||
|
|
||||||
return {
|
return {
|
||||||
cancel: () => abortController.abort(),
|
cancel: () => abortController.abort(),
|
||||||
promise: () => res,
|
promise: downloadPromise,
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
```diff
-async function retrieveModel (
-    modelName,
-    options = {}
-) {
+async function retrieveModel(modelName, options = {}) {
     const retrieveOptions = {
         modelPath: DEFAULT_DIRECTORY,
         allowDownload: true,
```

@ -161,46 +228,68 @@ async function retrieveModel (

```diff
     const fullModelPath = path.join(retrieveOptions.modelPath, modelFileName);
     const modelExists = existsSync(fullModelPath);

-    if (modelExists) {
-        return fullModelPath;
-    }
-
-    if (!retrieveOptions.allowDownload) {
-        throw Error(`Model does not exist at ${fullModelPath}`);
-    }
-
-    const availableModels = await listModels();
-    const foundModel = availableModels.find((model) => model.filename === modelFileName);
-
-    if (!foundModel) {
-        throw Error(`Model "${modelName}" is not available.`);
-    }
-    //todo
-    if (retrieveOptions.verbose) {
-        console.log(`Downloading ${modelName}...`);
-    }
-
-    const downloadController = downloadModel(modelName, {
-        modelPath: retrieveOptions.modelPath,
-        debug: retrieveOptions.verbose,
-        url: foundModel.url
-    });
-
-    const downloadPath = await downloadController.promise();
-
-    if (retrieveOptions.verbose) {
-        console.log(`Model downloaded to ${downloadPath}`);
-    }
-
-    return downloadPath
+    let config = { ...DEFAULT_MODEL_CONFIG };
+
+    const availableModels = await listModels({
+        file: retrieveOptions.modelConfigFile,
+        url:
+            retrieveOptions.allowDownload &&
+            "https://gpt4all.io/models/models.json",
+    });
+
+    const loadedModelConfig = availableModels.find(
+        (model) => model.filename === modelFileName
+    );
+
+    if (loadedModelConfig) {
+        config = {
+            ...config,
+            ...loadedModelConfig,
+        };
+    } else {
+        // if there's no local modelConfigFile specified, and allowDownload is false,
+        // the default model config will be used.
+        // warning the user here because the model may not work as expected.
+        console.warn(
+            `Failed to load model config for ${modelName}. Using defaults.`
+        );
+    }
+
+    config.systemPrompt = config.systemPrompt.trim();
+
+    if (modelExists) {
+        config.path = fullModelPath;
+
+        if (retrieveOptions.verbose) {
+            console.log(`Found ${modelName} at ${fullModelPath}`);
+        }
+    } else if (retrieveOptions.allowDownload) {
+        const downloadController = downloadModel(modelName, {
+            modelPath: retrieveOptions.modelPath,
+            verbose: retrieveOptions.verbose,
+            filesize: config.filesize,
+            url: config.url,
+            md5sum: config.md5sum,
+        });
+
+        const downloadPath = await downloadController.promise;
+        config.path = downloadPath;
+
+        if (retrieveOptions.verbose) {
+            console.log(`Model downloaded to ${downloadPath}`);
+        }
+    } else {
+        throw Error("Failed to retrieve model.");
+    }
+
+    return config;
 }
```
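A short sketch of calling the reworked `retrieveModel`, based on the options and return value visible in this hunk: the result is the merged model config with a `path` field, and `modelConfigFile` points at a local models.json. Paths and the model name below are placeholders.

```js
// Sketch only: paths and model name are illustrative.
const { retrieveModel } = require("../src/util.js");

(async () => {
    const config = await retrieveModel("ggml-vicuna-7b-1.1-q4_2", {
        modelConfigFile: "./models.json", // use a local list instead of the remote models.json
        allowDownload: true,
        verbose: true,
    });

    console.log(config.path);         // where the .bin file ended up
    console.log(config.systemPrompt); // trimmed system prompt from the model list
})();
```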
```diff
 module.exports = {
     appendBinSuffixIfMissing,
     downloadModel,
     retrieveModel,
-    listModels
+    listModels,
+    normalizePromptContext,
+    warnOnSnakeCaseKeys,
 };
```
@ -1,79 +1,228 @@

```diff
-const path = require('node:path');
-const os = require('node:os');
-const { LLModel } = require('node-gyp-build')(path.resolve(__dirname, '..'));
+const path = require("node:path");
+const os = require("node:os");
+const fsp = require("node:fs/promises");
+const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, ".."));
 const {
     listModels,
     downloadModel,
     appendBinSuffixIfMissing,
-} = require('../src/util.js');
+    normalizePromptContext,
+} = require("../src/util.js");
 const {
     DEFAULT_DIRECTORY,
     DEFAULT_LIBRARIES_DIRECTORY,
-} = require('../src/config.js');
+    DEFAULT_MODEL_LIST_URL,
+} = require("../src/config.js");
 const {
     loadModel,
     createPrompt,
     createCompletion,
-} = require('../src/gpt4all.js');
-
-global.fetch = jest.fn(() =>
-    Promise.resolve({
-        json: () => Promise.resolve([{}, {}, {}]),
-    })
-);
-
-jest.mock('../src/util.js', () => {
-    const actualModule = jest.requireActual('../src/util.js');
-    return {
-        ...actualModule,
-        downloadModel: jest.fn(() =>
-            ({ cancel: jest.fn(), promise: jest.fn() })
-        )
-    }
-})
-
-beforeEach(() => {
-    downloadModel.mockClear()
-});
-
-afterEach( () => {
-    fetch.mockClear();
-    jest.clearAllMocks()
-})
-
-describe('utils', () => {
-    test("appendBinSuffixIfMissing", () => {
-        expect(appendBinSuffixIfMissing("filename")).toBe("filename.bin")
-        expect(appendBinSuffixIfMissing("filename.bin")).toBe("filename.bin")
-    })
-    test("default paths", () => {
-        expect(DEFAULT_DIRECTORY).toBe(path.resolve(os.homedir(), ".cache/gpt4all"))
+} = require("../src/gpt4all.js");
+const { mock } = require("node:test");
+
+describe("config", () => {
+    test("default paths constants are available and correct", () => {
+        expect(DEFAULT_DIRECTORY).toBe(
+            path.resolve(os.homedir(), ".cache/gpt4all")
+        );
         const paths = [
             path.join(DEFAULT_DIRECTORY, "libraries"),
             path.resolve("./libraries"),
             path.resolve(
                 __dirname,
                 "..",
                 `runtimes/${process.platform}-${process.arch}/native`
             ),
             process.cwd(),
         ];
-        expect(typeof DEFAULT_LIBRARIES_DIRECTORY).toBe('string')
-        expect(DEFAULT_LIBRARIES_DIRECTORY).toBe(paths.join(';'))
-    })
-
-    test("listModels", async () => {
-        try {
-            await listModels();
-        } catch(e) {}
-
-        expect(fetch).toHaveBeenCalledTimes(1)
-        expect(fetch).toHaveBeenCalledWith(
-            "https://gpt4all.io/models/models.json"
-        );
-    })
-})
+        expect(typeof DEFAULT_LIBRARIES_DIRECTORY).toBe("string");
+        expect(DEFAULT_LIBRARIES_DIRECTORY).toBe(paths.join(";"));
+    });
+});
+
+describe("listModels", () => {
+    const fakeModels = require("./models.json");
+    const fakeModel = fakeModels[0];
+    const mockResponse = JSON.stringify([fakeModel]);
+
+    let mockFetch, originalFetch;
+
+    beforeAll(() => {
+        // Mock the fetch function for all tests
+        mockFetch = jest.fn().mockResolvedValue({
+            ok: true,
+            json: () => JSON.parse(mockResponse),
+        });
+        originalFetch = global.fetch;
+        global.fetch = mockFetch;
+    });
+
+    afterEach(() => {
+        // Reset the fetch counter after each test
+        mockFetch.mockClear();
+    });
+
+    afterAll(() => {
+        // Restore fetch
+        global.fetch = originalFetch;
+    });
+
+    it("should load the model list from remote when called without args", async () => {
+        const models = await listModels();
+        expect(fetch).toHaveBeenCalledTimes(1);
+        expect(fetch).toHaveBeenCalledWith(DEFAULT_MODEL_LIST_URL);
+        expect(models[0]).toEqual(fakeModel);
+    });
+
+    it("should load the model list from a local file, if specified", async () => {
+        const models = await listModels({
+            file: path.resolve(__dirname, "models.json"),
+        });
+        expect(fetch).toHaveBeenCalledTimes(0);
+        expect(models[0]).toEqual(fakeModel);
+    });
+
+    it("should throw an error if neither url nor file is specified", async () => {
+        await expect(listModels(null)).rejects.toThrow(
+            "No model list source specified. Please specify either a url or a file."
+        );
+    });
+});
+
+describe("appendBinSuffixIfMissing", () => {
+    it("should make sure the suffix is there", () => {
+        expect(appendBinSuffixIfMissing("filename")).toBe("filename.bin");
+        expect(appendBinSuffixIfMissing("filename.bin")).toBe("filename.bin");
+    });
+});
+
+describe("downloadModel", () => {
+    let mockAbortController, mockFetch;
+    const fakeModelName = "fake-model";
+
+    const createMockFetch = () => {
+        const mockData = new Uint8Array([1, 2, 3, 4]);
+        const mockResponse = new ReadableStream({
+            start(controller) {
+                controller.enqueue(mockData);
+                controller.close();
+            },
+        });
+        const mockFetchImplementation = jest.fn(() =>
+            Promise.resolve({
+                ok: true,
+                body: mockResponse,
+            })
+        );
+        return mockFetchImplementation;
+    };
+
+    beforeEach(() => {
+        // Mocking the AbortController constructor
+        mockAbortController = jest.fn();
+        global.AbortController = mockAbortController;
+        mockAbortController.mockReturnValue({
+            signal: "signal",
+            abort: jest.fn(),
+        });
+        mockFetch = createMockFetch();
+        jest.spyOn(global, "fetch").mockImplementation(mockFetch);
+    });
+
+    afterEach(() => {
+        // Clean up mocks
+        mockAbortController.mockReset();
+        mockFetch.mockClear();
+        global.fetch.mockRestore();
+    });
+
+    test("should successfully download a model file", async () => {
+        const downloadController = downloadModel(fakeModelName);
+        const modelFilePath = await downloadController.promise;
+        expect(modelFilePath).toBe(`${DEFAULT_DIRECTORY}/${fakeModelName}.bin`);
+
+        expect(global.fetch).toHaveBeenCalledTimes(1);
+        expect(global.fetch).toHaveBeenCalledWith(
+            "https://gpt4all.io/models/fake-model.bin",
+            {
+                signal: "signal",
+                headers: {
+                    "Accept-Ranges": "arraybuffer",
+                    "Response-Type": "arraybuffer",
+                },
+            }
+        );
+
+        // final model file should be present
+        expect(fsp.access(modelFilePath)).resolves.not.toThrow();
+        // remove the testing model file
+        await fsp.unlink(modelFilePath);
+    });
+
+    test("should error and cleanup if md5sum is not matching", async () => {
+        const downloadController = downloadModel(fakeModelName, {
+            md5sum: "wrong-md5sum",
+        });
+        // the promise should reject with a mismatch
+        await expect(downloadController.promise).rejects.toThrow(
+            `Model "${fakeModelName}" failed verification: Hashes mismatch.`
+        );
+        // fetch should have been called
+        expect(global.fetch).toHaveBeenCalledTimes(1);
+        // the file should be missing
+        expect(
+            fsp.access(`${DEFAULT_DIRECTORY}/${fakeModelName}.bin`)
+        ).rejects.toThrow();
+        // partial file should also be missing
+        expect(
+            fsp.access(`${DEFAULT_DIRECTORY}/${fakeModelName}.part`)
+        ).rejects.toThrow();
+    });
+
+    // TODO
+    // test("should be able to cancel and resume a download", async () => {
+    // });
+});
+
+describe("normalizePromptContext", () => {
+    it("should convert a dict with camelCased keys to snake_case", () => {
+        const camelCased = {
+            topK: 20,
+            repeatLastN: 10,
+        };
+
+        const expectedSnakeCased = {
+            top_k: 20,
+            repeat_last_n: 10,
+        };
+
+        const result = normalizePromptContext(camelCased);
+        expect(result).toEqual(expectedSnakeCased);
+    });
+
+    it("should convert a mixed case dict to snake_case, last value taking precedence", () => {
+        const mixedCased = {
+            topK: 20,
+            top_k: 10,
+            repeatLastN: 10,
+        };
+
+        const expectedSnakeCased = {
+            top_k: 10,
+            repeat_last_n: 10,
+        };
+
+        const result = normalizePromptContext(mixedCased);
+        expect(result).toEqual(expectedSnakeCased);
+    });
+
+    it("should not modify already snake cased dict", () => {
+        const snakeCased = {
+            top_k: 10,
+            repeast_last_n: 10,
+        };
+
+        const result = normalizePromptContext(snakeCased);
+        expect(result).toEqual(snakeCased);
+    });
+});
```
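Going by the expectations in the `normalizePromptContext` tests above, camelCased prompt-context keys are rewritten to the snake_cased names the native layer expects, with a later key winning on conflict. A rough sketch of that behaviour (not the implementation from `src/util.js`):

```js
// Illustrative sketch of the conversion the tests describe.
function normalizePromptContextSketch(promptContext) {
    const normalized = {};
    for (const [key, value] of Object.entries(promptContext)) {
        // topK -> top_k, repeatLastN -> repeat_last_n; snake_case keys pass through
        const snakeKey = key.replace(/[A-Z]/g, (c) => "_" + c.toLowerCase());
        normalized[snakeKey] = value;
    }
    return normalized;
}

normalizePromptContextSketch({ topK: 20, top_k: 10, repeatLastN: 10 });
// => { top_k: 10, repeat_last_n: 10 }
```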
gpt4all-bindings/typescript/test/models.json (new file, 10 lines)
@ -0,0 +1,10 @@

```json
[
    {
        "order": "a",
        "md5sum": "08d6c05a21512a79a1dfeb9d2a8f262f",
        "name": "Not a real model",
        "filename": "fake-model.bin",
        "filesize": "4",
        "systemPrompt": " "
    }
]
```
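The fixture above is what the `listModels` tests read from disk; the same `file` option can be used outside the test suite to avoid the network entirely (the path below is illustrative):

```js
// Sketch only: the file path is illustrative.
const { listModels } = require("../src/util.js");

listModels({ file: "./test/models.json" }).then((models) => {
    console.log(models[0].filename); // "fake-model.bin"
});
```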
File diff suppressed because it is too large