feat(typescript)/dynamic template (#1287) (#1326)

* feat(typescript)/dynamic template (#1287)

* remove packaged yarn

* prompt templates update wip

* prompt template update

* system prompt template, update types, remove embed promises, cleanup

* support both snakecased and camelcased prompt context

* fix #1277 libbert, libfalcon and libreplit libs not being moved into the right folder after build

* added support for modelConfigFile param, allowing the user to specify a local file instead of downloading the remote models.json. added a warning message if code fails to load a model config. included prompt context docs by amogus.

* snakecase warning, put logic for loading local models.json into listModels, added constant for the default remote model list url, test improvements, simpler hasOwnProperty call

* add DEFAULT_PROMPT_CONTEXT, export new constants

* add md5sum testcase and fix constants export

* update types

* throw if attempting to list models without a source

* rebuild docs

* fix download logging undefined url, toFixed typo, pass config filesize in for future progress report

* added overload with union types

* bump to 2.2.0, remove alpha

* code speling

---------

Co-authored-by: Andreas Obersteiner <8959303+iimez@users.noreply.github.com>
Author: Jacob Nguyen
Date: 2023-08-14 11:45:45 -05:00
Committed by: GitHub
Parent: 4d855afe97
Commit: 4e55940edf
15 changed files with 5876 additions and 6938 deletions


@@ -1,7 +1,7 @@
# GPT4All Node.js API
```sh
yarn install gpt4all@alpha
yarn add gpt4all@alpha
npm install gpt4all@alpha
@@ -10,34 +10,41 @@ pnpm install gpt4all@alpha
The original [GPT4All typescript bindings](https://github.com/nomic-ai/gpt4all-ts) are now out of date.
* New bindings created by [jacoobes](https://github.com/jacoobes) and the [nomic ai community](https://home.nomic.ai) :D, for all to use.
* [Documentation](#Documentation)
* New bindings created by [jacoobes](https://github.com/jacoobes), [limez](https://github.com/iimez) and the [nomic ai community](https://home.nomic.ai), for all to use.
* The nodejs api has made strides to mirror the python api. It is not 100% mirrored, but many pieces of the api resemble its python counterpart.
* Everything should work out of the box.
* See [API Reference](#api-reference)
### Code (alpha)
### Chat Completion (alpha)
```js
import { createCompletion, loadModel } from '../src/gpt4all.js'
const ll = await loadModel('ggml-vicuna-7b-1.1-q4_2.bin', { verbose: true });
const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });
const response = await createCompletion(ll, [
const response = await createCompletion(model, [
{ role : 'system', content: 'You are meant to be annoying and unhelpful.' },
{ role : 'user', content: 'What is 1 + 1?' }
]);
```
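The returned object mirrors OpenAI's completion format (see CompletionReturn in the API reference below), so the reply text can be read like this, a minimal sketch:

```js
// assuming the CompletionReturn shape documented in the API reference
console.log(response.choices[0].message.content);
```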
### API
### Embedding (alpha)
* The nodejs api has made strides to mirror the python api. It is not 100% mirrored, but many pieces of the api resemble its python counterpart.
* Everything should work out the box.
* [docs](./docs/api.md)
```js
import { createEmbedding, loadModel } from '../src/gpt4all.js'
const model = await loadModel('ggml-all-MiniLM-L6-v2-f16', { verbose: true });
const fltArray = createEmbedding(model, "Pain is inevitable, suffering optional");
```
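createEmbedding returns a Float32Array (see the API reference below); as a quick sanity check:

```js
// the array length is the embedding dimensionality (e.g. 384 for MiniLM-L6 models)
console.log(fltArray.length);
```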
### Build Instructions
* As of 05/21/2023, Tested on windows (MSVC). (somehow got it to work on MSVC 🤯)
* binding.gyp is compile config
* Tested on Ubuntu. Everything seems to work fine.
* Tested on Windows. Everything works fine.
* Sparse testing on mac os.
* MinGW works as well to build the gpt4all-backend. **HOWEVER**, this package works only with MSVC-built DLLs.
### Requirements
@@ -48,11 +55,11 @@ const response = await createCompletion(ll, [
* [node-gyp](https://github.com/nodejs/node-gyp)
* all of its requirements.
* (unix) gcc version 12
* These bindings use the C++ 20 standard.
* (win) msvc version 143
* Can be obtained with visual studio 2022 build tools
* python 3
### Build
### Build (from source)
```sh
git clone https://github.com/nomic-ai/gpt4all.git
@@ -117,22 +124,27 @@ yarn test
* Handling prompting and inference of models in a threadsafe, asynchronous way.
#### docs/
### Known Issues
* Autogenerated documentation using the script `yarn docs:build`
* why your model may be spewing bull 💩
* The downloaded model is broken (just reinstall or download from official site)
* That's it so far
### Roadmap
This package is in active development, and breaking changes may happen until the api stabilizes. Here's the current todo list:
* \[x] prompt models via a threadsafe function in order to have proper non-blocking behavior in nodejs
* \[ ] createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)
* \[ ] proper unit testing (integrate with circle ci)
* \[ ] publish to npm under alpha tag `gpt4all@alpha`
* \[ ] have more people test on other platforms (mac tester needed)
* \[ ] ~~createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete it
* \[x] proper unit testing (integrate with circle ci)
* \[x] publish to npm under alpha tag `gpt4all@alpha`
* \[x] have more people test on other platforms (mac tester needed)
* \[x] switch to new pluggable backend
* \[ ] NPM bundle size reduction via optionalDependencies strategy (need help)
* Should include prebuilds to avoid painful node-gyp errors
* \[ ] createChatSession ( the python equivalent to create\_chat\_session )
### Documentation
### API Reference
<!-- Generated by documentation.js. Update this documentation by updating the source code. -->
@@ -166,13 +178,14 @@ This package is in active development, and breaking changes may happen until the
* [Parameters](#parameters-5)
* [createCompletion](#createcompletion)
* [Parameters](#parameters-6)
* [Examples](#examples)
* [createEmbedding](#createembedding)
* [Parameters](#parameters-7)
* [CompletionOptions](#completionoptions)
* [verbose](#verbose)
* [hasDefaultHeader](#hasdefaultheader)
* [hasDefaultFooter](#hasdefaultfooter)
* [systemPromptTemplate](#systemprompttemplate)
* [promptTemplate](#prompttemplate)
* [promptHeader](#promptheader)
* [promptFooter](#promptfooter)
* [PromptMessage](#promptmessage)
* [role](#role)
* [content](#content)
@@ -186,28 +199,31 @@ This package is in active development, and breaking changes may happen until the
* [CompletionChoice](#completionchoice)
* [message](#message)
* [LLModelPromptContext](#llmodelpromptcontext)
* [logits\_size](#logits_size)
* [tokens\_size](#tokens_size)
* [n\_past](#n_past)
* [n\_ctx](#n_ctx)
* [n\_predict](#n_predict)
* [top\_k](#top_k)
* [top\_p](#top_p)
* [logitsSize](#logitssize)
* [tokensSize](#tokenssize)
* [nPast](#npast)
* [nCtx](#nctx)
* [nPredict](#npredict)
* [topK](#topk)
* [topP](#topp)
* [temp](#temp)
* [n\_batch](#n_batch)
* [repeat\_penalty](#repeat_penalty)
* [repeat\_last\_n](#repeat_last_n)
* [context\_erase](#context_erase)
* [nBatch](#nbatch)
* [repeatPenalty](#repeatpenalty)
* [repeatLastN](#repeatlastn)
* [contextErase](#contexterase)
* [createTokenStream](#createtokenstream)
* [Parameters](#parameters-8)
* [DEFAULT\_DIRECTORY](#default_directory)
* [DEFAULT\_LIBRARIES\_DIRECTORY](#default_libraries_directory)
* [DEFAULT\_MODEL\_CONFIG](#default_model_config)
* [DEFAULT\_PROMPT\_CONTEXT](#default_prompt_context)
* [DEFAULT\_MODEL\_LIST\_URL](#default_model_list_url)
* [downloadModel](#downloadmodel)
* [Parameters](#parameters-9)
* [Examples](#examples-1)
* [Examples](#examples)
* [DownloadModelOptions](#downloadmodeloptions)
* [modelPath](#modelpath)
* [debug](#debug)
* [verbose](#verbose-1)
* [url](#url)
* [md5sum](#md5sum)
* [DownloadController](#downloadcontroller)
@@ -223,6 +239,7 @@ Type: (`"gptj"` | `"llama"` | `"mpt"` | `"replit"`)
#### ModelFile
Full list of models available
@deprecated These model names are outdated and this type will not be maintained; please use a string literal instead
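For instance, a minimal sketch of the preferred string-literal style (the file name is illustrative):

```js
import { loadModel } from '../src/gpt4all.js'

// pass the model file name as a plain string instead of a ModelFile constant
const model = await loadModel('ggml-gpt4all-j-v1.3-groovy.bin');
```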
##### gptj
@@ -367,7 +384,7 @@ By default this will download a model from the official GPT4ALL website, if a mo
* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The name of the model to load.
* `options` **(LoadModelOptions | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))?** (Optional) Additional options for loading the model.
Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<[LLModel](#llmodel)>** A promise that resolves to an instance of the loaded LLModel.
Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)<(InferenceModel | EmbeddingModel)>** A promise that resolves to an instance of the loaded model.
#### createCompletion
@@ -375,25 +392,10 @@ The nodejs equivalent to python binding's chat\_completion
##### Parameters
* `llmodel` **[LLModel](#llmodel)** The language model object.
* `model` **InferenceModel** The language model object.
* `messages` **[Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[PromptMessage](#promptmessage)>** The array of messages for the conversation.
* `options` **[CompletionOptions](#completionoptions)** The options for creating the completion.
##### Examples
```javascript
const llmodel = new LLModel(model)
const messages = [
{ role: 'system', message: 'You are a weather forecaster.' },
{ role: 'user', message: 'should i go out today?' } ]
const completion = await createCompletion(llmodel, messages, {
verbose: true,
temp: 0.9,
})
console.log(completion.choices[0].message.content)
// No, it's going to be cold and rainy.
```
Returns **[CompletionReturn](#completionreturn)** The completion result.
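A minimal usage sketch consistent with the updated signature (model name and option values are illustrative):

```js
import { createCompletion, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });
const completion = await createCompletion(model, [
    { role: 'system', content: 'You are a weather forecaster.' },
    { role: 'user', content: 'Should I go out today?' },
], { temp: 0.9 });
console.log(completion.choices[0].message.content);
```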
#### createEmbedding
@@ -403,7 +405,7 @@ meow
##### Parameters
* `llmodel` **[LLModel](#llmodel)** The language model object.
* `model` **EmbeddingModel** The language model object.
* `text` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The text to embed.
Returns **[Float32Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Float32Array)** The embedding result.
@@ -420,17 +422,30 @@ Indicates if verbose logging is enabled.
Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)
##### hasDefaultHeader
##### systemPromptTemplate
Indicates if the default header is included in the prompt.
Template for the system message. Will be put before the conversation with %1 being replaced by all system messages.
Note that if this is not defined, system messages will not be included in the prompt.
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
##### promptTemplate
Template for user messages, with %1 being replaced by the message.
Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)
##### hasDefaultFooter
##### promptHeader
Indicates if the default footer is included in the prompt.
The initial instruction for the model, placed at the top of the prompt.
Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
##### promptFooter
The last instruction for the model, appended to the end of the prompt.
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
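As an illustration, a hedged sketch of supplying these templates via CompletionOptions, reusing `model` and `messages` from the earlier examples (the template strings are assumptions, not model-specific defaults):

```js
const response = await createCompletion(model, messages, {
    // %1 is replaced by all system messages; omit to exclude them from the prompt
    systemPromptTemplate: '### System:\n%1\n',
    // %1 is replaced by each user message
    promptTemplate: '### Human:\n%1\n### Assistant:\n',
});
```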
#### PromptMessage
@@ -472,9 +487,9 @@ The result of the completion, similar to OpenAI's format.
##### model
The model name.
The model used for the completion.
Type: [ModelFile](#modelfile)
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
##### usage
@@ -502,73 +517,100 @@ Type: [PromptMessage](#promptmessage)
Model inference arguments for generating completions.
##### logits\_size
##### logitsSize
The size of the raw logits vector.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### tokens\_size
##### tokensSize
The size of the raw tokens vector.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### n\_past
##### nPast
The number of tokens in the past conversation.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### n\_ctx
##### nCtx
The number of tokens possible in the context window.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### n\_predict
##### nPredict
The number of tokens to predict.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### top\_k
##### topK
The top-k logits to sample from.
Top-K sampling selects the next token only from the top K most likely tokens predicted by the model.
It helps reduce the risk of generating low-probability or nonsensical tokens, but it may also limit
the diversity of the output. A higher value for top-K (e.g., 100) will consider more tokens and lead
to more diverse text, while a lower value (e.g., 10) will focus on the most probable tokens and generate
more conservative text. 30 - 60 is a good range for most tasks.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### top\_p
##### topP
The nucleus sampling probability threshold.
Top-P limits the selection of the next token to a subset of tokens with a cumulative probability
above a threshold P. This method, also known as nucleus sampling, finds a balance between diversity
and quality by considering both token probabilities and the number of tokens available for sampling.
When using a higher value for top-P (e.g., 0.95), the generated text becomes more diverse.
On the other hand, a lower value (e.g., 0.1) produces more focused and conservative text.
The default value is 0.4, which aims for a middle ground between focus and diversity; for more
creative tasks a higher top-P value is beneficial, and 0.5 - 0.9 is a good range for that.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### temp
The temperature to adjust the model's output distribution.
Temperature is like a knob that adjusts how creative or focused the output becomes. Higher temperatures
(e.g., 1.2) increase randomness, resulting in more imaginative and diverse text. Lower temperatures (e.g., 0.5)
make the output more focused, predictable, and conservative. When the temperature is set to 0, the output
becomes completely deterministic, always selecting the most probable next token and producing identical results
each time. A safe range is around 0.6 - 0.85, but you are free to experiment with what value fits best for you.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### n\_batch
##### nBatch
The number of prompt tokens processed in parallel.
By splitting the prompt every N tokens, prompt-batch-size reduces RAM usage during processing. However,
this can increase the processing time as a trade-off. If the N value is set too low (e.g., 10), long prompts
with 500+ tokens will be most affected, requiring numerous processing runs to complete the prompt processing.
To ensure optimal performance, setting the prompt-batch-size to 2048 allows processing of all tokens in a single run.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### repeat\_penalty
##### repeatPenalty
The penalty factor for repeated tokens.
Repeat-penalty can help penalize tokens based on how frequently they occur in the text, including the input prompt.
A token that has already appeared five times is penalized more heavily than a token that has appeared only once.
A value of 1 means that there is no penalty and values larger than 1 discourage repeated tokens.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### repeat\_last\_n
##### repeatLastN
The number of last tokens to penalize.
The repeat-penalty-tokens N option controls the number of tokens in the history to consider for penalizing repetition.
A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only
consider recent tokens.
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)
##### context\_erase
##### contextErase
The percentage of context to erase if the context window is exceeded.
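Taken together, a sketch of passing these camelCased context options through CompletionOptions, reusing `model` and `messages` from the earlier examples (all values are illustrative, within the ranges discussed above):

```js
const completion = await createCompletion(model, messages, {
    nPredict: 128,       // cap on generated tokens
    topK: 40,            // within the suggested 30 - 60 range
    topP: 0.4,           // the documented default
    temp: 0.7,           // within the safe 0.6 - 0.85 range
    repeatPenalty: 1.18, // > 1 discourages repetition (illustrative value)
    repeatLastN: 64,     // how far back to scan for repeats (illustrative value)
    contextErase: 0.5,   // erase half the context if the window overflows
});
```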
@@ -602,21 +644,39 @@ This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
#### DEFAULT\_MODEL\_CONFIG
Default model configuration.
Type: ModelConfig
#### DEFAULT\_PROMPT\_CONTEXT
Default prompt context.
Type: [LLModelPromptContext](#llmodelpromptcontext)
#### DEFAULT\_MODEL\_LIST\_URL
Default model list url.
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
#### downloadModel
Initiates the download of a model file of a specific model type.
Initiates the download of a model file.
By default this downloads without waiting. Use the returned controller to alter this behavior.
##### Parameters
* `modelName` **[ModelFile](#modelfile)** The model file to be downloaded.
* `options` **DownloadOptions** to pass into the downloader. Default is { location: (cwd), debug: false }.
* `modelName` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** The model to be downloaded.
* `options` **DownloadOptions** to pass into the downloader. Default is { location: (cwd), verbose: false }.
##### Examples
```javascript
const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
controller.promise().then(() => console.log('Downloaded!'))
const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin')
download.promise.then(() => console.log('Downloaded!'))
```
* Throws **[Error](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error)** If the model already exists in the specified location.
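And a sketch with explicit options, assuming the DownloadModelOptions fields documented below (enable md5sum with a checksum from models.json to verify the file):

```js
import { downloadModel } from '../src/gpt4all.js'

const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin', {
    modelPath: './models', // download destination; defaults to process.cwd()
    verbose: true,         // log download timing
    // md5sum: '<md5 from models.json>', // uncomment to verify the download
});
const modelConfig = await download.promise;
```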
@@ -635,7 +695,7 @@ Default is process.cwd(), or the current working directory
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
##### debug
##### verbose
Debug mode -- check how long it took to download in seconds
@@ -643,15 +703,16 @@ Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Glob
##### url
Remote download url. Defaults to `https://gpt4all.io/models`
Remote download url. Defaults to `https://gpt4all.io/models/<modelName>`
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
##### md5sum
Whether to verify the hash of the download to ensure a proper download occurred.
MD5 sum of the model file. If this is provided, the downloaded file will be checked against this sum.
If the sums do not match, an error will be thrown and the file will be deleted.
Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)
Type: [string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)
#### DownloadController
@@ -659,12 +720,12 @@ Model download controller.
##### cancel
Cancel the request to download from gpt4all website if this is called.
Cancel the request to download if this is called.
Type: function (): void
##### promise
Convert the downloader into a promise, allowing people to await and manage its lifetime
A promise resolving to the downloaded model's config once the download is done
Type: function (): [Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<void>
Type: [Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<ModelConfig>
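For completeness, a hedged sketch of cancelling an in-flight download (assuming cancellation settles the pending promise with an error):

```js
import { downloadModel } from '../src/gpt4all.js'

const download = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin');

// abort shortly after starting; the exact rejection behavior is an assumption
setTimeout(() => download.cancel(), 1000);

download.promise
    .then((config) => console.log('Downloaded!', config))
    .catch((err) => console.error('Download did not complete:', err));
```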