Mirror of https://github.com/nomic-ai/gpt4all.git, synced 2025-09-02 17:15:18 +00:00
Embed4All: optionally count tokens, misc fixes (#2145)
Key changes:

* python: optionally return token count in Embed4All.embed
* python and docs: models2.json -> models3.json
* Embed4All: require explicit prefix for unknown models
* llamamodel: fix shouldAddBOS for Bert and Nomic Bert

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
@@ -7,7 +7,7 @@ It is optimized to run 7-13B parameter LLMs on the CPU's of any computer running
 The GPT4All Chat UI supports models from all newer versions of `llama.cpp` with `GGUF` models including the `Mistral`, `LLaMA2`, `LLaMA`, `OpenLLaMa`, `Falcon`, `MPT`, `Replit`, `Starcoder`, and `Bert` architectures
 
-GPT4All maintains an official list of recommended models located in [models2.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
+GPT4All maintains an official list of recommended models located in [models3.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models3.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
 
 #### Sideloading any GGUF model
 
 If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:
@@ -61,12 +61,12 @@ or `allowDownload=true` (default), a model is automatically downloaded into `.ca
 unless it already exists.
 
 In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
-checksum by comparing it with the one listed in [models2.json].
+checksum by comparing it with the one listed in [models3.json].
 
 As an alternative to the basic downloader built into the bindings, you can choose to download from the
 <https://gpt4all.io/> website instead. Scroll down to 'Model Explorer' and pick your preferred model.
 
-[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json
+[models3.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models3.json
 
 #### I need the chat GUI and bindings to behave the same
@@ -93,7 +93,7 @@ The chat GUI and bindings are based on the same backend. You can make them behav
 - Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings.
   - Specifically, in Python:
     - With simple `generate()` calls, the input has to be surrounded with system and prompt templates.
-    - When using a chat session, it depends on whether the bindings are allowed to download [models2.json]. If yes,
+    - When using a chat session, it depends on whether the bindings are allowed to download [models3.json]. If yes,
       and in the chat GUI the default templates are used, it'll be handled automatically. If no, use
       `chat_session()` template parameters to customize them.
@@ -38,7 +38,7 @@ The GPT4All software ecosystem is compatible with the following Transformer arch
 - `MPT` (including `Replit`)
 - `GPT-J`
 
-You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json)
+You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models3.json)
 
 GPT4All models are artifacts produced through a process known as neural network quantization.
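A back-of-envelope calculation shows why quantization matters for running these models locally: weight storage scales with parameters times bits per weight. Comparing a 7B-parameter model at 16-bit versus 4-bit weights (ignoring per-block scale metadata that real GGUF quant formats add):

```python
# Rough model-size arithmetic: parameters * bits_per_weight / 8 bytes.
# Real Q4_0 GGUF files are slightly larger due to per-block scales.
params = 7_000_000_000
fp16_gb = params * 16 / 8 / 1e9  # 16-bit float weights
q4_gb = params * 4 / 8 / 1e9     # 4-bit quantized weights
print(f"{fp16_gb:.1f} GB vs {q4_gb:.1f} GB")  # 14.0 GB vs 3.5 GB
```

This 4x reduction is what brings 7-13B models within reach of ordinary desktop RAM.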