Switch to new models2.json for new gguf release and bump our version to 2.5.0.
This commit is contained in:
Adam Treat
2023-10-05 09:56:40 -04:00
parent 088afada49
commit ea66669cef
13 changed files with 167 additions and 22 deletions

@@ -7,7 +7,7 @@ It is optimized to run 7-13B parameter LLMs on the CPU's of any computer running
## Running LLMs on CPU
The GPT4All Chat UI supports models from all newer versions of `GGML` and `llama.cpp`, including the `LLaMA`, `MPT`, `replit`, `GPT-J` and `falcon` architectures.
-GPT4All maintains an official list of recommended models located in [models.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
+GPT4All maintains an official list of recommended models located in [models2.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
#### Sideloading any GGML model
If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:

@@ -61,12 +61,12 @@ or `allowDownload=true` (default), a model is automatically downloaded into `.ca
unless it already exists.
In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
-checksum by comparing it with the one listed in [models.json].
+checksum by comparing it with the one listed in [models2.json].
As an alternative to the basic downloader built into the bindings, you can choose to download from the
<https://gpt4all.io/> website instead. Scroll down to 'Model Explorer' and pick your preferred model.
-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json
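The manual MD5 verification suggested above can be sketched in Python; the expected hash and model path below are placeholders for illustration, not real entries from models2.json:

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the md5sum listed for the model in models2.json;
# both values below are hypothetical placeholders.
EXPECTED_MD5 = "0123456789abcdef0123456789abcdef"
# if md5_of_file("/path/to/model.gguf") != EXPECTED_MD5:
#     print("download looks corrupted; retry or verify manually")
```

Reading in chunks keeps memory use flat even for multi-gigabyte model files.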
#### I need the chat GUI and bindings to behave the same
@@ -93,7 +93,7 @@ The chat GUI and bindings are based on the same backend. You can make them behav
- Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings.
- Specifically, in Python:
- With simple `generate()` calls, the input has to be surrounded with system and prompt templates.
-- When using a chat session, it depends on whether the bindings are allowed to download [models.json]. If yes,
+- When using a chat session, it depends on whether the bindings are allowed to download [models2.json]. If yes,
and in the chat GUI the default templates are used, it'll be handled automatically. If no, use
`chat_session()` template parameters to customize them.
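Surrounding the input yourself for plain `generate()` calls might look like the following sketch; the template strings here are illustrative stand-ins, not the bindings' actual defaults:

```python
def apply_templates(user_input,
                    system_template="",
                    prompt_template="### Human:\n{0}\n### Assistant:\n"):
    """Surround raw input with system and prompt templates before generate()."""
    return system_template + prompt_template.format(user_input)

wrapped = apply_templates("Name three colors.")
```

With `chat_session()`, the equivalent strings are passed as its template parameters instead.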

@@ -8,7 +8,7 @@ import modal
def download_model():
import gpt4all
-#you can use any model from https://gpt4all.io/models/models.json
+#you can use any model from https://gpt4all.io/models/models2.json
return gpt4all.GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")
image=modal.Image.debian_slim().pip_install("gpt4all").run_function(download_model)
@@ -31,4 +31,4 @@ def main():
model = GPT4All()
for i in range(10):
model.generate.call()
-```
+```

@@ -77,10 +77,10 @@ When using GPT4All models in the `chat_session` context:
- Consecutive chat exchanges are taken into account and not discarded until the session ends, as long as the model has capacity.
- Internal K/V caches are preserved from previous conversation history, speeding up inference.
- The model is given a system and prompt template which make it chatty. Depending on `allow_download=True` (default),
-it will obtain the latest version of [models.json] from the repository, which contains specifically tailored templates
+it will obtain the latest version of [models2.json] from the repository, which contains specifically tailored templates
for models. Conversely, if it is not allowed to download, it falls back to default templates instead.
-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json
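The tailored templates mentioned above live inside per-model entries of that metadata file. A minimal sketch of reading one, with an abridged entry whose field names are assumed from the models.json schema:

```python
import json

# Abridged, hypothetical entry; field names assumed from the models.json schema.
entry_json = """
{
  "name": "Example Model",
  "filename": "example-model.Q4_0.gguf",
  "md5sum": "00000000000000000000000000000000",
  "promptTemplate": "### Human:\\n%1\\n### Assistant:\\n",
  "systemPrompt": ""
}
"""

entry = json.loads(entry_json)
# GPT4All templates use %1 as the user-input placeholder.
prompt = entry["systemPrompt"] + entry["promptTemplate"].replace("%1", "Hello!")
```

When downloading is allowed, the bindings can look up the entry matching the loaded model file and apply its templates automatically.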
### Streaming Generations
@@ -379,7 +379,7 @@ logging infrastructure offers [many more customization options][py-logging-cookb
### Without Online Connectivity
To prevent GPT4All from accessing online resources, instantiate it with `allow_download=False`. This will disable both
-downloading missing models and [models.json], which contains information about them. As a result, predefined templates
+downloading missing models and [models2.json], which contains information about them. As a result, predefined templates
are used instead of model-specific system and prompt templates:
=== "GPT4All Default Templates Example"

@@ -38,7 +38,7 @@ The GPT4All software ecosystem is compatible with the following Transformer arch
- `MPT` (including `Replit`)
- `GPT-J`
-You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json)
+You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json)
GPT4All models are artifacts produced through a process known as neural network quantization.
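The core idea of quantization can be illustrated with a toy symmetric int8 scheme; this is only a sketch of the concept, not llama.cpp's actual GGUF quantization formats:

```python
def quantize_int8(values):
    """Toy symmetric quantization: map floats onto the int8 range [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # guard all-zero input
    return [round(v / scale) for v in values], scale

def dequantize(quants, scale):
    """Recover approximate floats from the quantized integers."""
    return [q * scale for q in quants]

weights = [0.1, -0.5, 0.25]
quants, scale = quantize_int8(weights)
approx = dequantize(quants, scale)  # close to the originals, within one scale step
```

Storing small integers plus a scale factor instead of full-precision floats is what shrinks model files enough to run on consumer CPUs, at a small cost in accuracy.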