Python Bindings: Improved unit tests, documentation and unification of API (#1090)
* Makefiles, black, isort
* Black and isort
* Unit tests and generation method
* Chat context provider
* Context does not reset
* Current state
* Fixup
* Python bindings with unit tests
* GPT4All Python Bindings: chat contexts, tests
* New python bindings and backend fixes
* Black and Isort
* Documentation error
* Preserved n_predict for backwards compat with langchain

Co-authored-by: Adam Treat <treat.adam@gmail.com>
@@ -6,6 +6,19 @@ Nomic AI oversees contributions to the open-source ecosystem ensuring quality, s

GPT4All software is optimized to run inference of 7-13 billion parameter large language models on the CPUs of laptops, desktops and servers.

=== "GPT4All Example"
    ``` py
    from gpt4all import GPT4All
    model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")
    output = model.generate("The capital of France is ", max_tokens=3)
    print(output)
    ```
=== "Output"
    ```
    1. Paris
    ```
See [Python Bindings](gpt4all_python.md) to use GPT4All.
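The commit message above mentions chat contexts. A minimal sketch of how a multi-turn exchange might look with the new bindings, assuming a `chat_session()` context manager introduced by this work (check the Python bindings docs for the exact API):

``` py
from gpt4all import GPT4All

model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")

# `chat_session()` is assumed from the chat-context work described in this
# commit; inside the block, earlier turns are kept as context for later ones.
with model.chat_session():
    first = model.generate("What is the capital of France?", max_tokens=30)
    follow_up = model.generate("And what is its population?", max_tokens=30)
    print(first)
    print(follow_up)
```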
### Navigating the Documentation
In an effort to ensure cross-operating system and cross-language compatibility, the [GPT4All software ecosystem](https://github.com/nomic-ai/gpt4all)
is organized as a monorepo with the following structure:

@@ -18,31 +31,31 @@ This C API is then bound to any higher level programming language such as C++, P

Explore detailed documentation for the backend, bindings and chat client in the sidebar.

## Models
The GPT4All software ecosystem is compatible with the following Transformer architectures:

- `Falcon`
- `LLaMA` (including `OpenLLaMA`)
- `MPT` (including `Replit`)
- `GPTJ`

You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json).
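The models directory is plain JSON, so you can also inspect it programmatically. A minimal sketch using only the standard library and the URL given above; the `filename` field is an assumption about the models.json schema, which may change:

``` py
import json
import urllib.request

# URL of the models directory referenced above.
MODELS_URL = "https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json"

with urllib.request.urlopen(MODELS_URL) as response:
    models = json.load(response)

# Each entry is a dict; the exact schema may change, so fall back gracefully.
for entry in models:
    print(entry.get("filename", entry))
```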

GPT4All models are artifacts produced through a process known as neural network quantization.
A multi-billion parameter transformer decoder usually takes 30+ GB of VRAM to execute a forward pass.
Most people do not have such a powerful computer or access to GPU hardware. By running trained LLMs through quantization algorithms,
GPT4All models can run on your laptop using only 4-8GB of RAM enabling their widespread utility.
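The memory savings follow directly from bytes-per-parameter arithmetic. A rough back-of-the-envelope sketch; the 7-billion-parameter figure and bit widths are illustrative assumptions, and real quantization formats add per-block scale overhead:

``` py
def approx_model_size_gb(n_params: float, bits_per_param: float) -> float:
    """Rough in-memory footprint of the weights alone, ignoring
    activation buffers and runtime overhead."""
    return n_params * bits_per_param / 8 / 1e9

n_params = 7e9  # e.g. a 7-billion-parameter decoder

print(f"fp16: {approx_model_size_gb(n_params, 16):.1f} GB")  # ~14 GB
print(f"4-bit: {approx_model_size_gb(n_params, 4):.1f} GB")  # ~3.5 GB
```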

Any model trained with one of these architectures can be quantized and run locally with all GPT4All bindings and in the
chat client. You can add new variants by contributing to the gpt4all-backend.

You can find an exhaustive list of pre-quantized models on the [website](https://gpt4all.io) or in the download pane of the chat client.
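If you already have a quantized model file on disk, the Python bindings can point at it directly. A minimal sketch; the `model_path` and `allow_download` parameter names are assumptions about the `GPT4All` constructor of this era, so check the bindings documentation for the exact signature:

``` py
from gpt4all import GPT4All

# Load a pre-quantized model file from a local directory instead of
# downloading it; parameter names are assumptions and may differ in
# your installed version of the bindings.
model = GPT4All(
    "orca-mini-3b.ggmlv3.q4_0.bin",
    model_path="/path/to/models",  # directory containing the .bin file
    allow_download=False,          # fail instead of fetching from the web
)

print(model.generate("The capital of France is ", max_tokens=3))
```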

## Frequently Asked Questions
Find answers to frequently asked questions by searching the [GitHub issues](https://github.com/nomic-ai/gpt4all/issues) or in the [documentation FAQ](gpt4all_faq.md).

## Getting the Most out of Your Local LLM

**Inference Speed**

Inference speed of a local LLM depends on two factors: model size and the number of tokens given as input.
It is not advised to prompt local LLMs with large chunks of context as their inference speed will heavily degrade.
You will likely want to run GPT4All models on GPU if you would like to utilize context windows larger than 750 tokens. Native GPU support for GPT4All models is planned.
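To see how model size and prompt length affect throughput on your own hardware, you can time a generation call. A minimal sketch; whitespace-splitting the output is only a rough proxy for the true token count:

``` py
import time
from gpt4all import GPT4All

model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")

prompt = "Summarize the benefits of local inference in one sentence: "

start = time.perf_counter()
output = model.generate(prompt, max_tokens=64)
elapsed = time.perf_counter() - start

# Whitespace-splitting only approximates the real token count.
approx_tokens = len(output.split())
print(f"{approx_tokens} tokens in {elapsed:.1f}s "
      f"(~{approx_tokens / elapsed:.1f} tokens/s)")
```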