small edits and placeholder gif (#2513)

* small edits and placeholder gif

Signed-off-by: Max Cembalest <max@nomic.ai>

* jul2 docs updates

Signed-off-by: Max Cembalest <max@nomic.ai>

* added video

Signed-off-by: mcembalest <70534565+mcembalest@users.noreply.github.com>
Signed-off-by: Max Cembalest <max@nomic.ai>

* quantization nits

Signed-off-by: Max Cembalest <max@nomic.ai>

---------

Signed-off-by: Max Cembalest <max@nomic.ai>
Signed-off-by: mcembalest <70534565+mcembalest@users.noreply.github.com>
2025-09-04 18:11:02 +00:00 · 2024-07-02 11:41:39 -04:00
parent b7d1b938cc
commit 69102a2859
7 changed files with 61 additions and 60 deletions
--- a/gpt4all-bindings/python/docs/assets/ubuntu.svg
+++ b/gpt4all-bindings/python/docs/assets/ubuntu.svg
@@ -0,0 +1,5 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<svg xmlns="http://www.w3.org/2000/svg" width="285" height="285" viewBox="-142.5 -142.5 285 285" xmlns:xlink="http://www.w3.org/1999/xlink">
+<circle fill="#FFFFFF" r="141.732"/><g id="U" fill="#DD4814"><circle cx="-96.3772" r="18.9215"/>
+<path d="M-45.6059,68.395C-62.1655,57.3316-74.4844,40.4175-79.6011,20.6065-73.623,15.7354-69.8047,8.3164-69.8047,0-69.8047-8.3164-73.623-15.7354-79.6011-20.6065-74.4844-40.4175-62.1655-57.3316-45.6059-68.395L-31.7715-45.2212C-45.9824-35.2197-55.2754-18.7026-55.2754,0-55.2754,18.7026-45.9824,35.2197-31.7715,45.2212Z"/></g>
+<use xlink:href="#U" transform="rotate(120)"/><use xlink:href="#U" transform="rotate(240)"/></svg>
--- a/gpt4all-bindings/python/docs/gpt4all_desktop/models.md
+++ b/gpt4all-bindings/python/docs/gpt4all_desktop/models.md
@@ -56,13 +56,13 @@ Many LLMs are available at various sizes, quantizations, and licenses.

 Here are a few examples:

-| Model| Filesize| RAM Required| Parameters| Developer| License| MD5 Sum (Unique Hash)|
-|------|---------|-------------|-----------|----------|--------|----------------------|
-| Llama 3 Instruct  | 4.66 GB| 8 GB| 8 Billion| Meta| [Llama 3 License](https://llama.meta.com/llama3/license/)| c87ad09e1e4c8f9c35a5fcef52b6f1c9|
-| Nous Hermes 2 Mistral DPO| 4.21 GB| 8 GB| 7 Billion| Mistral & Nous Research | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)| a5f6b4eabd3992da4d7fb7f020f921eb|
-| Phi-3 Mini Instruct | 2.03 GB| 4 GB| 4 billion| Microsoft| [MIT](https://opensource.org/license/mit)| f8347badde9bfc2efbe89124d78ddaf5|
-| Mini Orca (Small)| 1.84 GB| 4 GB| 3 billion| Microsoft | [CC-BY-NC-SA-4.0](https://spdx.org/licenses/CC-BY-NC-SA-4.0)| 0e769317b90ac30d6e09486d61fefa26|
-| GPT4All Snoozy| 7.36 GB| 16 GB| 13 billion| Nomic AI| [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html)| 40388eb2f8d16bb5d08c96fdfaac6b2c|
+| Model| Filesize| RAM Required| Parameters| Quantization| Developer| License| MD5 Sum (Unique Hash)|
+|------|---------|-------------|-----------|-------------|----------|--------|----------------------|
+| Llama 3 Instruct  | 4.66 GB| 8 GB| 8 Billion| q4_0| Meta| [Llama 3 License](https://llama.meta.com/llama3/license/)| c87ad09e1e4c8f9c35a5fcef52b6f1c9|
+| Nous Hermes 2 Mistral DPO| 4.11 GB| 8 GB| 7 Billion| q4_0| Mistral & Nous Research | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)| a5f6b4eabd3992da4d7fb7f020f921eb|
+| Phi-3 Mini Instruct | 2.18 GB| 4 GB| 4 billion| q4_0| Microsoft| [MIT](https://opensource.org/license/mit)| f8347badde9bfc2efbe89124d78ddaf5|
+| Mini Orca (Small)| 1.98 GB| 4 GB| 3 billion| q4_0| Microsoft | [CC-BY-NC-SA-4.0](https://spdx.org/licenses/CC-BY-NC-SA-4.0)| 0e769317b90ac30d6e09486d61fefa26|
+| GPT4All Snoozy| 7.37 GB| 16 GB| 13 billion| q4_0| Nomic AI| [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html)| 40388eb2f8d16bb5d08c96fdfaac6b2c|

 ### Search Results

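The `MD5 Sum (Unique Hash)` column in the models table above can be used to check that a downloaded model file is intact. The following is a minimal sketch, assuming the model file is already on disk. The helper names `md5_of_file` and `matches_md5` are hypothetical and are not part of GPT4All.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte model files
    do not have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected_md5):
    """Compare a file's MD5 digest with the value listed in the docs table."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_md5(model_path, "c87ad09e1e4c8f9c35a5fcef52b6f1c9")` would check a local copy of Llama 3 Instruct against the hash in the table, where `model_path` is wherever the file was saved on your machine.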
--- a/gpt4all-bindings/python/docs/gpt4all_help/faq.md
+++ b/gpt4all-bindings/python/docs/gpt4all_help/faq.md
@@ -4,17 +4,11 @@

 ### Which language models are supported?

-Our backend supports models with a `llama.cpp` implementation which have been uploaded to [HuggingFace](https://huggingface.co/).
+We support models with a `llama.cpp` implementation which have been uploaded to [HuggingFace](https://huggingface.co/).

 ### Which embedding models are supported?

-The following embedding models can be used within the application and with the `Embed4All` class from the `gpt4all` Python library. The default context length as GGUF files is 2048 but can be [extended](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF#description).
-
-| Name               | Initializing with `Embed4All`                            | Context Length | Embedding Length | File Size |
-|--------------------|------------------------------------------------------|---------------:|-----------------:|----------:|
-| [SBert](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)| ```pythonemb = Embed4All("all-MiniLM-L6-v2.gguf2.f16.gguf")```|            512 |              384 |    44 MiB |
-| [Nomic Embed v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1-GGUF)   | nomic&#x2011;embed&#x2011;text&#x2011;v1.f16.gguf|           2048 |              768 |   262 MiB |
-| [Nomic Embed v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF) | nomic&#x2011;embed&#x2011;text&#x2011;v1.5.f16.gguf|           2048 |           64-768 |   262 MiB |
+We support SBert and Nomic Embed Text v1 & v1.5.

 ## Software

--- a/gpt4all-bindings/python/docs/gpt4all_python/home.md
+++ b/gpt4all-bindings/python/docs/gpt4all_python/home.md
@@ -23,6 +23,15 @@ Models are loaded by name via the `GPT4All` class. If it's your first time loadi
        print(model.generate("How can I run LLMs efficiently on my laptop?", max_tokens=1024))
    ```

+| `GPT4All` model name| Filesize| RAM Required| Parameters| Quantization| Developer| License| MD5 Sum (Unique Hash)|
+|------|---------|-------|-------|-----------|----------|--------|----------------------|
+|  `Meta-Llama-3-8B-Instruct.Q4_0.gguf`| 4.66 GB| 8 GB| 8 Billion| q4_0| Meta| [Llama 3 License](https://llama.meta.com/llama3/license/)| c87ad09e1e4c8f9c35a5fcef52b6f1c9|
+| `Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf`| 4.11 GB| 8 GB| 7 Billion| q4_0| Mistral & Nous Research | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)| a5f6b4eabd3992da4d7fb7f020f921eb|
+| `Phi-3-mini-4k-instruct.Q4_0.gguf` | 2.18 GB| 4 GB| 3.8 billion| q4_0| Microsoft| [MIT](https://opensource.org/license/mit)| f8347badde9bfc2efbe89124d78ddaf5|
+| `orca-mini-3b-gguf2-q4_0.gguf`| 1.98 GB| 4 GB| 3 billion| q4_0| Microsoft | [CC-BY-NC-SA-4.0](https://spdx.org/licenses/CC-BY-NC-SA-4.0)| 0e769317b90ac30d6e09486d61fefa26|
+| `gpt4all-13b-snoozy-q4_0.gguf`| 7.37 GB| 16 GB| 13 billion| q4_0| Nomic AI| [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html)| 40388eb2f8d16bb5d08c96fdfaac6b2c|
+
+
 ## Chat Session Generation

 Most of the language models you will be able to access from HuggingFace have been trained as assistants. This guides language models to not just answer with relevant text, but *helpful* text.
@@ -75,16 +84,6 @@ If you want your LLM's responses to be helpful in the typical sense, we recommen
        b = 5
        ```

-## Example Models
-
-| Model| Filesize| RAM Required| Parameters| Developer| License| MD5 Sum (Unique Hash)|
-|------|---------|-------------|-----------|----------|--------|----------------------|
-| `Meta-Llama-3-8B-Instruct.Q4_0.gguf`  | 4.66 GB| 8 GB| 8 Billion| Meta| [Llama 3 License](https://llama.meta.com/llama3/license/)| c87ad09e1e4c8f9c35a5fcef52b6f1c9|
-| Nous Hermes 2 Mistral DPO| 4.21 GB| 8 GB| 7 Billion| Mistral & Nous Research | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)| a5f6b4eabd3992da4d7fb7f020f921eb|
-| Phi-3 Mini Instruct | 2.03 GB| 4 GB| 4 billion| Microsoft| [MIT](https://opensource.org/license/mit)| f8347badde9bfc2efbe89124d78ddaf5|
-| Mini Orca (Small)| 1.84 GB| 4 GB| 3 billion| Microsoft | [CC-BY-NC-SA-4.0](https://spdx.org/licenses/CC-BY-NC-SA-4.0)| 0e769317b90ac30d6e09486d61fefa26|
-| GPT4All Snoozy| 7.36 GB| 16 GB| 13 billion| Nomic AI| [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html)| 40388eb2f8d16bb5d08c96fdfaac6b2c|
-
 ## Direct Generation

 Directly calling `model.generate()` prompts the model without applying any templates. 
@@ -150,3 +149,11 @@ The easiest way to run the text embedding model locally uses the [`nomic`](https
 ![Nomic embed text local inference](../assets/local_embed.gif)

 To learn more about making embeddings locally with `nomic`, visit our [embeddings guide](https://docs.nomic.ai/atlas/guides/embeddings#local-inference).
+
+The following embedding models can be used within the application and with the `Embed4All` class from the `gpt4all` Python library. The Nomic Embed GGUF files default to a context length of 2048, which can be [extended](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF#description).
+
+| Name| Using with `nomic`| `Embed4All` model name| Context Length| # Embedding Dimensions| File Size|
+|--------------------|-|------------------------------------------------------|---------------:|-----------------:|----------:|
+| [Nomic Embed v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1-GGUF)   | ```embed.text(strings, model="nomic-embed-text-v1", inference_mode="local")```| ```Embed4All("nomic-embed-text-v1.f16.gguf")```|           2048 |              768 |   262 MiB |
+| [Nomic Embed v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF) | ```embed.text(strings, model="nomic-embed-text-v1.5", inference_mode="local")```| ```Embed4All("nomic-embed-text-v1.5.f16.gguf")``` |           2048| 64-768 |   262 MiB |
+| [SBert](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)| n/a| ```Embed4All("all-MiniLM-L6-v2.gguf2.f16.gguf")```|            512 |              384 |    44 MiB |
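
The embedding table above maps each model to an `Embed4All` file name and a range of embedding dimensions. The following is a rough sketch of how those values might be used from Python. The `EmbeddingModel` dataclass and the `check_dimensionality` and `embed_texts` helpers are hypothetical names introduced here for illustration. The sketch also assumes that the installed `gpt4all` version accepts a `dimensionality` keyword on `Embed4All.embed`, which should be checked against the installed package before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingModel:
    filename: str        # value passed to Embed4All(...)
    context_length: int  # default context length from the table
    min_dim: int         # smallest embedding size listed
    max_dim: int         # largest embedding size listed


# Values copied from the embedding table above.
EMBEDDING_MODELS = {
    "Nomic Embed v1": EmbeddingModel("nomic-embed-text-v1.f16.gguf", 2048, 768, 768),
    "Nomic Embed v1.5": EmbeddingModel("nomic-embed-text-v1.5.f16.gguf", 2048, 64, 768),
    "SBert": EmbeddingModel("all-MiniLM-L6-v2.gguf2.f16.gguf", 512, 384, 384),
}


def check_dimensionality(name, dimensionality):
    """Return the table entry for `name`, or raise if the size is out of range."""
    model = EMBEDDING_MODELS[name]
    if not model.min_dim <= dimensionality <= model.max_dim:
        raise ValueError(
            f"{name} supports {model.min_dim}-{model.max_dim} dimensions, "
            f"got {dimensionality}"
        )
    return model


def embed_texts(name, texts, dimensionality):
    """Embed `texts` locally; the first call may download the model file."""
    model = check_dimensionality(name, dimensionality)
    # Imported lazily so the helpers above work without gpt4all installed.
    from gpt4all import Embed4All

    embedder = Embed4All(model.filename)
    if model.min_dim == model.max_dim:
        # Fixed-size models: no dimensionality argument is needed.
        return embedder.embed(list(texts))
    # Assumption: `dimensionality` is accepted by the installed gpt4all version.
    return embedder.embed(list(texts), dimensionality=dimensionality)
```

Calling `embed_texts("Nomic Embed v1.5", ["hello world"], 256)` would request 256-dimensional vectors, while a request such as `check_dimensionality("SBert", 768)` fails early with a `ValueError` because the table lists only 384 dimensions for SBert.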