WIP: remove bindings and all references to them
Signed-off-by: Jared Van Bortel <jared@nomic.ai>

BIN docs/assets/add.png (new file, 9.9 KiB)
BIN docs/assets/add_model_gpt4.png (new file, 188 KiB)
BIN docs/assets/attach_spreadsheet.png (new file, 30 KiB)
BIN docs/assets/baelor.png (new file, 127 KiB)
BIN docs/assets/before_first_chat.png (new file, 237 KiB)
BIN docs/assets/chat_window.png (new file, 66 KiB)
BIN docs/assets/closed_chat_panel.png (new file, 686 KiB)
BIN docs/assets/configure_doc_collection.png (new file, 113 KiB)
BIN docs/assets/disney_spreadsheet.png (new file, 272 KiB)
BIN docs/assets/download.png (new file, 9.6 KiB)
BIN docs/assets/download_llama.png (new file, 82 KiB)
BIN docs/assets/explore.png (new file, 49 KiB)
BIN docs/assets/explore_models.png (new file, 319 KiB)
BIN docs/assets/favicon.ico (new file, 15 KiB)
BIN docs/assets/good_tyrion.png (new file, 531 KiB)
BIN docs/assets/got_docs_ready.png (new file, 230 KiB)
BIN docs/assets/got_done.png (new file, 120 KiB)
BIN docs/assets/gpt4all_home.png (new file, 500 KiB)
BIN docs/assets/gpt4all_xlsx_attachment.mp4 (new file)
BIN docs/assets/installed_models.png (new file, 349 KiB)
BIN docs/assets/linux.png (new file, 297 KiB)
BIN docs/assets/local_embed.gif (new file, 2.2 MiB)
BIN docs/assets/mac.png (new file, 4.6 KiB)
BIN docs/assets/models_page_icon.png (new file, 9.4 KiB)
BIN docs/assets/new_docs_annotated.png (new file, 135 KiB)
BIN docs/assets/new_docs_annotated_filled.png (new file, 131 KiB)
BIN docs/assets/new_first_chat.png (new file, 142 KiB)
BIN docs/assets/no_docs.png (new file, 192 KiB)
BIN docs/assets/no_models.png (new file, 188 KiB)
BIN docs/assets/no_models_tiny.png (new file, 45 KiB)
BIN docs/assets/nomic.png (new file, 25 KiB)
BIN docs/assets/obsidian_adding_collection.png (new file, 584 KiB)
BIN docs/assets/obsidian_docs.png (new file, 287 KiB)
BIN docs/assets/obsidian_response.png (new file, 398 KiB)
BIN docs/assets/obsidian_sources.png (new file, 25 KiB)
BIN docs/assets/open_chat_panel.png (new file, 712 KiB)
BIN docs/assets/open_local_docs.png (new file, 404 KiB)
BIN docs/assets/open_sources.png (new file, 57 KiB)
BIN docs/assets/osbsidian_user_interaction.png (new file, 26 KiB)
BIN docs/assets/search_mistral.png (new file, 246 KiB)
BIN docs/assets/search_settings.png (new file, 232 KiB)
BIN docs/assets/spreadsheet_chat.png (new file, 448 KiB)
BIN docs/assets/syrio_snippets.png (new file, 770 KiB)
BIN docs/assets/three_model_options.png (new file, 42 KiB)

docs/assets/ubuntu.svg (new file, 5 lines, 700 B)
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" width="285" height="285" viewBox="-142.5 -142.5 285 285" xmlns:xlink="http://www.w3.org/1999/xlink">
<circle fill="#FFFFFF" r="141.732"/><g id="U" fill="#DD4814"><circle cx="-96.3772" r="18.9215"/>
<path d="M-45.6059,68.395C-62.1655,57.3316-74.4844,40.4175-79.6011,20.6065-73.623,15.7354-69.8047,8.3164-69.8047,0-69.8047-8.3164-73.623-15.7354-79.6011-20.6065-74.4844-40.4175-62.1655-57.3316-45.6059-68.395L-31.7715-45.2212C-45.9824-35.2197-55.2754-18.7026-55.2754,0-55.2754,18.7026-45.9824,35.2197-31.7715,45.2212Z"/></g>
<use xlink:href="#U" transform="rotate(120)"/><use xlink:href="#U" transform="rotate(240)"/></svg>
BIN docs/assets/windows.png (new file, 7.5 KiB)

docs/css/custom.css (new file, 5 lines)
@@ -0,0 +1,5 @@
.md-content h1,
.md-content h2 {
    margin-top: 0.5em;
    margin-bottom: 0.5em;
}

docs/gpt4all_api_server/home.md (new file, 86 lines)
@@ -0,0 +1,86 @@
# GPT4All API Server

GPT4All provides a local API server that allows you to run LLMs over an HTTP API.

## Key Features

- **Local Execution**: Run models on your own hardware for privacy and offline use.
- **LocalDocs Integration**: Run the API with relevant text snippets provided to your LLM from a [LocalDocs collection](../gpt4all_desktop/localdocs.md).
- **OpenAI API Compatibility**: Use existing OpenAI-compatible clients and tools with your local models.

## Activating the API Server

1. Open the GPT4All Chat Desktop Application.
2. Go to `Settings` > `Application` and scroll down to `Advanced`.
3. Check the box for the `"Enable Local API Server"` setting.
4. The server listens on port 4891 by default. You can choose another port number in the `"API Server Port"` setting.

## Connecting to the API Server

The base URL used for the API server is `http://localhost:4891/v1` (or `http://localhost:<PORT_NUM>/v1` if you are using a different port number).

The server accepts only HTTP connections (not HTTPS) and listens only on the IPv4 localhost address `127.0.0.1` (not, for example, on the IPv6 localhost address `::1`).

## Examples

!!! note "Example GPT4All API calls"

    === "cURL"

        ```bash
        curl -X POST http://localhost:4891/v1/chat/completions -H "Content-Type: application/json" -d '{
            "model": "Phi-3 Mini Instruct",
            "messages": [{"role":"user","content":"Who is Lionel Messi?"}],
            "max_tokens": 50,
            "temperature": 0.28
        }'
        ```

    === "PowerShell"

        ```powershell
        Invoke-WebRequest -Uri http://localhost:4891/v1/chat/completions -Method POST -ContentType application/json -Body '{
            "model": "Phi-3 Mini Instruct",
            "messages": [{"role":"user","content":"Who is Lionel Messi?"}],
            "max_tokens": 50,
            "temperature": 0.28
        }'
        ```
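
Because the server is OpenAI-compatible, you can also call it from an existing client library instead of hand-writing HTTP requests. Below is a minimal sketch using the official `openai` Python package; the model name matches the examples above, and the `api_key` value is a placeholder on the assumption that the local server does not validate it:

```python
from openai import OpenAI  # pip install openai

# Point the client at the local GPT4All server instead of api.openai.com.
client = OpenAI(base_url="http://localhost:4891/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="Phi-3 Mini Instruct",
    messages=[{"role": "user", "content": "Who is Lionel Messi?"}],
    max_tokens=50,
    temperature=0.28,
)
print(response.choices[0].message.content)
```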

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/v1/models` | List available models |
| GET | `/v1/models/<name>` | Get details of a specific model |
| POST | `/v1/completions` | Generate text completions |
| POST | `/v1/chat/completions` | Generate chat completions |
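
As a quick connectivity check, you can query the models endpoint before issuing any completion requests. A small sketch using the `requests` package, assuming the default port and an OpenAI-style `{"data": [...]}` response body:

```python
import requests  # pip install requests

# GET /v1/models lists the models the local server can serve.
resp = requests.get("http://localhost:4891/v1/models")
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model)
```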
|

## LocalDocs Integration

You can use LocalDocs with the API server:

1. Open the Chats view in the GPT4All application.
2. Scroll to the bottom of the chat history sidebar.
3. Select the server chat (it has a different background color).
4. Activate LocalDocs collections in the right sidebar.

(Note: LocalDocs can currently only be activated through the GPT4All UI, not via the API itself.)

Now, your API calls to your local LLM will have relevant references from your LocalDocs collection retrieved and placed in the input message for the LLM to respond to.

The references retrieved for your API call can be accessed in the API response object at

`response["choices"][0]["references"]`

The data included in the `references` are:

- `text`: the actual text content of the snippet that was extracted from the reference document

- `author`: the author of the reference document (if available)

- `date`: the creation date of the reference document (if available)

- `page`: the page number the snippet is from (currently only available for PDF documents)

- `title`: the title of the reference document (if available)
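
For example, the references can be read from the JSON response like any other field. A short sketch using the `requests` package (the question is illustrative, and the `.get(...)` calls guard against fields that are absent for a given document):

```python
import requests

response = requests.post(
    "http://localhost:4891/v1/chat/completions",
    json={
        "model": "Phi-3 Mini Instruct",
        "messages": [{"role": "user", "content": "What do my documents say about Jon Snow?"}],
        "max_tokens": 50,
    },
).json()

# Each entry in references describes one retrieved LocalDocs snippet.
for ref in response["choices"][0].get("references", []):
    print(ref.get("title"), ref.get("page"), ref.get("text"))
```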

docs/gpt4all_desktop/chat_templates.md (new file, 206 lines)
@@ -0,0 +1,206 @@
## What are chat templates?
Natively, large language models only know how to complete plain text and do not know the difference between their input and their output. In order to support a chat with a person, LLMs are designed to use a template to convert the conversation to plain text using a specific format.

For a given model, it is important to use an appropriate chat template, as each model is designed to work best with a specific format. The chat templates included with the built-in models should be sufficient for most purposes.

There are two reasons you would want to alter the chat template:

- You are sideloading a model and there is no chat template available,
- You would like to have greater control over the input to the LLM than a system message provides.


## What is a system message?
A system message is a message that controls the responses from the LLM in a way that affects the entire conversation. System messages can be short, such as "Speak like a pirate.", or they can be long and contain a lot of context for the LLM to keep in mind.

Not all models are designed to use a system message, so a system message works better with some models than with others.


## How do I customize the chat template or system message?
To customize the chat template or system message, go to Settings > Model. Make sure to select the correct model at the top. If you clone a model, you can use a chat template or system message different from the base model's, enabling you to use different settings for each conversation.

These settings take effect immediately. After changing them, you can click "Redo last response" in the chat view, and the response will take the new settings into account.


## Do I need to write a chat template?
You typically do not need to write your own chat template. The exception is models that are not in the official model list and do not come with a chat template built in. These will show a "Clear" option above the chat template field on the Model Settings page instead of a "Reset" option. See the sections on [finding] or [creating] a chat template.

[finding]: #how-do-i-find-a-chat-template
[creating]: #advanced-how-do-i-make-a-chat-template


## What changed in GPT4All v3.5?
GPT4All v3.5 overhauled the chat template system. There are three crucial differences:

- The chat template now formats an entire conversation instead of a single pair of messages,
- The chat template now uses Jinja syntax instead of `%1` and `%2` placeholders,
- And the system message should no longer contain control tokens or trailing whitespace.

If you are using any chat templates or system messages that had been added or altered from the default before upgrading to GPT4All v3.5 or newer, these will no longer work. See below for how to solve common errors you may see after upgrading.
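
To illustrate the syntax change, here is a hypothetical legacy template in the old placeholder style, where `%1` stood for the user's message and `%2` for the model's response (an Alpaca-style format chosen purely for illustration, not the template of any particular built-in model):

```
### Instruction:
%1
### Response:
%2
```

A Jinja template producing the same format instead loops over the whole conversation:

```jinja
{%- for message in messages %}
    {%- if message['role'] == 'user' %}
        {{- '### Instruction:\n' + message['content'] + '\n' }}
    {%- else %}
        {{- '### Response:\n' + message['content'] + '\n' }}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '### Response:\n' }}
{%- endif %}
```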
|

## Error/Warning: System message is not plain text.
This is easy to fix. Go to the model's settings and look at the system message. There are three things to look for:

- Control tokens such as `<|im_start|>`, `<|start_header_id|>`, or `<|system|>`
- A prefix such as `### System` or `SYSTEM:`
- Trailing whitespace, such as a space character or blank line.

If you see any of these things, remove them. For example, this legacy system prompt:
```
<|start_header_id|>system<|end_header_id|>
You are a helpful assistant.<|eot_id|>
```

Should become this:
```
You are a helpful assistant.
```

If you do not see anything that needs to be changed, you can dismiss the error by making a minor modification to the message and then changing it back.

If you see a warning, your system message does not appear to be plain text. If you believe this warning is incorrect, it can be safely ignored. If in doubt, ask on the [Discord].

[Discord]: https://discord.gg/mGZE39AS3e


## Error: Legacy system prompt needs to be updated in Settings.
This is the same as [above][above-1], but it appears on the chat page.

[above-1]: #errorwarning-system-message-is-not-plain-text


## Error/Warning: Chat template is not in Jinja format.
This is the result of attempting to use an old-style template (possibly from a previous version) in GPT4All 3.5+.

Go to the Model Settings page and select the affected model. If you see a "Reset" button, and you have not intentionally modified the chat template, you can click "Reset". Otherwise, this is what you can do:

1. Back up your chat template by copying it safely to a text file and saving it. In the next step, it will be removed from GPT4All.
2. Click "Reset" or "Clear".
3. If you clicked "Clear", the chat template is now gone. Follow the steps to [find][finding] or [create][creating] a basic chat template for your model.
4. Customize the chat template to suit your needs. For help, read the section about [creating] a chat template.


## Error: Legacy prompt template needs to be updated in Settings.
This is the same as [above][above-2], but it appears on the chat page.

[above-2]: #errorwarning-chat-template-is-not-in-jinja-format


## The chat template has a syntax error.
If there is a syntax error while editing the chat template, the details will be displayed in an error message above the input box. This could be because the chat template is not actually in Jinja format (see [above][above-2]).

Otherwise, you have either typed something incorrectly, or the model comes with a template that is incompatible with GPT4All. See [the below section][creating] on creating chat templates and make sure that everything is correct. When in doubt, ask on the [Discord].


## Error: No chat template configured.
This may appear for models that are not from the official model list and do not include a chat template. Older versions of GPT4All picked a poor default in this case. You will get much better results if you follow the steps to [find][finding] or [create][creating] a chat template for your model.


## Error: The chat template cannot be blank.
If the button above the chat template on the Model Settings page says "Clear", see [above][above-3]. If you see "Reset", click that button to restore a reasonable default. Also see the section on [syntax errors][chat-syntax-error].

[above-3]: #error-no-chat-template-configured
[chat-syntax-error]: #the-chat-template-has-a-syntax-error


## How do I find a chat template?
When in doubt, you can always ask the [Discord] community for help. Below are the instructions to find one on your own.

The authoritative source for a model's chat template is the HuggingFace repo that the original (non-GGUF) model came from. First, you should find this page. If you just have a model file, you can try a Google search for the model's name. If you know the page you downloaded the GGUF model from, its README usually links to the original non-GGUF model.

Once you have located the original model, there are two methods you can use to extract its chat template. Pick whichever one you are most comfortable with.

### Using the CLI (all models)
1. Install `jq` using your preferred package manager - e.g. Chocolatey (Windows), Homebrew (macOS), or apt (Ubuntu).
2. Download `tokenizer_config.json` from the model's "Files and versions" tab.
3. Open a command prompt in the directory to which you downloaded the file.
4. Run `jq -r ".chat_template" tokenizer_config.json`. This shows the chat template in a human-readable form. You can copy this and paste it into the settings page.
5. (Optional) You can save the output to a text file like this: `jq -r ".chat_template" tokenizer_config.json >chat_template.txt`

If the output is "null", the model does not provide a chat template. See the [below instructions][creating] on creating a chat template.

### Python (open models)
1. Install `transformers` using your preferred Python package manager, e.g. `pip install transformers`. Make sure it is at least version 4.43.0.
2. Copy the ID of the HuggingFace model, using the clipboard icon next to the name. For example, if the URL is `https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B`, the ID is `NousResearch/Hermes-2-Pro-Llama-3-8B`.
3. Open a Python interpreter (`python`) and run the following commands. Change the model ID in the example to the one you copied.
    ```
    >>> from transformers import AutoTokenizer
    >>> tokenizer = AutoTokenizer.from_pretrained('NousResearch/Hermes-2-Pro-Llama-3-8B')
    >>> print(tokenizer.get_chat_template())
    ```
    You can copy the output and paste it into the settings page.
4. (Optional) You can save the output to a text file like this:
    ```
    >>> open('chat_template.txt', 'w').write(tokenizer.get_chat_template())
    ```

If you get a ValueError exception, this model does not provide a chat template. See the [below instructions][creating] on creating a chat template.


### Python (gated models)
Some models, such as Llama and Mistral, do not allow public access to their chat template. You must either use the CLI method above, or follow these steps to use Python:

1. For these steps, you must have git and git-lfs installed.
2. You must have a HuggingFace account and be logged in.
3. You must already have access to the gated model. Otherwise, request access.
4. You must have an SSH key configured for git access to HuggingFace.
5. `git clone` the model's HuggingFace repo using the SSH clone URL. There is no need to download the entire model, which is very large. A good way to do this on Linux is:
    ```console
    $ GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:meta-llama/Llama-3.1-8B-Instruct.git
    $ cd Llama-3.1-8B-Instruct
    $ git lfs pull -I "tokenizer.*"
    ```
6. Follow the above instructions for open models, but replace the model ID with the path to the directory containing `tokenizer_config.json`:
    ```
    >>> tokenizer = AutoTokenizer.from_pretrained('.')
    ```


## Advanced: How do chat templates work?
The chat template is applied to the entire conversation you see in the chat window. The template loops over the list of messages, each containing `role` and `content` fields. `role` is either `user`, `assistant`, or `system`.

GPT4All also supports the special variables `bos_token`, `eos_token`, and `add_generation_prompt`. See the [HuggingFace docs] for what those do.

[HuggingFace docs]: https://huggingface.co/docs/transformers/v4.46.3/en/chat_templating#special-variables
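
If you want to see exactly what a template produces, you can render it outside GPT4All with the `jinja2` package. This is only an approximation for experimentation, since GPT4All uses its own Jinja-compatible renderer, and the trivial template shown here is illustrative:

```python
from jinja2 import Template  # pip install jinja2

# A deliberately simple template over the same fields GPT4All provides.
template = Template(
    "{%- for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{%- endfor %}"
    "{%- if add_generation_prompt %}assistant: {% endif %}"
)
print(template.render(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    add_generation_prompt=True,
))
```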
|

## Advanced: How do I make a chat template?
The best way to create a chat template is to start by using an existing one as a reference. Then, modify it to use the format documented for the given model. The model's README page may explicitly give an example of its template. Or, it may mention the name of a well-known standard template, such as ChatML, Alpaca, or Vicuna. GPT4All does not yet include presets for these templates, so they will have to be found in other models or taken from the community; a minimal ChatML-style sketch is shown below.
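
For example, a minimal template for the widely used ChatML format could look like the following sketch; check your model's documentation for the exact control tokens it expects:

```jinja
{%- for message in messages %}
    {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
{%- endif %}
```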
|
For more information, see the very helpful [HuggingFace guide]. Some of it is not applicable, such as the information about tool calling and RAG, because GPT4All implements those features differently.

Some models use a prompt template that does not intuitively map to a multi-turn chat, because it is intended more for single instructions. The [FastChat] implementation of these templates is a useful reference for the correct way to extend them to multiple messages.

[HuggingFace guide]: https://huggingface.co/docs/transformers/v4.46.3/en/chat_templating#advanced-template-writing-tips
[FastChat]: https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py


## Advanced: What are GPT4All v1 templates?
GPT4All supports its own template syntax, which is nonstandard but provides complete control over the way LocalDocs sources and file attachments are inserted into the conversation. These templates begin with `{# gpt4all v1 #}` and look similar to the example below.

For standard templates, GPT4All combines the user message, sources, and attachments into the `content` field. For GPT4All v1 templates, this is not done, so these fields must be used directly in the template for those features to work correctly.

```jinja
{# gpt4all v1 #}
{%- for message in messages %}
    {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' }}
    {%- if message['role'] == 'user' %}
        {%- for source in message['sources'] %}
            {%- if loop.first %}
                {{- '### Context:\n' }}
            {%- endif %}
            {{- 'Collection: ' + source['collection'] + '\n' +
                'Path: ' + source['path'] + '\n' +
                'Excerpt: ' + source['text'] + '\n\n' }}
        {%- endfor %}
    {%- endif %}
    {%- for attachment in message['prompt_attachments'] %}
        {{- attachment['processed_content'] + '\n\n' }}
    {%- endfor %}
    {{- message['content'] | trim }}
    {{- '<|eot_id|>' }}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{%- endif %}
```

docs/gpt4all_desktop/chats.md (new file, 124 lines)
@@ -0,0 +1,124 @@
# Chats

Chats are conversations with [language models](models.md) that run locally on your device.

!!! note "Example Chats"

    === "Llama 3"
        Download `Llama 3` and prompt:
        ```
        explain why the sky is blue in a way that is correct and makes sense to a child
        ```

        With the default sampling settings, you should see text resembling the following:

        ```
        What a great question!
        So, you know how we can see different colors like red, yellow, green, and orange? Well, when sunlight enters Earth's atmosphere, it starts to interact with tiny particles called molecules of gases like nitrogen (N2) and oxygen (O2). These molecules are really small, but they're all around us!
        Now, here's the cool part: these molecules scatter, or bounce off, shorter wavelengths of light more than longer wavelengths. Think of it like a game of pool where the cue ball hits other balls with different sizes and speeds.
        The shortest wavelength of visible light is violet (about 400 nanometers), followed by blue (around 450-495 nanometers). These shorter wavelengths are scattered in all directions, kind of like how a basketball player might bounce the ball around on the court. This scattering effect makes the sky appear blue to our eyes!
        Here's why it doesn't look as bright or intense as a blue crayon: there are many more molecules in the atmosphere that scatter longer wavelengths (like red and orange) than shorter ones, so those colors get spread out and become less noticeable.
        To sum it up:
        1. Sunlight enters Earth's atmosphere.
        2. Tiny gas molecule particles interact with light, scattering shorter wavelengths like blue and violet.
        3. This scattered blue light reaches our eyes from all directions, making the sky appear blue!
        Now, go outside and enjoy that beautiful blue sky!
        ```

    === "Nous Hermes 2 Mistral DPO"
        Download `Nous Hermes 2 Mistral DPO` and prompt:
        ```
        write me a react app i can run from the command line to play a quick game
        ```

        With the default sampling settings, you should see text and code blocks resembling the following:

        ```
        I'll provide you with a simple React app that allows you to play a guessing game. This app will generate a random number between 1 and 10, and ask you to guess it. You can run this app locally using Create React App. Follow the steps below:
        Open your terminal or command line interface.
        Navigate to the directory where you want to create the project (e.g., cd Documents/Projects).
        Run the following commands:
        ```

        ```bash
        npx create-react-app guessing-game
        cd guessing-game
        ```

        ```
        Open the project in your preferred code editor by running code . (for VS Code) or any other command to open the folder with your text editor.
        Replace the content of the src/App.js file with the following code:
        ```

        ```javascript
        import React, { useState } from 'react';

        function App() {
          const [guess, setGuess] = useState('');
          const [message, setMessage] = useState('');
          const [numberToGuess, setNumberToGuess] = useState(Math.floor(Math.random() * 10) + 1);

          function handleChange(e) {
            setGuess(e.target.value);
          }

          function handleSubmit(e) {
            e.preventDefault();
            if (parseInt(guess) === numberToGuess) {
              setMessage('You guessed it right!');
            } else {
              setMessage(`Wrong guess, the correct number was ${numberToGuess}. Try again next time.`);
            }
          }

          return (
            <div>
              <h1>Guess a Number</h1>
              <p>{message}</p>
              <form onSubmit={handleSubmit}>
                <input type="number" value={guess} onChange={handleChange} />
                <button type="submit">Submit</button>
              </form>
            </div>
          );
        }

        export default App;
        ```

        ```
        Save the file and go back to your terminal or command line interface. Run npm start. This will start a local development server, and you can access the app in your browser at http://localhost:3000.
        Play the guessing game by entering a number between 1 and 10 into the input field and clicking "Submit". The app will tell you if your guess is correct or not.
        Remember that this is just a simple example, and you can expand upon it to make the game more interesting with additional features like high scores, multiple difficulty levels, etc.
        ```

## New Chat

Choose a model with the dropdown at the top of the Chats page.

If you don't have any models, [download one](models.md#download-models). Once you have models, you can start chats by loading your default model, which you can configure in [settings](settings.md#application-settings).

![Choose a model](../assets/three_model_options.png)

## LocalDocs

Open the [LocalDocs](localdocs.md) panel with the button in the top-right corner to bring your files into the chat. With LocalDocs, your chats are enhanced with semantically related snippets from your files included in the model's context.

![Open LocalDocs](../assets/open_local_docs.png)

## Chat History

View your chat history with the button in the top-left corner of the Chats page.

<table>
  <tr>
    <td>
      <img src="../assets/closed_chat_panel.png" alt="Close chats" style="width:100%">
    </td>
    <td>
      <img src="../assets/open_chat_panel.png" alt="Open chats" style="width:100%">
    </td>
  </tr>
</table>

You can change a chat name or delete it from your chat history at any time.

@@ -0,0 +1,106 @@
# Using GPT4All to Privately Chat with your Obsidian Vault

Obsidian for Desktop is powerful note-taking and knowledge-management software designed to create and organize markdown notes. This tutorial shows you how to access your Obsidian note files directly on your computer and, by connecting them to LocalDocs, integrate these files into your LLM chats for private access and enhanced context.

## Download Obsidian for Desktop

!!! note "Download Obsidian for Desktop"

    1. **Download Obsidian for Desktop**:
        - Visit the [Obsidian website](https://obsidian.md) and create an account.
        - Click the Download button in the center of the homepage.
        - For more help with installing Obsidian, see [Getting Started with Obsidian](https://help.obsidian.md/Getting+started/Download+and+install+Obsidian).

    2. **Set Up Obsidian**:
        - Launch Obsidian from your Applications folder (macOS), Start menu (Windows), or equivalent location (Linux).
        - On the welcome screen, you can either create a new vault (a collection of notes) or open an existing one.
        - To create a new vault, click Create a new vault, name your vault, choose a location on your computer, and click Create.

    3. **Add and Organize Notes**:
        - Once installed, you can start adding and organizing notes.
        - Choose the folders you want to sync to your computer.

## Connect Obsidian to LocalDocs

!!! note "Connect Obsidian to LocalDocs"

    1. **Open LocalDocs**:
        - Navigate to the LocalDocs feature within GPT4All.

    <table>
      <tr>
        <td>
          <!-- Screenshot of LocalDocs interface -->
          <img width="1348" alt="LocalDocs interface" src="https://github.com/nomic-ai/gpt4all/assets/132290469/d8fb2d79-2063-45d4-bcce-7299fb75b144">
        </td>
      </tr>
    </table>

    2. **Add Collection**:
        - Click on **+ Add Collection** to begin linking your Obsidian vault.

    <table>
      <tr>
        <td>
          <!-- Screenshot of adding collection in LocalDocs -->
          <img width="1348" alt="Screenshot of adding collection" src="https://raw.githubusercontent.com/nomic-ai/gpt4all/main/docs/assets/obsidian_adding_collection.png">
        </td>
      </tr>
    </table>

        - Name your collection.

    3. **Create Collection**:
        - Click **Create Collection** to initiate the embedding process. Progress will be displayed within the LocalDocs interface.

    4. **Access Files in Chats**:
        - Load a model to chat with your files (Llama 3 Instruct is the fastest).
        - In your chat, open 'LocalDocs' with the button in the top-right corner to provide context from your synced Obsidian notes.

    <table>
      <tr>
        <td>
          <!-- Screenshot of accessing LocalDocs in chats -->
          <img width="1447" alt="Accessing LocalDocs in chats" src="https://raw.githubusercontent.com/nomic-ai/gpt4all/main/docs/assets/obsidian_docs.png">
        </td>
      </tr>
    </table>

    5. **Interact With Your Notes**:
        - Use the model to interact with your files.

    <table>
      <tr>
        <td>
          <!-- Screenshot of interacting with sources -->
          <img width="662" alt="Obsidian user interaction" src="https://raw.githubusercontent.com/nomic-ai/gpt4all/main/docs/assets/osbsidian_user_interaction.png">
        </td>
      </tr>
    </table>
    <table>
      <tr>
        <td>
          <!-- Screenshot of viewing sources -->
          <img width="662" alt="Obsidian GPT4All response" src="https://raw.githubusercontent.com/nomic-ai/gpt4all/main/docs/assets/obsidian_response.png">
        </td>
      </tr>
    </table>

    6. **View Referenced Files**:
        - Click on **Sources** below LLM responses to see which Obsidian notes were referenced.

    <table>
      <tr>
        <td>
          <!-- Referenced Files -->
          <img width="643" alt="Referenced Files" src="https://raw.githubusercontent.com/nomic-ai/gpt4all/main/docs/assets/obsidian_sources.png">
        </td>
      </tr>
    </table>

## How It Works

Obsidian for Desktop syncs your Obsidian notes to your computer, while LocalDocs integrates these files into your LLM chats using embedding models. These models find semantically similar snippets from your files to enhance the context of your interactions.

@@ -0,0 +1,112 @@
# Using GPT4All to Privately Chat with your OneDrive Data

Local and Private AI Chat with your OneDrive Data

OneDrive for Desktop allows you to sync and access your OneDrive files directly on your computer. By connecting your synced directory to LocalDocs, you can start using GPT4All to privately chat with data stored in your OneDrive.

## Download OneDrive for Desktop

!!! note "Download OneDrive for Desktop"

    1. **Download OneDrive for Desktop**:
        - Visit [Microsoft OneDrive](https://www.microsoft.com/en-us/microsoft-365/onedrive/download).
        - Press 'Download' for your respective device type.
        - Download the OneDrive for Desktop application.

    2. **Install OneDrive for Desktop**:
        - Run the installer file you downloaded.
        - Follow the prompts to complete the installation process.

    3. **Sign in and Sync**:
        - Once installed, sign in to OneDrive for Desktop with your Microsoft account credentials.
        - Choose the folders you want to sync to your computer.

## Connect OneDrive to LocalDocs

!!! note "Connect OneDrive to LocalDocs"

    1. **Install GPT4All and Open LocalDocs**:
        - Go to [nomic.ai/gpt4all](https://nomic.ai/gpt4all) to install GPT4All for your operating system.
        - Navigate to the LocalDocs feature within GPT4All to configure it to use your synced OneDrive directory.

    <table>
      <tr>
        <td>
          <!-- Screenshot of LocalDocs interface -->
          <img width="1348" alt="LocalDocs interface" src="https://github.com/nomic-ai/gpt4all/assets/132290469/54254bc0-d9a0-40c4-9fd1-5059abaad583">
        </td>
      </tr>
    </table>

    2. **Add Collection**:
        - Click on **+ Add Collection** to begin linking your OneDrive folders.

    <table>
      <tr>
        <td>
          <!-- Screenshot of adding collection in LocalDocs -->
          <img width="1348" alt="Adding a collection in LocalDocs" src="https://github.com/nomic-ai/gpt4all/assets/132290469/7f12969a-753a-4757-bb9e-9b607cf315ca">
        </td>
      </tr>
    </table>

        - Name the collection and specify the OneDrive folder path.

    3. **Create Collection**:
        - Click **Create Collection** to initiate the embedding process. Progress will be displayed within the LocalDocs interface.

    4. **Access Files in Chats**:
        - Load a model within GPT4All to chat with your files.
        - In your chat, open 'LocalDocs' using the button in the top-right corner to provide context from your synced OneDrive files.

    <table>
      <tr>
        <td>
          <!-- Screenshot of accessing LocalDocs in chats -->
          <img width="1447" alt="Accessing LocalDocs in chats" src="https://github.com/nomic-ai/gpt4all/assets/132290469/b5a67fe6-0d6a-42ae-b3b8-cc0f91cbf5b1">
        </td>
      </tr>
    </table>

    5. **Interact With Your OneDrive**:
        - Use the model to interact with your files directly from OneDrive.

    <table>
      <tr>
        <td>
          <!-- Screenshot of interacting with sources -->
          <img width="662" alt="Interacting with OneDrive files" src="https://github.com/nomic-ai/gpt4all/assets/132290469/2c9815b8-3d1c-4179-bf76-3ddbafb193bf">
        </td>
      </tr>
    </table>
    <table>
      <tr>
        <td>
          <img width="662" alt="OneDrive chat response" src="https://github.com/nomic-ai/gpt4all/assets/132290469/ce8be292-b025-415a-bd54-f11868e0cd0a">
        </td>
      </tr>
    </table>

    6. **View Referenced Files**:
        - Click on **Sources** below responses to see which OneDrive files were referenced.

    <table>
      <tr>
        <td>
          <img width="643" alt="Referenced OneDrive files" src="https://github.com/nomic-ai/gpt4all/assets/132290469/6fe3f10d-2791-4153-88a7-2198ab3ac945">
        </td>
      </tr>
    </table>

## How It Works

OneDrive for Desktop syncs your OneDrive files to your computer, while LocalDocs maintains a database of these synced files for use by your local GPT4All model. As your OneDrive updates, LocalDocs will automatically detect file changes and stay up to date. LocalDocs leverages [Nomic Embedding](https://docs.nomic.ai/atlas/capabilities/embeddings) models to find semantically similar snippets from your files, enhancing the context of your interactions.

@@ -0,0 +1,113 @@
# Using GPT4All to Privately Chat with your Google Drive Data
Local and Private AI Chat with your Google Drive Data

Google Drive for Desktop allows you to sync and access your Google Drive files directly on your computer. By connecting your synced directory to LocalDocs, you can start using GPT4All to privately chat with data stored in your Google Drive.

## Download Google Drive for Desktop

!!! note "Download Google Drive for Desktop"

    1. **Download Google Drive for Desktop**:
        - Visit [drive.google.com](https://drive.google.com) and sign in with your Google account.
        - Navigate to the **Settings** (gear icon) and select **Settings** from the dropdown menu.
        - Scroll down to **Google Drive for desktop** and click **Download**.

    2. **Install Google Drive for Desktop**:
        - Run the installer file you downloaded.
        - Follow the prompts to complete the installation process.

    3. **Sign in and Sync**:
        - Once installed, sign in to Google Drive for Desktop with your Google account credentials.
        - Choose the folders you want to sync to your computer.

For advanced help, see [Setting up Google Drive for Desktop](https://support.google.com/drive/answer/10838124?hl=en).

## Connect Google Drive to LocalDocs

!!! note "Connect Google Drive to LocalDocs"

    1. **Install GPT4All and Open LocalDocs**:
        - Go to [nomic.ai/gpt4all](https://nomic.ai/gpt4all) to install GPT4All for your operating system.
        - Navigate to the LocalDocs feature within GPT4All to configure it to use your synced directory.

    <table>
      <tr>
        <td>
          <!-- Screenshot of LocalDocs interface -->
          <img width="1348" alt="LocalDocs interface" src="https://github.com/nomic-ai/gpt4all/assets/132290469/d8fb2d79-2063-45d4-bcce-7299fb75b144">
        </td>
      </tr>
    </table>

    2. **Add Collection**:
        - Click on **+ Add Collection** to begin linking your Google Drive folders.

    <table>
      <tr>
        <td>
          <!-- Screenshot of adding collection in LocalDocs -->
          <img width="1348" alt="Adding a collection in LocalDocs" src="https://github.com/nomic-ai/gpt4all/assets/132290469/39063615-9eb6-4c47-bde7-c9f04f9b168b">
        </td>
      </tr>
    </table>

        - Name your collection.

    3. **Create Collection**:
        - Click **Create Collection** to initiate the embedding process. Progress will be displayed within the LocalDocs interface.

    4. **Access Files in Chats**:
        - Load a model to chat with your files (Llama 3 Instruct performs best).
        - In your chat, open 'LocalDocs' with the button in the top-right corner to provide context from your synced Google Drive files.

    <table>
      <tr>
        <td>
          <!-- Screenshot of accessing LocalDocs in chats -->
          <img width="1447" alt="Accessing LocalDocs in chats" src="https://github.com/nomic-ai/gpt4all/assets/132290469/ce68811f-9abd-451b-ac0a-fb941e185d7a">
        </td>
      </tr>
    </table>

    5. **Interact With Your Drive**:
        - Use the model to interact with your files.

    <table>
      <tr>
        <td>
          <!-- Screenshot of interacting with sources -->
          <img width="662" alt="Interacting with Google Drive files" src="https://github.com/nomic-ai/gpt4all/assets/132290469/bc55bc36-e613-419d-a568-adb1cd993854">
        </td>
      </tr>
    </table>
    <table>
      <tr>
        <td>
          <img width="662" alt="Google Drive chat response" src="https://github.com/nomic-ai/gpt4all/assets/132290469/1c0fd19a-5a22-4726-a841-d26c1bea81fc">
        </td>
      </tr>
    </table>

    6. **View Referenced Files**:
        - Click on **Sources** below LLM responses to see which Google Drive files were referenced.

    <table>
      <tr>
        <td>
          <img width="643" alt="Referenced Google Drive files" src="https://github.com/nomic-ai/gpt4all/assets/132290469/78527d30-8d24-4b4c-8311-b611a2d66fcd">
        </td>
      </tr>
    </table>

## How It Works

Google Drive for Desktop syncs your Google Drive files to your computer, while LocalDocs maintains a database of these synced files for use by your local LLM. As your Google Drive updates, LocalDocs will automatically detect file changes and stay up to date. LocalDocs is powered by [Nomic Embedding](https://docs.nomic.ai/atlas/capabilities/embeddings) models, which find semantically similar snippets from your files to enhance the context of your interactions.

@@ -0,0 +1,85 @@
# Using GPT4All to Privately Chat with your Microsoft Excel Spreadsheets
Local and Private AI Chat with your Microsoft Excel Spreadsheets

Microsoft Excel allows you to create, manage, and analyze data in spreadsheet format. By attaching your spreadsheets directly to GPT4All, you can privately chat with the AI to query and explore the data, enabling you to summarize, generate reports, and glean insights from your files, all within your conversation.

<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
  <iframe src="../../assets/gpt4all_xlsx_attachment.mp4" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" allowfullscreen title="Spreadsheet attachment demo video"></iframe>
</div>

## Attach Microsoft Excel to your GPT4All Conversation

!!! note "Attach Microsoft Excel to your GPT4All Conversation"

    1. **Install GPT4All and Open Chats**:
        - Go to [nomic.ai/gpt4all](https://nomic.ai/gpt4all) to install GPT4All for your operating system.
        - Navigate to the Chats view within GPT4All.

    <table>
      <tr>
        <td>
          <!-- Screenshot of Chat view -->
          <img width="1348" alt="Chat view" src="../../assets/chat_window.png">
        </td>
      </tr>
    </table>

    2. **Example Spreadsheet**:

    <table>
      <tr>
        <td>
          <!-- Screenshot of Spreadsheet view -->
          <img width="1348" alt="Spreadsheet view" src="../../assets/disney_spreadsheet.png">
        </td>
      </tr>
    </table>

    3. **Attach to GPT4All Conversation**:

    <table>
      <tr>
        <td>
          <!-- Screenshot of Attach view -->
          <img width="1348" alt="Attach view" src="../../assets/attach_spreadsheet.png">
        </td>
      </tr>
    </table>

    4. **Have GPT4All Summarize and Generate a Report**:

    <table>
      <tr>
        <td>
          <!-- Screenshot of chat with report -->
          <img width="1348" alt="Spreadsheet chat" src="../../assets/spreadsheet_chat.png">
        </td>
      </tr>
    </table>

## How It Works

GPT4All parses your attached Excel spreadsheet into Markdown, a format understandable to LLMs, and adds the Markdown text to the context for your LLM chat. You can view the code that converts `.xlsx` to Markdown [here](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/src/xlsxtomd.cpp) in the GPT4All GitHub repo.

For example, the above spreadsheet titled `disney_income_stmt.xlsx` would be formatted the following way:

```markdown
## disney_income_stmt

|Walt Disney Co.|||||||
|---|---|---|---|---|---|---|
|Consolidated Income Statement|||||||
||||||||
|US$ in millions|||||||
|12 months ended:|2023-09-30 00:00:00|2022-10-01 00:00:00|2021-10-02 00:00:00|2020-10-03 00:00:00|2019-09-28 00:00:00|2018-09-29 00:00:00|
|Services|79562|74200|61768|59265|60542|50869|
...
...
...
```

## Limitations

It is important to double-check the claims LLMs make about the spreadsheets you provide. LLMs can make mistakes about the data they are presented with, particularly for LLMs with smaller parameter counts (~8B) that fit within the memory of consumer hardware.

docs/gpt4all_desktop/localdocs.md (new file, 46 lines)
@@ -0,0 +1,46 @@
# LocalDocs

LocalDocs brings the information from the files on your device into your LLM chats - **privately**.

## Create LocalDocs

!!! note "Create LocalDocs"

    1. Click `+ Add Collection`.

    2. Name your collection and link it to a folder.

    <table>
      <tr>
        <td>
          <img src="../assets/new_docs_annotated.png" alt="new GOT Docs" style="width:100%">
        </td>
        <td>
          <img src="../assets/new_docs_annotated_filled.png" alt="new GOT Docs filled out" style="width:100%">
        </td>
      </tr>
    </table>

    3. Click `Create Collection`. Progress for the collection is displayed on the LocalDocs page.

    ![Embedding in progress](../assets/got_docs_ready.png)

    You will see a green `Ready` indicator when the entire collection is ready.

    Note: you can still chat with the files that are ready before the entire collection is ready.

    ![Docs ready](../assets/got_done.png)

    Later on, if you modify your LocalDocs settings, you can rebuild your collections with your new settings.

    4. In your chats, open `LocalDocs` with the button in the top-right corner to give your LLM context from those files.

    ![LocalDocs result](../assets/syrio_snippets.png)

    5. See which files were referenced by clicking `Sources` below the LLM responses.

    ![Sources](../assets/open_sources.png)

## How It Works

A LocalDocs collection uses Nomic AI's free and fast on-device embedding models to index your folder into text snippets that each get an **embedding vector**. These vectors allow us to find snippets from your files that are semantically similar to the questions and prompts you enter in your chats. We then include those semantically similar snippets in the prompt to the LLM.
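
To make that concrete, here is a toy sketch of the retrieval idea in Python. The bag-of-words `embed` function is only a stand-in for a real embedding model (GPT4All uses Nomic's on-device embedding models), and the snippets are illustrative:

```python
import re

import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for an embedding model: a word-count vector over a tiny
    # vocabulary. A real model maps text to a dense semantic vector.
    vocab = ["snow", "wall", "ice", "watch", "night"]
    words = re.findall(r"[a-z]+", text.lower())
    return np.array([float(words.count(w)) for w in vocab]) + 1e-6

snippets = [
    "Jon Snow is a member of the Night's Watch.",
    "The Wall is made of ice.",
]

def top_snippets(prompt: str, k: int = 1) -> list[str]:
    # Rank snippets by cosine similarity between their embedding vectors
    # and the prompt's embedding vector.
    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    q = embed(prompt)
    return sorted(snippets, key=lambda s: cos(q, embed(s)), reverse=True)[:k]

print(top_snippets("What is the Wall made of?"))  # ['The Wall is made of ice.']
```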

docs/gpt4all_desktop/models.md (new file, 79 lines)
@@ -0,0 +1,79 @@
# Models

GPT4All is optimized to run LLMs in the 3-13B parameter range on consumer-grade hardware.

LLMs are downloaded to your device so you can run them locally and privately. With our backend, anyone can interact with LLMs efficiently and securely on their own hardware.

## Download Models

!!! note "Download Models"

    <div style="text-align: center; margin-top: 20px;">
      <table style="margin-left: auto; margin-right: auto;">
        <tr>
          <td style="text-align: right; padding-right: 10px;">1.</td>
          <td style="text-align: left;">Click `Models` in the menu on the left (below `Chats` and above `LocalDocs`)</td>
          <td><img src="../assets/models_page_icon.png" alt="Models Page Icon" style="width: 80px; height: auto;"></td>
        </tr>
        <tr>
          <td style="text-align: right; padding-right: 10px;">2.</td>
          <td style="text-align: left;">Click `+ Add Model` to navigate to the `Explore Models` page</td>
          <td><img src="../assets/add.png" alt="Add Model button" style="width: 100px; height: auto;"></td>
        </tr>
        <tr>
          <td style="text-align: right; padding-right: 10px;">3.</td>
          <td style="text-align: left;">Search for models available online</td>
          <td><img src="../assets/explore.png" alt="Explore Models search" style="width: 120px; height: auto;"></td>
        </tr>
        <tr>
          <td style="text-align: right; padding-right: 10px;">4.</td>
          <td style="text-align: left;">Hit `Download` to save a model to your device</td>
          <td><img src="../assets/download.png" alt="Download Models button" style="width: 120px; height: auto;"></td>
        </tr>
        <tr>
          <td style="text-align: right; padding-right: 10px;">5.</td>
          <td style="text-align: left;">Once the model is downloaded, you will see it in `Models`.</td>
          <td><img src="../assets/installed_models.png" alt="Installed models list" style="width: 120px; height: auto;"></td>
        </tr>
      </table>
    </div>

## Explore Models

GPT4All connects you with LLMs from HuggingFace via a [`llama.cpp`](https://github.com/ggerganov/llama.cpp) backend so that they run efficiently on your hardware. Many of these models can be identified by the file type `.gguf`.

![Explore models](../assets/search_mistral.png)

## Example Models

Many LLMs are available in various sizes and quantizations, and under a variety of licenses.

- LLMs with more parameters tend to be better at coherently responding to instructions

- LLMs with a smaller quantization (e.g. 4-bit instead of 16-bit) are much faster and less memory intensive, but tend to have slightly worse performance

- Licenses vary in their terms for personal and commercial use

Here are a few examples:

| Model | Filesize | RAM Required | Parameters | Quantization | Developer | License | MD5 Sum (Unique Hash) |
|-------|----------|--------------|------------|--------------|-----------|---------|-----------------------|
| Llama 3 Instruct | 4.66 GB | 8 GB | 8 billion | q4_0 | Meta | [Llama 3 License](https://llama.meta.com/llama3/license/) | c87ad09e1e4c8f9c35a5fcef52b6f1c9 |
| Nous Hermes 2 Mistral DPO | 4.11 GB | 8 GB | 7 billion | q4_0 | Mistral & Nous Research | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) | a5f6b4eabd3992da4d7fb7f020f921eb |
| Phi-3 Mini Instruct | 2.18 GB | 4 GB | 4 billion | q4_0 | Microsoft | [MIT](https://opensource.org/license/mit) | f8347badde9bfc2efbe89124d78ddaf5 |
| Mini Orca (Small) | 1.98 GB | 4 GB | 3 billion | q4_0 | Microsoft | [CC-BY-NC-SA-4.0](https://spdx.org/licenses/CC-BY-NC-SA-4.0) | 0e769317b90ac30d6e09486d61fefa26 |
| GPT4All Snoozy | 7.37 GB | 16 GB | 13 billion | q4_0 | Nomic AI | [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html) | 40388eb2f8d16bb5d08c96fdfaac6b2c |

### Search Results

You can click the gear icon in the search bar to sort search results by their number of likes, number of downloads, or date of upload (all from HuggingFace).

![Sort search results](../assets/search_settings.png)

## Connect Model APIs

You can add your API key for remote model providers.

**Note**: this does not download a model file to your computer for secure, on-device use. Instead, when interacting with models this way, your prompts leave your computer, go to the API provider, and the responses are returned to your computer.

![Connect APIs](../assets/add_model_gpt4.png)

docs/gpt4all_desktop/quickstart.md (new file, 42 lines)
@@ -0,0 +1,42 @@
# GPT4All Desktop

The GPT4All Desktop Application allows you to download and run large language models (LLMs) locally & privately on your device.

With GPT4All, you can chat with models, turn your local files into information sources for models [(LocalDocs)](localdocs.md), or browse models available online to download onto your device.

[Official Video Tutorial](https://www.youtube.com/watch?v=gQcZDXRVJok)

## Quickstart

!!! note "Quickstart"

    1. Install GPT4All for your operating system and open the application.

    <div style="text-align: center; margin-top: 20px;">
      [Download for Windows](https://gpt4all.io/installers/gpt4all-installer-win64.exe)

      [Download for Mac](https://gpt4all.io/installers/gpt4all-installer-darwin.dmg)

      [Download for Linux](https://gpt4all.io/installers/gpt4all-installer-linux.run)
    </div>

    2. Hit `Start Chatting`. ![GPT4All home page](../assets/gpt4all_home.png)

    3. Click `+ Add Model`.

    4. Download a model. We recommend starting with Llama 3, but you can [browse more models](models.md). ![Download a model](../assets/download_llama.png)

    5. Once downloaded, go to Chats (below Home and above Models in the menu on the left).

    6. Click "Load Default Model" (will be Llama 3 or whichever model you downloaded).

    <table>
      <tr>
        <td>
          <img src="../assets/before_first_chat.png" alt="Before first chat" style="width:100%">
        </td>
        <td>
          <img src="../assets/new_first_chat.png" alt="New first chat" style="width:100%">
        </td>
      </tr>
    </table>

    7. Try the [example chats](chats.md) or your own prompts!
79
docs/gpt4all_desktop/settings.md
Normal file
@@ -0,0 +1,79 @@
|
||||
# Settings
|
||||
|
||||
## Application Settings
|
||||
|
||||
!!! note "General Application Settings"
|
||||
|
||||
| Setting | Description | Default Value |
|
||||
| --- | --- | --- |
|
||||
| **Theme** | Color theme for the application. Options are `Light`, `Dark`, and `LegacyDark` | `Light` |
|
||||
| **Font Size** | Font size setting for text throughout the application. Options are Small, Medium, and Large | Small |
|
||||
| **Language and Locale** | The language and locale of that language you wish to use | System Locale |
|
||||
| **Device** | Device that will run your models. Options are `Auto` (GPT4All chooses), `Metal` (Apple Silicon M1+), `CPU`, and `GPU` | `Auto` |
|
||||
| **Default Model** | Choose your preferred LLM to load by default on startup| Auto |
|
||||
| **Suggestion Mode** | Generate suggested follow up questions at the end of responses | When chatting with LocalDocs |
|
||||
| **Download Path** | Select a destination on your device to save downloaded models | Windows: `C:\Users\{username}\AppData\Local\nomic.ai\GPT4All`<br><br>Mac: `/Users/{username}/Library/Application Support/nomic.ai/GPT4All/`<br><br>Linux: `/home/{username}/.local/share/nomic.ai/GPT4All` |
|
||||
| **Enable Datalake** | Opt-in to sharing interactions with GPT4All community (**anonymous** and **optional**) | Off |
|
||||
|
||||
!!! note "Advanced Application Settings"
|
||||
|
||||
| Setting | Description | Default Value |
|
||||
| --- | --- | --- |
|
||||
| **CPU Threads** | Number of concurrently running CPU threads (more can speed up responses) | 4 |
|
||||
| **Enable System Tray** | The application will minimize to the system tray / taskbar when the window is closed | Off |
|
||||
| **Enable Local Server** | Allow any application on your device to use GPT4All via an OpenAI-compatible GPT4All API | Off |
|
||||
| **API Server Port** | Local HTTP port for the local API server | 4891 |

## Model Settings

!!! note "Model / Character Settings"

    | Setting | Description | Default Value |
    | --- | --- | --- |
    | **Name** | Unique name of this model / character | set by model uploader |
    | **Model File** | Filename (.gguf) of the model | set by model uploader |
    | **System Message** | General instructions for the chats this model will be used for | set by model uploader |
    | **Chat Template** | Format of user <-> assistant interactions for the chats this model will be used for | set by model uploader |
    | **Chat Name Prompt** | Prompt used to automatically generate chat names | Describe the above conversation in seven words or less. |
    | **Suggested FollowUp Prompt** | Prompt used to automatically generate follow-up questions after a chat response | Suggest three very short factual follow-up questions that have not been answered yet or cannot be found inspired by the previous conversation and excerpts. |
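
To make the **System Message** and **Chat Template** settings concrete, here is a toy sketch (not GPT4All's actual template engine, and the tag names are made up) of how a template turns structured chat messages into the single string the model actually sees:

```python
# Toy illustration only: real templates and their special tokens are set per model.
def apply_chat_template(system_message, messages):
    parts = [f"<|system|>\n{system_message}"]
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}")
    parts.append("<|assistant|>\n")  # cue the model to write its reply
    return "\n".join(parts)

print(apply_chat_template(
    "You are a helpful assistant.",
    [{"role": "user", "content": "What is a GGUF file?"}],
))
```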

### Clone

You can **clone** an existing model, which allows you to save a configuration of a model file with different prompt templates and sampling settings.

### Sampling Settings

!!! note "Model Sampling Settings"

    | Setting | Description | Default Value |
    | --- | --- | --- |
    | **Context Length** | Maximum length of input sequence in tokens | 2048 |
    | **Max Length** | Maximum length of response in tokens | 4096 |
    | **Prompt Batch Size** | Token batch size for parallel processing | 128 |
    | **Temperature** | Lower temperature gives more likely generations | 0.7 |
    | **Top P** | Prevents choosing highly unlikely tokens | 0.4 |
    | **Top K** | Size of selection pool for tokens | 40 |
    | **Min P** | Minimum probability of a token, relative to the most likely token, for it to be considered | 0 |
    | **Repeat Penalty Tokens** | Number of previous tokens the repeat penalty is applied to | 64 |
    | **Repeat Penalty** | Penalize repetitiveness | 1.18 |
    | **GPU Layers** | How many model layers to load into VRAM | 32 |
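
As a rough mental model of how Temperature, Top K, and Top P interact, the sketch below applies them in that order to a toy distribution; actual `llama.cpp` sampling has more stages and configurable ordering, so treat this as an approximation:

```python
import numpy as np

# Illustrative sketch of temperature -> top-k -> top-p sampling over toy logits.
rng = np.random.default_rng(0)

def sample(logits, temperature=0.7, top_k=40, top_p=0.4):
    scaled = logits / temperature            # lower temperature sharpens the distribution
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1][:top_k]  # keep only the top_k most likely tokens
    kept = probs[order]
    # smallest prefix of tokens whose cumulative probability reaches top_p
    cutoff = np.searchsorted(np.cumsum(kept), top_p) + 1
    kept = kept[:cutoff] / kept[:cutoff].sum()
    return int(rng.choice(order[:cutoff], p=kept))

print(sample(np.array([2.0, 1.5, 0.3, 0.1, -1.0])))
```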

## LocalDocs Settings

!!! note "General LocalDocs Settings"

    | Setting | Description | Default Value |
    | --- | --- | --- |
    | **Allowed File Extensions** | Choose which file types will be indexed into LocalDocs collections as text snippets with embedding vectors | `.txt`, `.pdf`, `.md`, `.rst` |
    | **Use Nomic Embed API** | Use the Nomic API to create LocalDocs collections fast and off-device; a [Nomic API Key](https://atlas.nomic.ai/) is required | `Off` |
    | **Embeddings Device** | Device that will run embedding models. Options are `Auto` (GPT4All chooses), `Metal` (Apple Silicon M1+), `CPU`, and `GPU` | `Auto` |
    | **Show Sources** | Titles of source files retrieved by LocalDocs will be displayed directly in your chats | `On` |

!!! note "Advanced LocalDocs Settings"

    Note that increasing these settings can increase the likelihood of factual responses, but may result in slower generation times. As a rough estimate, the defaults (3 snippets of 512 characters each) add about 1,500 characters, on the order of a few hundred tokens, to your prompt.

    | Setting | Description | Default Value |
    | --- | --- | --- |
    | **Document Snippet Size** | Number of string characters per document snippet | 512 |
    | **Maximum Document Snippets Per Prompt** | Upper limit for the number of snippets from your files LocalDocs can retrieve for LLM context | 3 |
27
docs/gpt4all_help/faq.md
Normal file
@@ -0,0 +1,27 @@

# Frequently Asked Questions

## Models

### Which language models are supported?

We support models with a `llama.cpp` implementation which have been uploaded to [HuggingFace](https://huggingface.co/).

## Software

### What software do I need?

All you need is to [install GPT4All](../index.md) onto your Windows, Mac, or Linux computer.

### Is there an API?

Yes, you can run your model in server mode with our [OpenAI-compatible API](https://platform.openai.com/docs/api-reference/completions), which you can configure in [settings](../gpt4all_desktop/settings.md#application-settings).
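
For example, once the server is enabled in settings, you can query it from any HTTP client. A small sketch, assuming the default port and an OpenAI-style `/v1/models` route:

```python
import requests

# Hypothetical check: list the models the local server reports (default port 4891).
models = requests.get("http://localhost:4891/v1/models").json()
for model in models.get("data", []):
    print(model["id"])
```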

## Hardware

### What hardware do I need?

GPT4All can run on CPU, Metal (Apple Silicon M1+), and GPU.

### What are the system requirements?

Your CPU needs to support [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) and you need enough RAM to load a model into memory.
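
If you are unsure whether your CPU supports AVX, you can check from Python. A minimal Linux-only sketch (on other platforms, tools such as the `py-cpuinfo` package can report the same flags):

```python
# Linux-only sketch: CPU feature flags are listed in /proc/cpuinfo.
flags = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())

print("AVX: ", "avx" in flags)
print("AVX2:", "avx2" in flags)
```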
27
docs/gpt4all_help/troubleshooting.md
Normal file
@@ -0,0 +1,27 @@

# Troubleshooting

## Error Loading Models

It is possible you are trying to load a model from HuggingFace whose weights are not compatible with our [backend](https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-backend).

Try downloading one of the officially supported models listed on the main models page in the application. If the problem persists, please share your experience on our [Discord](https://discord.com/channels/1076964370942267462).

## Bad Responses

Try the [example chats](../gpt4all_desktop/chats.md) to double check that your system is implementing models correctly.

### Responses Incoherent

If you are seeing something **not at all** resembling the [example chats](../gpt4all_desktop/chats.md) - for example, if the responses you are seeing look nonsensical - try [downloading a different model](../gpt4all_desktop/models.md), and please share your experience on our [Discord](https://discord.com/channels/1076964370942267462).

### Responses Incorrect

LLMs can be unreliable. It's helpful to know what their training data was - they are less likely to be correct when asked about data they were not trained on, unless you give them the necessary information in the prompt as **context**.

Giving LLMs additional context, like chatting using [LocalDocs](../gpt4all_desktop/localdocs.md), can help combine the language model's ability to understand text with the files that you trust to contain the information you need.

Including information in a prompt is not a guarantee that it will be used correctly, but the clearer and more concise your prompts, and the more relevant your prompts are to your files, the better.

### LocalDocs Issues

Occasionally a model - particularly a smaller or overall weaker LLM - may not use the relevant text snippets from the files that were referenced via LocalDocs. If you are seeing this, it can help to use phrases like "in the docs" or "from the provided files" when prompting your model.
14
docs/index.md
Normal file
@@ -0,0 +1,14 @@

# GPT4All Documentation

GPT4All runs large language models (LLMs) privately on everyday desktops & laptops.

No API calls or GPUs required - you can just download the application and [get started](gpt4all_desktop/quickstart.md#quickstart).

!!! note "Desktop Application"
    GPT4All runs LLMs as an application on your computer. Nomic's embedding models can bring information from your local documents and files into your chats. It's fast, on-device, and completely **private**.

<div style="text-align: center; margin-top: 20px;">
[Download for Windows](https://gpt4all.io/installers/gpt4all-installer-win64.exe)
[Download for Mac](https://gpt4all.io/installers/gpt4all-installer-darwin.dmg)
[Download for Linux](https://gpt4all.io/installers/gpt4all-installer-linux.run)
</div>
140
docs/old/gpt4all_chat.md
Normal file
@@ -0,0 +1,140 @@

# GPT4All Chat UI

The [GPT4All Chat Client](https://gpt4all.io) lets you easily interact with any local large language model.

It is optimized to run 7-13B parameter LLMs on the CPUs of any computer running macOS/Windows/Linux.

## Running LLMs on CPU
The GPT4All Chat UI supports models from all newer versions of `llama.cpp` with `GGUF` models, including the `Mistral`, `LLaMA2`, `LLaMA`, `OpenLLaMa`, `Falcon`, `MPT`, `Replit`, `Starcoder`, and `Bert` architectures.

GPT4All maintains an official list of recommended models located in [models3.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models3.json). You can open a pull request to add new models to it, and if accepted they will show up in the official download dialog.

#### Sideloading any GGUF model
If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:

1. Downloading your model in GGUF format. It should be a 3-8 GB file similar to the ones [here](https://huggingface.co/TheBloke/Orca-2-7B-GGUF/tree/main) (one way to fetch such a file is sketched after this list).
2. Identifying your GPT4All model downloads folder. This is the path listed at the bottom of the downloads dialog.
3. Placing your downloaded model inside GPT4All's model downloads folder.
4. Restarting your GPT4All app. Your model should appear in the model selection list.
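
For step 1, the `huggingface_hub` Python package can download a GGUF file directly; the repo and file names below are illustrative, so substitute the model you actually want:

```python
from huggingface_hub import hf_hub_download

# Illustrative example: download one GGUF quantization from a HuggingFace repo.
# Check the repo's file list for the exact filename you want.
path = hf_hub_download(
    repo_id="TheBloke/Orca-2-7B-GGUF",
    filename="orca-2-7b.Q4_K_M.gguf",  # assumed filename; verify it exists in the repo
    local_dir=".",                     # then move it into your GPT4All models folder
)
print("Downloaded to:", path)
```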

## Plugins
GPT4All Chat Plugins allow you to expand the capabilities of local LLMs.

### LocalDocs Plugin (Chat With Your Data)
LocalDocs is a GPT4All feature that allows you to chat with your local files and data.
It allows you to utilize powerful local LLMs to chat with private data without any data leaving your computer or server.
When using LocalDocs, your LLM will cite the sources that most likely contributed to a given output. Note that even an LLM equipped with LocalDocs can hallucinate. The LocalDocs plugin will utilize your documents to help answer prompts, and you will see references appear below the response.

<p align="center">
<img width="70%" src="https://github.com/nomic-ai/gpt4all/assets/10168/fe5dd3c0-b3cc-4701-98d3-0280dfbcf26f">
</p>

#### Enabling LocalDocs
1. Install the latest version of GPT4All Chat from the [GPT4All Website](https://gpt4all.io).
2. Go to `Settings > LocalDocs tab`.
3. Download the SBert model.
4. Configure a collection (folder) on your computer that contains the files your LLM should have access to. You can alter the contents of the folder/directory at any time. As you add more files to your collection, your LLM will dynamically be able to access them.
5. Spin up a chat session with any LLM (including external ones like ChatGPT, but be warned: your data will leave your machine!)
6. At the top right, click the database icon and select which collection you want your LLM to know about during your chat session.
7. You can begin searching with your LocalDocs even before the collection has finished indexing, but note that the search will not include those parts of the collection yet to be indexed.

#### LocalDocs Capabilities
LocalDocs allows your LLM to have context about the contents of your documentation collection.

LocalDocs **can**:

- Query your documents based upon your prompt / question. Your documents will be searched for snippets that can be used to provide context for an answer. The most relevant snippets will be inserted into your prompt's context, but it will be up to the underlying model to decide how best to use the provided context.

LocalDocs **cannot**:

- Answer general metadata queries (e.g. `What documents do you know about?`, `Tell me about my documents`)
- Summarize a single document (e.g. `Summarize my magna carta PDF.`)

See the Troubleshooting section for common issues.

#### How LocalDocs Works
LocalDocs works by maintaining an index of all data in the directory your collection is linked to. This index
consists of small chunks of each document that the LLM can receive as additional input when you ask it a question.
The general technique this plugin uses is called [Retrieval Augmented Generation](https://arxiv.org/abs/2005.11401).

These document chunks help your LLM respond to queries with knowledge about the contents of your data.
The number of chunks and the size of each chunk can be configured in the LocalDocs plugin settings tab.
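
In outline, retrieval works by embedding your question and every chunk into vectors and keeping the closest chunks. A hedged sketch of that core step, using toy vectors (real LocalDocs uses an embedding model and a persistent index):

```python
import numpy as np

# Toy sketch of retrieval: embed chunks and query, rank by cosine similarity.
# embed() is a stand-in for a real embedding model such as Nomic Embed.
def embed(text):
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=8)
    return v / np.linalg.norm(v)

chunks = ["Chunk about dragons.", "Chunk about budgets.", "Chunk about sailing."]
vectors = np.stack([embed(c) for c in chunks])

query = embed("How do I rig a sail?")
scores = vectors @ query                      # cosine similarity (unit vectors)
top = np.argsort(scores)[::-1][:2]            # keep the 2 best chunks
context = "\n".join(chunks[i] for i in top)   # prepended to the prompt as context
print(context)
```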

LocalDocs currently supports plain text files (`.txt`, `.md`, and `.rst`) and PDF files (`.pdf`).

#### Troubleshooting and FAQ
*My LocalDocs plugin isn't using my documents*

- Make sure LocalDocs is enabled for your chat session (the DB icon on the top-right should have a border)
- If your document collection is large, wait 1-2 minutes for it to finish indexing.

#### LocalDocs Roadmap
- A custom model fine-tuned with retrieval in the loop.
- Plugin compatibility with chat client server mode.

## Server Mode

GPT4All Chat comes with a built-in server mode allowing you to programmatically interact with any supported local LLM through a *very familiar* HTTP API. You can find the API documentation [here](https://platform.openai.com/docs/api-reference/completions).

Enabling server mode in the chat client will spin up an HTTP server running on `localhost` port `4891` (the reverse of 1984). You can enable the web server via `GPT4All Chat > Settings > Enable web server`.

Begin using local LLMs in your AI-powered apps by changing a single line of code: the base path for requests.

```python
import openai
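# Note: this example uses the legacy openai-python (< 1.0) module-level API;
# newer versions of the openai package use OpenAI(base_url=...) clients instead.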

openai.api_base = "http://localhost:4891/v1"
#openai.api_base = "https://api.openai.com/v1"

openai.api_key = "not needed for a local LLM"

# Set up the prompt and other parameters for the API request
prompt = "Who is Michael Jordan?"

# model = "gpt-3.5-turbo"
#model = "mpt-7b-chat"
model = "gpt4all-j-v1.3-groovy"

# Make the API request
response = openai.Completion.create(
    model=model,
    prompt=prompt,
    max_tokens=50,
    temperature=0.28,
    top_p=0.95,
    n=1,
    echo=True,
    stream=False
)

# Print the generated completion
print(response)
```

which gives the following response:

```json
{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "text": "Who is Michael Jordan?\nMichael Jordan is a former professional basketball player who played for the Chicago Bulls in the NBA. He was born on December 30, 1963, and retired from playing basketball in 1998."
        }
    ],
    "created": 1684260896,
    "id": "foobarbaz",
    "model": "gpt4all-j-v1.3-groovy",
    "object": "text_completion",
    "usage": {
        "completion_tokens": 35,
        "prompt_tokens": 39,
        "total_tokens": 74
    }
}
```