This commit introduces several improvements to the prompt formatting logic in `private_gpt/components/llm/prompt_helper.py`:
1. **Llama3PromptStyle**:
* Implemented tool handling capabilities, allowing for the formatting of tool call and tool result messages within the Llama 3 prompt structure.
* Ensured correct usage of BOS, EOT, and other Llama 3 specific tokens.
2. **MistralPromptStyle**:
* Refactored the `_messages_to_prompt` method for more robust handling of various conversational scenarios, including consecutive user messages and initial assistant messages.
* Ensured correct application of `<s>`, `</s>`, and `[INST]` tags.
3. **ChatMLPromptStyle**:
* Corrected the logic for handling system messages to prevent duplication and ensure accurate ChatML formatting (`<|im_start|>role\ncontent<|im_end|>`).
4. **TagPromptStyle**:
* Addressed a FIXME comment by incorporating `<s>` (BOS) and `</s>` (EOS) tokens, making it more suitable for Llama-based models like Vigogne.
* Fixed a minor bug related to enum string conversion.
5. **Unit Tests**:
* Added a new test suite in `tests/components/llm/test_prompt_helper.py`.
* These tests provide comprehensive coverage for all modified prompt styles, verifying correct prompt generation for various inputs, edge cases, and special token placements.
These changes improve the correctness, robustness, and feature set of the supported prompt styles, leading to better compatibility and interaction with the respective language models. A rough sketch of the resulting prompt shapes follows.
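The sketch below is illustrative only: it shows approximate rendered prompts for a system + user exchange following the public ChatML, Mistral, and Llama 3 conventions. The exact whitespace and token placement produced by `prompt_helper.py` may differ.

```python
# Illustrative only: approximate prompt shapes, not the exact prompt_helper.py output.
chatml_prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHello<|im_end|>\n"
    "<|im_start|>assistant\n"
)

mistral_prompt = "<s>[INST] You are a helpful assistant.\n\nHello [/INST]"

llama3_prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful assistant.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
```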
When running PrivateGPT with an external Ollama API, the Ollama service returns a 503 on startup because the Ollama service (behind traefik) might not be ready yet.
- Add a healthcheck to the ollama service that tests the connection to the external Ollama instance
- Make the private-gpt-ollama service depend on the ollama service being service_healthy
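As a rough illustration of what the new healthcheck verifies (the real check lives in the compose file; the URL and probe below are assumptions), a standalone readiness probe could look like this:

```python
import sys
import urllib.request

OLLAMA_BASE_URL = "http://ollama:11434"  # assumed service URL; adjust for your deployment

def ollama_is_ready(url: str = OLLAMA_BASE_URL, timeout: float = 2.0) -> bool:
    """Return True once the Ollama endpoint answers (its root path replies 'Ollama is running')."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

if __name__ == "__main__":
    # Exit code 0 marks the service healthy; 1 keeps dependents such as private-gpt-ollama waiting.
    sys.exit(0 if ollama_is_ready() else 1)
```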
Co-authored-by: Koh Meng Hui <kohmh@duck.com>
* Add default mode option to settings
* Revise default_mode to Literal (enum) and add to settings.yaml
* Revise to pass make check/test
* Default mode: RAG
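A minimal sketch of the idea, assuming pydantic-style settings; the field and mode names are illustrative rather than the exact PrivateGPT schema:

```python
from typing import Literal

from pydantic import BaseModel, Field

# Assumed mode names; the PrivateGPT UI may expose a different set.
Mode = Literal["RAG", "Search", "Basic", "Summarize"]

class UISettings(BaseModel):
    default_mode: Mode = Field(
        default="RAG",  # the new default described above
        description="Chat mode preselected in the UI on startup.",
    )
```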
---------
Co-authored-by: Jason <jason@sowinsight.solutions>
* feat: add retry connection to ollama
When Ollama is running inside docker-compose, traefik is sometimes not yet ready to route the request, and the first request fails.
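A rough sketch of the retry idea, assuming a plain HTTP probe with `requests` (not the exact PrivateGPT code):

```python
import time

import requests

def wait_for_ollama(base_url: str, attempts: int = 5, delay: float = 2.0) -> None:
    """Poll the Ollama endpoint until it answers, so a slow traefik route does not abort startup."""
    for _ in range(attempts):
        try:
            if requests.get(base_url, timeout=2).ok:
                return
        except requests.RequestException:
            pass
        time.sleep(delay)
    raise RuntimeError(f"Ollama at {base_url} not reachable after {attempts} attempts")
```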
* fix: mypy
* chore: block matplotlib to fix installation on Windows machines
* chore: remove workaround, just update poetry.lock
* fix: update matplotlib to the latest version
* feat: change ollama default model to llama3.1
* chore: bump versions
* feat: Change default model in local mode to llama3.1
* chore: make sure the latest Poetry version is used
* fix: mypy
* fix: do not add BOS (with the latest llama-cpp-python version)
* Update README.md
Remove the outdated contact form and point to the Zylon website for those looking for a ready-to-use enterprise solution built on top of PrivateGPT
* Update README.md
Update text to address the comments
* Update README.md
Improve text
* docs: add troubleshooting
* fix: pass the HF token to the setup script and avoid downloading the tokenizer when the token is empty
* fix: improve log and disable specific tokenizer by default
* chore: change HF_TOKEN environment to be aligned with default config
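An illustrative sketch of the behaviour, assuming the setup script downloads a gated tokenizer with `transformers` (the function name is hypothetical; the `token` keyword requires a recent `transformers` release):

```python
import os

from transformers import AutoTokenizer

def maybe_download_tokenizer(tokenizer_name: str) -> None:
    """Download the tokenizer only when both a name and an HF token are configured."""
    token = os.environ.get("HF_TOKEN", "").strip()
    if not tokenizer_name or not token:
        print("Tokenizer or HF_TOKEN not set; skipping tokenizer download.")
        return
    AutoTokenizer.from_pretrained(tokenizer_name, token=token)
```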
* fix: mypy
* Fix/update concepts.mdx reference to the installation page
The link to `/installation` is broken on the "Main Concepts" page.
The correct path would be `./installation` or perhaps `/installation/getting-started/installation`.
* fix: docs
---------
Co-authored-by: Javier Martinez <javiermartinezalvarez98@gmail.com>
* Support for Google Gemini LLMs and Embeddings
Initial support for Gemini enables the use of Google LLMs and embedding models (see `settings-gemini.yaml`).
Install via `poetry install --extras "llms-gemini embeddings-gemini"`
Notes:
* had to bump llama-index-core to a later version that supports Gemini
* `poetry --no-update` did not work: Gemini/llama_index seem to require further (transitive) dependency updates to make it work...
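A hedged sketch of how the Gemini components can be wired through llama-index; class and parameter names come from `llama-index-llms-gemini` / `llama-index-embeddings-gemini` and may differ between versions, so verify them against the pinned releases:

```python
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.llms.gemini import Gemini

# Placeholder key and model names; PrivateGPT reads the real values from settings-gemini.yaml.
llm = Gemini(model="models/gemini-pro", api_key="<GOOGLE_API_KEY>")
embed_model = GeminiEmbedding(model_name="models/embedding-001", api_key="<GOOGLE_API_KEY>")
```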
* fix: crash when gemini is not selected
* docs: add gemini llm
---------
Co-authored-by: Javier Martinez <javiermartinezalvarez98@gmail.com>