Update README.md

fix: 503 when private gpt gets ollama service (#2104)

When running private gpt with external ollama API, ollama service returns 503 on startup because ollama service (traefik) might not be ready.

- Add healthcheck to ollama service to test for connection to external ollama
- private-gpt-ollama service depends on ollama being service_healthy

Co-authored-by: Koh Meng Hui <kohmh@duck.com>
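
The healthcheck in this commit amounts to a bounded readiness poll: probe the endpoint, wait, and give up after a fixed number of failures. Below is a minimal Python sketch of that idea. The helper name `wait_until_healthy` and the injectable `probe` and `sleep` callables are hypothetical and not part of the repository; the real check is run by Docker Compose, which also applies `start_period` and `timeout`, neither of which is modelled here.

```python
import time
from collections.abc import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 3,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as probe() succeeds, False after `retries` failures."""
    for attempt in range(retries):
        if probe():
            return True
        # Wait before the next attempt, but not after the final failure.
        if attempt < retries - 1:
            sleep(interval)
    return False
```

With `depends_on: condition: service_healthy`, the dependent service is only started once the equivalent of this poll has succeeded, which is what avoids the 503 on startup.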
33 changed files with 3035 additions and 2464 deletions
--- a/.github/release_please/.release-please-config.json
+++ b/.github/release_please/.release-please-config.json
@ -0,0 +1,19 @@
+{
+    "$schema": "https://raw.githubusercontent.com/googleapis/release-please/main/schemas/config.json",
+    "release-type": "simple",
+    "version-file": "version.txt",
+    "extra-files": [
+      {
+        "type": "toml",
+        "path": "pyproject.toml",
+        "jsonpath": "$.tool.poetry.version"
+      },
+      {
+        "type": "generic",
+        "path": "docker-compose.yaml"
+      }
+    ],
+    "packages": {
+      ".": {}
+    }
+  }
--- a/.github/release_please/.release-please-manifest.json
+++ b/.github/release_please/.release-please-manifest.json
@ -0,0 +1,3 @@
+{
+  ".": "0.6.2"
+}
--- a/.github/workflows/generate-release.yml
+++ b/.github/workflows/generate-release.yml
@ -7,7 +7,7 @@ on:

 env:
  REGISTRY: docker.io
-  IMAGE_NAME: ${{ github.repository }}
+  IMAGE_NAME: zylonai/private-gpt
  platforms: linux/amd64,linux/arm64
  DEFAULT_TYPE: "ollama"

--- a/.github/workflows/release-please.yml
+++ b/.github/workflows/release-please.yml
@ -13,7 +13,8 @@ jobs:
  release-please:
    runs-on: ubuntu-latest
    steps:
-      - uses: google-github-actions/release-please-action@v3
+      - uses: google-github-actions/release-please-action@v4
+        id: release
        with:
-          release-type: simple
-          version-file: version.txt
+          config-file: .github/release_please/.release-please-config.json
+          manifest-file: .github/release_please/.release-please-manifest.json
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@ -14,7 +14,7 @@ jobs:
  setup:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
      - uses: ./.github/workflows/actions/install_dependencies

  checks:
@ -28,7 +28,7 @@ jobs:
          - ruff
          - mypy
    steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
      - uses: ./.github/workflows/actions/install_dependencies
      - name: run ${{ matrix.quality-command }}
        run: make ${{ matrix.quality-command }}
@ -38,7 +38,7 @@ jobs:
    runs-on: ubuntu-latest
    name: test
    steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
      - uses: ./.github/workflows/actions/install_dependencies
      - name: run test
        run: make test-coverage
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -1,5 +1,15 @@
 # Changelog

+## [0.6.2](https://github.com/zylon-ai/private-gpt/compare/v0.6.1...v0.6.2) (2024-08-08)
+
+
+### Bug Fixes
+
+* add numpy issue to troubleshooting ([#2048](https://github.com/zylon-ai/private-gpt/issues/2048)) ([4ca6d0c](https://github.com/zylon-ai/private-gpt/commit/4ca6d0cb556be7a598f7d3e3b00d2a29214ee1e8))
+* auto-update version ([#2052](https://github.com/zylon-ai/private-gpt/issues/2052)) ([7fefe40](https://github.com/zylon-ai/private-gpt/commit/7fefe408b4267684c6e3c1a43c5dc2b73ec61fe4))
+* publish image name ([#2043](https://github.com/zylon-ai/private-gpt/issues/2043)) ([b1acf9d](https://github.com/zylon-ai/private-gpt/commit/b1acf9dc2cbca2047cd0087f13254ff5cda6e570))
+* update matplotlib to 3.9.1-post1 to fix win install ([b16abbe](https://github.com/zylon-ai/private-gpt/commit/b16abbefe49527ac038d235659854b98345d5387))
+
 ## [0.6.1](https://github.com/zylon-ai/private-gpt/compare/v0.6.0...v0.6.1) (2024-08-05)


--- a/Dockerfile.llamacpp-cpu
+++ b/Dockerfile.llamacpp-cpu
@ -1,6 +1,6 @@
 ### IMPORTANT, THIS IMAGE CAN ONLY BE RUN IN LINUX DOCKER
 ### You will run into a segfault in mac
-FROM python:3.11.6-slim-bookworm as base
+FROM python:3.11.6-slim-bookworm AS base

 # Install poetry
 RUN pip install pipx
@ -20,14 +20,14 @@ RUN apt update && apt install -y \
 # https://python-poetry.org/docs/configuration/#virtualenvsin-project
 ENV POETRY_VIRTUALENVS_IN_PROJECT=true

-FROM base as dependencies
+FROM base AS dependencies
 WORKDIR /home/worker/app
 COPY pyproject.toml poetry.lock ./

 ARG POETRY_EXTRAS="ui embeddings-huggingface llms-llama-cpp vector-stores-qdrant"
 RUN poetry install --no-root --extras "${POETRY_EXTRAS}"

-FROM base as app
+FROM base AS app

 ENV PYTHONUNBUFFERED=1
 ENV PORT=8080
--- a/Dockerfile.ollama
+++ b/Dockerfile.ollama
@ -1,4 +1,4 @@
-FROM python:3.11.6-slim-bookworm as base
+FROM python:3.11.6-slim-bookworm AS base

 # Install poetry
 RUN pip install pipx
@ -10,14 +10,14 @@ ENV PATH=".venv/bin/:$PATH"
 # https://python-poetry.org/docs/configuration/#virtualenvsin-project
 ENV POETRY_VIRTUALENVS_IN_PROJECT=true

-FROM base as dependencies
+FROM base AS dependencies
 WORKDIR /home/worker/app
 COPY pyproject.toml poetry.lock ./

 ARG POETRY_EXTRAS="ui vector-stores-qdrant llms-ollama embeddings-ollama"
 RUN poetry install --no-root --extras "${POETRY_EXTRAS}"

-FROM base as app
+FROM base AS app
 ENV PYTHONUNBUFFERED=1
 ENV PORT=8080
 ENV APP_ENV=prod
--- a/README.md
+++ b/README.md
@ -1,4 +1,6 @@
-# 🔒 PrivateGPT 📑
+# PrivateGPT 
+
+<a href="https://trendshift.io/repositories/2601" target="_blank"><img src="https://trendshift.io/api/badge/repositories/2601" alt="imartinez%2FprivateGPT | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

 [![Tests](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml/badge.svg)](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml?query=branch%3Amain)
 [![Website](https://img.shields.io/website?up_message=check%20it&down_message=down&url=https%3A%2F%2Fdocs.privategpt.dev%2F&label=Documentation)](https://docs.privategpt.dev/)
--- a/docker-compose.yaml
+++ b/docker-compose.yaml
@ -7,12 +7,13 @@ services:
  # Private-GPT service for the Ollama CPU and GPU modes
  # This service builds from an external Dockerfile and runs the Ollama mode.
  private-gpt-ollama:
-    image: ${PGPT_IMAGE:-zylonai/private-gpt}${PGPT_TAG:-0.6.1}-ollama
+    image: ${PGPT_IMAGE:-zylonai/private-gpt}:${PGPT_TAG:-0.6.2}-ollama  # x-release-please-version
+    user: root
    build:
      context: .
      dockerfile: Dockerfile.ollama
    volumes:
-      - ./local_data/:/home/worker/app/local_data
+      - ./local_data:/home/worker/app/local_data
    ports:
      - "8001:8001"
    environment:
@ -27,11 +28,15 @@ services:
      - ollama-cpu
      - ollama-cuda
      - ollama-api
+    depends_on:
+      ollama:
+        condition: service_healthy

  # Private-GPT service for the local mode
  # This service builds from a local Dockerfile and runs the application in local mode.
  private-gpt-llamacpp-cpu:
-    image: ${PGPT_IMAGE:-zylonai/private-gpt}${PGPT_TAG:-0.6.1}-llamacpp-cpu
+    image: ${PGPT_IMAGE:-zylonai/private-gpt}:${PGPT_TAG:-0.6.2}-llamacpp-cpu # x-release-please-version
+    user: root
    build:
      context: .
      dockerfile: Dockerfile.llamacpp-cpu
@ -44,7 +49,7 @@ services:
    environment:
      PORT: 8001
      PGPT_PROFILES: local
-      HF_TOKEN: ${HF_TOKEN}
+      HF_TOKEN: ${HF_TOKEN:-}
    profiles:
      - llamacpp-cpu

@ -56,9 +61,14 @@ services:
  # This will route requests to the Ollama service based on the profile.
  ollama:
    image: traefik:v2.10
+    healthcheck:
+      test: ["CMD", "sh", "-c", "wget -q --spider http://ollama:11434 || exit 1"]
+      interval: 10s
+      retries: 3
+      start_period: 5s
+      timeout: 5s
    ports:
-      - "11435:11434"
-      - "8081:8080"
+      - "8080:8080"
    command:
      - "--providers.file.filename=/etc/router.yml"
      - "--log.level=ERROR"
@ -80,15 +90,19 @@ services:
  # Ollama service for the CPU mode
  ollama-cpu:
    image: ollama/ollama:latest
+    ports:
+      - "11434:11434"
    volumes:
      - ./models:/root/.ollama
    profiles:
      - ""
-      - ollama
+      - ollama-cpu

  # Ollama service for the CUDA mode
  ollama-cuda:
    image: ollama/ollama:latest
+    ports:
+      - "11434:11434"
    volumes:
      - ./models:/root/.ollama
    deploy:
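
The `image:` change above adds a colon between the repository and the tag. The effect is easiest to see by expanding the `${VAR:-default}` placeholders by hand. The sketch below is a hypothetical re-implementation of that one substitution form in Python (the `expand` helper is not part of Docker Compose or of this repository) and is only meant to show what each template resolves to.

```python
import re

# Matches ${NAME:-default}; the default may be empty, as in ${HF_TOKEN:-}.
_PATTERN = re.compile(r"\$\{(\w+):-([^}]*)\}")


def expand(template: str, env: dict[str, str]) -> str:
    """Expand ${VAR:-default}: use env[VAR] if set and non-empty, else the default."""

    def repl(match: re.Match[str]) -> str:
        return env.get(match.group(1)) or match.group(2)

    return _PATTERN.sub(repl, template)


old = "${PGPT_IMAGE:-zylonai/private-gpt}${PGPT_TAG:-0.6.1}-ollama"
new = "${PGPT_IMAGE:-zylonai/private-gpt}:${PGPT_TAG:-0.6.2}-ollama"

print(expand(old, {}))  # zylonai/private-gpt0.6.1-ollama
print(expand(new, {}))  # zylonai/private-gpt:0.6.2-ollama
```

The old template glues the tag onto the repository name, so it resolves to a different image name rather than a tagged one. The same expansion rule explains `${HF_TOKEN:-}`: an unset token becomes an empty string.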
--- a/fern/docs/pages/installation/installation.mdx
+++ b/fern/docs/pages/installation/installation.mdx
@ -307,11 +307,12 @@ If you have all required dependencies properly configured running the
 following powershell command should succeed.

 ```powershell
-$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```

 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If there is some issue, please refer to the
+[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.

 ```console
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
@ -339,11 +340,12 @@ Some tips:
 After that running the following command in the repository will install llama.cpp with GPU support:

 ```bash
-CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```

 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If there is some issue, please refer to the
+[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.

 ```
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
--- a/fern/docs/pages/installation/troubleshooting.mdx
+++ b/fern/docs/pages/installation/troubleshooting.mdx
@ -47,3 +47,18 @@ embedding:
  embed_dim: 384
 ```
 </Callout>
+
+# Building Llama-cpp with NVIDIA GPU support
+
+## Out-of-memory error
+
+If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue:
+1. **Set the following environment variable:**
+    ```bash
+    TOKENIZERS_PARALLELISM=true
+    ```
+2. **Run PrivateGPT:**
+    ```bash
+    poetry run python -m privategpt
+    ```
+Thanks to [MarioRossiGithub](https://github.com/MarioRossiGithub) for providing this solution.
--- a/poetry.lock
+++ b/poetry.lock
--- a/private_gpt/components/embedding/embedding_component.py
+++ b/private_gpt/components/embedding/embedding_component.py
@ -144,6 +144,23 @@ class EmbeddingComponent:
                    api_key=settings.gemini.api_key,
                    model_name=settings.gemini.embedding_model,
                )
+            case "mistralai":
+                try:
+                    from llama_index.embeddings.mistralai import (  # type: ignore
+                        MistralAIEmbedding,
+                    )
+                except ImportError as e:
+                    raise ImportError(
+                        "Mistral dependencies not found, install with `poetry install --extras embeddings-mistral`"
+                    ) from e
+
+                api_key = settings.openai.api_key
+                model = settings.openai.embedding_model
+
+                self.embedding_model = MistralAIEmbedding(
+                    api_key=api_key,
+                    model=model,
+                )
            case "mock":
                # Not a random number, is the dimensionality used by
                # the default embedding model
--- a/private_gpt/components/ingest/ingest_component.py
+++ b/private_gpt/components/ingest/ingest_component.py
@ -403,7 +403,7 @@ class PipelineIngestComponent(BaseIngestComponentWithIndex):
                self.transformations,
                show_progress=self.show_progress,
            )
-            self.node_q.put(("process", file_name, documents, nodes))
+            self.node_q.put(("process", file_name, documents, list(nodes)))
        finally:
            self.doc_semaphore.release()
            self.doc_q.task_done()  # unblock Q joins
--- a/private_gpt/components/ingest/ingest_helper.py
+++ b/private_gpt/components/ingest/ingest_helper.py
@ -92,7 +92,13 @@ class IngestionHelper:
            return string_reader.load_data([file_data.read_text()])

        logger.debug("Specific reader found for extension=%s", extension)
-        return reader_cls().load_data(file_data)
+        documents = reader_cls().load_data(file_data)
+
+        # Sanitize NUL bytes in text which can't be stored in Postgres
+        for i in range(len(documents)):
+            documents[i].text = documents[i].text.replace("\u0000", "")
+
+        return documents

    @staticmethod
    def _exclude_metadata(documents: list[Document]) -> None:
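
The sanitisation step added above can be tried in isolation. In this sketch `Doc` is a hypothetical stand-in for the LlamaIndex `Document` class (only its `text` attribute matters here), and `strip_nul` is an illustrative name, not a function from the repository.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a document object with a mutable `text` field."""

    text: str


def strip_nul(documents: list[Doc]) -> list[Doc]:
    """Remove NUL characters in place; Postgres text columns cannot store them."""
    for doc in documents:
        doc.text = doc.text.replace("\u0000", "")
    return documents


docs = strip_nul([Doc("abc\u0000def"), Doc("clean")])
print([d.text for d in docs])  # ['abcdef', 'clean']
```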
--- a/private_gpt/components/llm/llm_component.py
+++ b/private_gpt/components/llm/llm_component.py
@ -120,7 +120,6 @@ class LLMComponent:
                    api_version="",
                    temperature=settings.llm.temperature,
                    context_window=settings.llm.context_window,
-                    max_new_tokens=settings.llm.max_new_tokens,
                    messages_to_prompt=prompt_style.messages_to_prompt,
                    completion_to_prompt=prompt_style.completion_to_prompt,
                    tokenizer=settings.llm.tokenizer,
@ -184,10 +183,10 @@ class LLMComponent:

                        return wrapper

-                    Ollama.chat = add_keep_alive(Ollama.chat)
-                    Ollama.stream_chat = add_keep_alive(Ollama.stream_chat)
-                    Ollama.complete = add_keep_alive(Ollama.complete)
-                    Ollama.stream_complete = add_keep_alive(Ollama.stream_complete)
+                    Ollama.chat = add_keep_alive(Ollama.chat)  # type: ignore
+                    Ollama.stream_chat = add_keep_alive(Ollama.stream_chat)  # type: ignore
+                    Ollama.complete = add_keep_alive(Ollama.complete)  # type: ignore
+                    Ollama.stream_complete = add_keep_alive(Ollama.stream_complete)  # type: ignore

                self.llm = llm

--- a/private_gpt/components/llm/prompt_helper.py
+++ b/private_gpt/components/llm/prompt_helper.py
@ -40,7 +40,8 @@ class AbstractPromptStyle(abc.ABC):
        logger.debug("Got for messages='%s' the prompt='%s'", messages, prompt)
        return prompt

-    def completion_to_prompt(self, completion: str) -> str:
+    def completion_to_prompt(self, prompt: str) -> str:
+        completion = prompt  # Fix: Llama-index parameter has to be named as prompt
        prompt = self._completion_to_prompt(completion)
        logger.debug("Got for completion='%s' the prompt='%s'", completion, prompt)
        return prompt
@ -285,8 +286,9 @@ class ChatMLPromptStyle(AbstractPromptStyle):


 def get_prompt_style(
-    prompt_style: Literal["default", "llama2", "llama3", "tag", "mistral", "chatml"]
-    | None
+    prompt_style: (
+        Literal["default", "llama2", "llama3", "tag", "mistral", "chatml"] | None
+    )
 ) -> AbstractPromptStyle:
    """Get the prompt style to use from the given string.

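The parameter rename in `completion_to_prompt` only matters when the caller passes the text by keyword. The sketch below assumes such a caller, as the comment in the fix implies; the actual LlamaIndex call site is not shown in this diff, and `old_style`, `new_style`, and `call_by_keyword` are hypothetical names used for illustration.

```python
def old_style(completion: str) -> str:
    # Signature before the fix: the parameter is named `completion`.
    return f"<s>{completion}</s>"


def new_style(prompt: str) -> str:
    # Signature after the fix: the parameter is named `prompt`.
    return f"<s>{prompt}</s>"


def call_by_keyword(fn) -> str:
    # Hypothetical caller that passes the text as `prompt=...`.
    return fn(prompt="hello")


print(call_by_keyword(new_style))

try:
    call_by_keyword(old_style)
except TypeError as exc:
    # The old signature rejects the unexpected keyword argument.
    print("old signature fails:", type(exc).__name__)
```

Positional callers are unaffected by the rename, which is why the change is safe for existing code inside the project.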
--- a/private_gpt/components/node_store/node_store_component.py
+++ b/private_gpt/components/node_store/node_store_component.py
@ -38,10 +38,10 @@ class NodeStoreComponent:

            case "postgres":
                try:
-                    from llama_index.core.storage.docstore.postgres_docstore import (
+                    from llama_index.storage.docstore.postgres import (  # type: ignore
                        PostgresDocumentStore,
                    )
-                    from llama_index.core.storage.index_store.postgres_index_store import (
+                    from llama_index.storage.index_store.postgres import (  # type: ignore
                        PostgresIndexStore,
                    )
                except ImportError:
@ -55,6 +55,7 @@ class NodeStoreComponent:
                self.index_store = PostgresIndexStore.from_params(
                    **settings.postgres.model_dump(exclude_none=True)
                )
+
                self.doc_store = PostgresDocumentStore.from_params(
                    **settings.postgres.model_dump(exclude_none=True)
                )
--- a/private_gpt/components/vector_store/batched_chroma.py
+++ b/private_gpt/components/vector_store/batched_chroma.py
@ -1,14 +1,17 @@
-from collections.abc import Generator
-from typing import Any
+from collections.abc import Generator, Sequence
+from typing import TYPE_CHECKING, Any

 from llama_index.core.schema import BaseNode, MetadataMode
 from llama_index.core.vector_stores.utils import node_to_metadata_dict
 from llama_index.vector_stores.chroma import ChromaVectorStore  # type: ignore

+if TYPE_CHECKING:
+    from collections.abc import Mapping
+

 def chunk_list(
-    lst: list[BaseNode], max_chunk_size: int
-) -> Generator[list[BaseNode], None, None]:
+    lst: Sequence[BaseNode], max_chunk_size: int
+) -> Generator[Sequence[BaseNode], None, None]:
    """Yield successive max_chunk_size-sized chunks from lst.

    Args:
@ -60,7 +63,7 @@ class BatchedChromaVectorStore(ChromaVectorStore):  # type: ignore
        )
        self.chroma_client = chroma_client

-    def add(self, nodes: list[BaseNode], **add_kwargs: Any) -> list[str]:
+    def add(self, nodes: Sequence[BaseNode], **add_kwargs: Any) -> list[str]:
        """Add nodes to index, batching the insertion to avoid issues.

        Args:
@ -78,8 +81,8 @@ class BatchedChromaVectorStore(ChromaVectorStore):  # type: ignore

        all_ids = []
        for node_chunk in node_chunks:
-            embeddings = []
-            metadatas = []
+            embeddings: list[Sequence[float]] = []
+            metadatas: list[Mapping[str, Any]] = []
            ids = []
            documents = []
            for node in node_chunk:
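
Widening `list[BaseNode]` to `Sequence[BaseNode]` lets callers pass tuples or other read-only sequences. The body of `chunk_list` is not shown in this diff, so the following is only a plausible sketch of a slicing-based chunker with the new signature, made generic over the element type so it runs without LlamaIndex installed.

```python
from collections.abc import Generator, Sequence
from typing import TypeVar

T = TypeVar("T")


def chunk_list(
    lst: Sequence[T], max_chunk_size: int
) -> Generator[Sequence[T], None, None]:
    """Yield successive max_chunk_size-sized chunks from lst."""
    for start in range(0, len(lst), max_chunk_size):
        # Slicing preserves the input type: lists yield lists, tuples yield tuples.
        yield lst[start : start + max_chunk_size]


print(list(chunk_list([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
print(list(chunk_list(("a", "b", "c"), 2)))  # [('a', 'b'), ('c',)]
```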
--- a/private_gpt/server/chat/chat_service.py
+++ b/private_gpt/server/chat/chat_service.py
@ -1,4 +1,5 @@
 from dataclasses import dataclass
+from typing import TYPE_CHECKING

 from injector import inject, singleton
 from llama_index.core.chat_engine import ContextChatEngine, SimpleChatEngine
@ -26,6 +27,9 @@ from private_gpt.open_ai.extensions.context_filter import ContextFilter
 from private_gpt.server.chunks.chunks_service import Chunk
 from private_gpt.settings.settings import Settings

+if TYPE_CHECKING:
+    from llama_index.core.postprocessor.types import BaseNodePostprocessor
+

 class Completion(BaseModel):
    response: str
@ -114,12 +118,15 @@ class ChatService:
                context_filter=context_filter,
                similarity_top_k=self.settings.rag.similarity_top_k,
            )
-            node_postprocessors = [
+            node_postprocessors: list[BaseNodePostprocessor] = [
                MetadataReplacementPostProcessor(target_metadata_key="window"),
+            ]
+            if settings.rag.similarity_value:
+                node_postprocessors.append(
                    SimilarityPostprocessor(
                        similarity_cutoff=settings.rag.similarity_value
-                ),
-            ]
+                    )
+                )

            if settings.rag.rerank.enabled:
                rerank_postprocessor = SentenceTransformerRerank(
--- a/private_gpt/server/recipes/summarize/summarize_service.py
+++ b/private_gpt/server/recipes/summarize/summarize_service.py
@ -90,9 +90,9 @@ class SummarizeService:
        # Add context documents to summarize
        if use_context:
            # 1. Recover all ref docs
-            ref_docs: dict[
-                str, RefDocInfo
-            ] | None = self.storage_context.docstore.get_all_ref_doc_info()
+            ref_docs: dict[str, RefDocInfo] | None = (
+                self.storage_context.docstore.get_all_ref_doc_info()
+            )
            if ref_docs is None:
                raise ValueError("No documents have been ingested yet.")

--- a/private_gpt/settings/settings.py
+++ b/private_gpt/settings/settings.py
@ -136,9 +136,8 @@ class LLMSettings(BaseModel):
        0.1,
        description="The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual.",
    )
-    prompt_style: Literal[
-        "default", "llama2", "llama3", "tag", "mistral", "chatml"
-    ] = Field(
+    prompt_style: Literal["default", "llama2", "llama3", "tag", "mistral", "chatml"] = (
+        Field(
            "llama2",
            description=(
                "The prompt style to use for the chat engine. "
@ -150,6 +149,7 @@ class LLMSettings(BaseModel):
                "`llama2` is the historic behaviour. `default` might work better with your custom models."
            ),
        )
+    )


 class VectorstoreSettings(BaseModel):
@ -197,7 +197,14 @@ class HuggingFaceSettings(BaseModel):

 class EmbeddingSettings(BaseModel):
    mode: Literal[
-        "huggingface", "openai", "azopenai", "sagemaker", "ollama", "mock", "gemini"
+        "huggingface",
+        "openai",
+        "azopenai",
+        "sagemaker",
+        "ollama",
+        "mock",
+        "gemini",
+        "mistralai",
    ]
    ingest_mode: Literal["simple", "batch", "parallel", "pipeline"] = Field(
        "simple",
@ -350,6 +357,10 @@ class AzureOpenAISettings(BaseModel):
 class UISettings(BaseModel):
    enabled: bool
    path: str
+    default_mode: Literal["RAG", "Search", "Basic", "Summarize"] = Field(
+        "RAG",
+        description="The default mode.",
+    )
    default_chat_system_prompt: str = Field(
        None,
        description="The default system prompt to use for the chat mode.",
--- a/private_gpt/ui/ui.py
+++ b/private_gpt/ui/ui.py
@ -1,4 +1,5 @@
 """This file should be imported if and only if you want to run the UI locally."""
+
 import base64
 import logging
 import time
@ -99,8 +100,11 @@ class PrivateGptUi:
        self._selected_filename = None

        # Initialize system prompt based on default mode
-        self.mode = MODES[0]
-        self._system_prompt = self._get_default_system_prompt(self.mode)
+        default_mode_map = {mode.value: mode for mode in Modes}
+        self._default_mode = default_mode_map.get(
+            settings().ui.default_mode, Modes.RAG_MODE
+        )
+        self._system_prompt = self._get_default_system_prompt(self._default_mode)

    def _chat(
        self, message: str, history: list[list[str]], mode: Modes, *_: Any
@ -390,7 +394,7 @@ class PrivateGptUi:

            with gr.Row(equal_height=False):
                with gr.Column(scale=3):
-                    default_mode = MODES[0]
+                    default_mode = self._default_mode
                    mode = gr.Radio(
                        [mode.value for mode in MODES],
                        label="Mode",
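
The default-mode lookup above maps the configured string onto an enum member and falls back to RAG when the value is unknown. Here is a self-contained sketch of that lookup. Only `Modes.RAG_MODE` and the four string values appear in this diff; the other member names and the `resolve_default_mode` helper are assumptions made for illustration.

```python
from enum import Enum


class Modes(str, Enum):
    # Values match the Literal in UISettings; member names other than
    # RAG_MODE are assumed.
    RAG_MODE = "RAG"
    SEARCH_MODE = "Search"
    BASIC_CHAT_MODE = "Basic"
    SUMMARIZE_MODE = "Summarize"


def resolve_default_mode(configured: str) -> Modes:
    """Map a settings string to a mode, falling back to RAG for unknown values."""
    by_value = {mode.value: mode for mode in Modes}
    return by_value.get(configured, Modes.RAG_MODE)


print(resolve_default_mode("Search").value)   # Search
print(resolve_default_mode("unknown").value)  # RAG
```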
--- a/private_gpt/utils/ollama.py
+++ b/private_gpt/utils/ollama.py
@ -3,10 +3,13 @@ from collections import deque
 from collections.abc import Iterator, Mapping
 from typing import Any

+from httpx import ConnectError
 from tqdm import tqdm  # type: ignore

+from private_gpt.utils.retry import retry
+
 try:
-    from ollama import Client  # type: ignore
+    from ollama import Client, ResponseError  # type: ignore
 except ImportError as e:
    raise ImportError(
        "Ollama dependencies not found, install with `poetry install --extras llms-ollama or embeddings-ollama`"
@ -14,13 +17,25 @@ except ImportError as e:

 logger = logging.getLogger(__name__)

+_MAX_RETRIES = 5
+_JITTER = (3.0, 10.0)

+
+@retry(
+    is_async=False,
+    exceptions=(ConnectError, ResponseError),
+    tries=_MAX_RETRIES,
+    jitter=_JITTER,
+    logger=logger,
+)
 def check_connection(client: Client) -> bool:
    try:
        client.list()
        return True
+    except (ConnectError, ResponseError) as e:
+        raise e
    except Exception as e:
-        logger.error(f"Failed to connect to Ollama: {e!s}")
+        logger.error(f"Failed to connect to Ollama: {type(e).__name__}: {e!s}")
        return False


--- a/private_gpt/utils/retry.py
+++ b/private_gpt/utils/retry.py
@ -0,0 +1,31 @@
+import logging
+from collections.abc import Callable
+from typing import Any
+
+from retry_async import retry as retry_untyped  # type: ignore
+
+retry_logger = logging.getLogger(__name__)
+
+
+def retry(
+    exceptions: Any = Exception,
+    *,
+    is_async: bool = False,
+    tries: int = -1,
+    delay: float = 0,
+    max_delay: float | None = None,
+    backoff: float = 1,
+    jitter: float | tuple[float, float] = 0,
+    logger: logging.Logger = retry_logger,
+) -> Callable[..., Any]:
+    wrapped = retry_untyped(
+        exceptions=exceptions,
+        is_async=is_async,
+        tries=tries,
+        delay=delay,
+        max_delay=max_delay,
+        backoff=backoff,
+        jitter=jitter,
+        logger=logger,
+    )
+    return wrapped  # type: ignore
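
The wrapper above only adds type hints and forwards to `retry_async`. For readers unfamiliar with the pattern, this is a simplified, synchronous sketch of retrying with a random delay between attempts. It is not the `retry_async` implementation and does not reproduce its `delay`, `backoff`, or `max_delay` arithmetic; `retry_sketch` and the injectable `sleep` parameter are hypothetical.

```python
import random
import time
from collections.abc import Callable
from functools import wraps
from typing import Any


def retry_sketch(
    exceptions: Any,
    tries: int,
    jitter: tuple[float, float] = (0.0, 0.0),
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Retry the wrapped function up to `tries` times on the given exceptions."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, tries + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        # Out of attempts: let the last error propagate.
                        raise
                    # Sleep for a random duration inside the jitter window.
                    sleep(random.uniform(*jitter))

        return wrapper

    return decorator
```

In `check_connection`, the retried exceptions are re-raised explicitly so the decorator sees them, while any other exception is logged and turned into `False` without a retry.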
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,88 +1,81 @@
 [tool.poetry]
 name = "private-gpt"
-version = "0.6.0"
+version = "0.6.2"
 description = "Private GPT"
 authors = ["Zylon <hi@zylon.ai>"]

 [tool.poetry.dependencies]
 python = ">=3.11,<3.12"
 # PrivateGPT
-fastapi = { extras = ["all"], version = "^0.111.0" }
-python-multipart = "^0.0.9"
-injector = "^0.21.0"
-pyyaml = "^6.0.1"
+fastapi = { extras = ["all"], version = "^0.115.0" }
+python-multipart = "^0.0.10"
+injector = "^0.22.0"
+pyyaml = "^6.0.2"
 watchdog = "^4.0.1"
-transformers = "^4.42.3"
+transformers = "^4.44.2"
 docx2txt = "^0.8"
 cryptography = "^3.1"
 # LlamaIndex core libs
-llama-index-core = "^0.10.52"
-llama-index-readers-file = "^0.1.27"
+llama-index-core = ">=0.11.2,<0.12.0"
+llama-index-readers-file = "*"
 # Optional LlamaIndex integration libs
-llama-index-llms-llama-cpp = {version = "^0.1.4", optional = true}
-llama-index-llms-openai = {version = "^0.1.25", optional = true}
-llama-index-llms-openai-like = {version ="^0.1.3", optional = true}
-llama-index-llms-ollama = {version ="^0.2.2", optional = true}
-llama-index-llms-azure-openai = {version ="^0.1.8", optional = true}
-llama-index-llms-gemini = {version ="^0.1.11", optional = true}
-llama-index-embeddings-ollama = {version ="^0.1.2", optional = true}
-llama-index-embeddings-huggingface = {version ="^0.2.2", optional = true}
-llama-index-embeddings-openai = {version ="^0.1.10", optional = true}
-llama-index-embeddings-azure-openai = {version ="^0.1.10", optional = true}
-llama-index-embeddings-gemini = {version ="^0.1.8", optional = true}
-llama-index-vector-stores-qdrant = {version ="^0.2.10", optional = true}
-llama-index-vector-stores-milvus = {version ="^0.1.20", optional = true}
-llama-index-vector-stores-chroma = {version ="^0.1.10", optional = true}
-llama-index-vector-stores-postgres = {version ="^0.1.11", optional = true}
-llama-index-vector-stores-clickhouse = {version ="^0.1.3", optional = true}
-llama-index-storage-docstore-postgres = {version ="^0.1.3", optional = true}
-llama-index-storage-index-store-postgres = {version ="^0.1.4", optional = true}
+llama-index-llms-llama-cpp = {version = "*", optional = true}
+llama-index-llms-openai = {version ="*", optional = true}
+llama-index-llms-openai-like = {version ="*", optional = true}
+llama-index-llms-ollama = {version ="*", optional = true}
+llama-index-llms-azure-openai = {version ="*", optional = true}
+llama-index-llms-gemini = {version ="*", optional = true}
+llama-index-embeddings-ollama = {version ="*", optional = true}
+llama-index-embeddings-huggingface = {version ="*", optional = true}
+llama-index-embeddings-openai = {version ="*", optional = true}
+llama-index-embeddings-azure-openai = {version ="*", optional = true}
+llama-index-embeddings-gemini = {version ="*", optional = true}
+llama-index-embeddings-mistralai = {version ="*", optional = true}
+llama-index-vector-stores-qdrant = {version ="*", optional = true}
+llama-index-vector-stores-milvus = {version ="*", optional = true}
+llama-index-vector-stores-chroma = {version ="*", optional = true}
+llama-index-vector-stores-postgres = {version ="*", optional = true}
+llama-index-vector-stores-clickhouse = {version ="*", optional = true}
+llama-index-storage-docstore-postgres = {version ="*", optional = true}
+llama-index-storage-index-store-postgres = {version ="*", optional = true}
 # Postgres
 psycopg2-binary = {version ="^2.9.9", optional = true}
 asyncpg = {version="^0.29.0", optional = true}

 # ClickHouse
-clickhouse-connect = {version = "^0.7.15", optional = true}
+clickhouse-connect = {version = "^0.7.19", optional = true}

 # Optional Sagemaker dependency
-boto3 = {version ="^1.34.139", optional = true}
-
-# Optional Qdrant client
-qdrant-client = {version ="^1.9.0", optional = true}
+boto3 = {version ="^1.35.26", optional = true}

 # Optional Reranker dependencies
-torch = {version ="^2.3.1", optional = true}
-sentence-transformers = {version ="^3.0.1", optional = true}
+torch = {version ="^2.4.1", optional = true}
+sentence-transformers = {version ="^3.1.1", optional = true}

 # Optional UI
-gradio = {version ="^4.37.2", optional = true}
-# Fix: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/16289#issuecomment-2255106490
-ffmpy = {git = "https://github.com/EuDs63/ffmpy.git", rev = "333a19ee4d21f32537c0508aa1942ef1aa7afe24", optional = true}
-
-# Optional Google Gemini dependency
-google-generativeai = {version ="^0.5.4", optional = true}
-
-# Optional Ollama client
-ollama = {version ="^0.3.0", optional = true}
+gradio = {version ="^4.44.0", optional = true}
+ffmpy = {version ="^0.4.0", optional = true}

 # Optional HF Transformers
 einops = {version = "^0.8.0", optional = true}
+retry-async = "^0.1.4"

 [tool.poetry.extras]
 ui = ["gradio", "ffmpy"]
 llms-llama-cpp = ["llama-index-llms-llama-cpp"]
 llms-openai = ["llama-index-llms-openai"]
 llms-openai-like = ["llama-index-llms-openai-like"]
-llms-ollama = ["llama-index-llms-ollama", "ollama"]
+llms-ollama = ["llama-index-llms-ollama"]
 llms-sagemaker = ["boto3"]
 llms-azopenai = ["llama-index-llms-azure-openai"]
-llms-gemini = ["llama-index-llms-gemini", "google-generativeai"]
-embeddings-ollama = ["llama-index-embeddings-ollama", "ollama"]
+llms-gemini = ["llama-index-llms-gemini"]
+embeddings-ollama = ["llama-index-embeddings-ollama"]
 embeddings-huggingface = ["llama-index-embeddings-huggingface", "einops"]
 embeddings-openai = ["llama-index-embeddings-openai"]
 embeddings-sagemaker = ["boto3"]
 embeddings-azopenai = ["llama-index-embeddings-azure-openai"]
 embeddings-gemini = ["llama-index-embeddings-gemini"]
+embeddings-mistral = ["llama-index-embeddings-mistralai"]
 vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
 vector-stores-clickhouse = ["llama-index-vector-stores-clickhouse", "clickhouse_connect"]
 vector-stores-chroma = ["llama-index-vector-stores-chroma"]
@ -92,14 +85,14 @@ storage-nodestore-postgres = ["llama-index-storage-docstore-postgres","llama-ind
 rerank-sentence-transformers = ["torch", "sentence-transformers"]

 [tool.poetry.group.dev.dependencies]
-black = "^22"
-mypy = "^1.2"
-pre-commit = "^2"
-pytest = "^7"
-pytest-cov = "^3"
+black = "^24"
+mypy = "^1.11"
+pre-commit = "^3"
+pytest = "^8"
+pytest-cov = "^5"
 ruff = "^0"
-pytest-asyncio = "^0.21.1"
-types-pyyaml = "^6.0.12.12"
+pytest-asyncio = "^0.24.0"
+types-pyyaml = "^6.0.12.20240917"

 [build-system]
 requires = ["poetry-core>=1.0.0"]
--- a/settings.yaml
+++ b/settings.yaml
@ -25,6 +25,8 @@ data:
 ui:
  enabled: true
  path: /
+  # "RAG", "Search", "Basic", or "Summarize"
+  default_mode: "RAG"
  default_chat_system_prompt: >
    You are a helpful, respectful and honest assistant.
    Always answer as helpfully as possible and follow ALL given instructions.
--- a/tests/fixtures/fast_api_test_client.py
+++ b/tests/fixtures/fast_api_test_client.py
@ -5,7 +5,7 @@ from private_gpt.launcher import create_app
 from tests.fixtures.mock_injector import MockInjector


-@pytest.fixture()
+@pytest.fixture
 def test_client(request: pytest.FixtureRequest, injector: MockInjector) -> TestClient:
    if request is not None and hasattr(request, "param"):
        injector.bind_settings(request.param or {})
--- a/tests/fixtures/ingest_helper.py
+++ b/tests/fixtures/ingest_helper.py
@ -19,6 +19,6 @@ class IngestHelper:
        return ingest_result


-@pytest.fixture()
+@pytest.fixture
 def ingest_helper(test_client: TestClient) -> IngestHelper:
    return IngestHelper(test_client)
--- a/tests/fixtures/mock_injector.py
+++ b/tests/fixtures/mock_injector.py
@ -37,6 +37,6 @@ class MockInjector:
        return self.test_injector.get(interface)


-@pytest.fixture()
+@pytest.fixture
 def injector() -> MockInjector:
    return MockInjector()
--- a/tests/server/ingest/test_local_ingest.py
+++ b/tests/server/ingest/test_local_ingest.py
@ -6,7 +6,7 @@ import pytest
 from fastapi.testclient import TestClient


-@pytest.fixture()
+@pytest.fixture
 def file_path() -> str:
    return "test.txt"

--- a/version.txt
+++ b/version.txt
@ -1 +1 @@
-0.6.1
+0.6.2
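
Note on the settings.yaml hunk above: the commit message for #2078 says default_mode was revised to a Literal (enum). The following is a minimal sketch of that kind of check using only the standard library. The names `UiMode` and `parse_default_mode` are hypothetical and are not taken from the private_gpt code base; the four allowed values come from the comment added to settings.yaml.

```python
from typing import Literal, get_args

# Hypothetical alias: the four values are the ones listed in the
# settings.yaml comment ("RAG", "Search", "Basic", or "Summarize").
UiMode = Literal["RAG", "Search", "Basic", "Summarize"]


def parse_default_mode(raw: str = "RAG") -> str:
    """Return raw unchanged if it is an allowed mode, else raise ValueError."""
    allowed = get_args(UiMode)
    if raw not in allowed:
        raise ValueError(f"default_mode must be one of {allowed}, got {raw!r}")
    return raw
```

Calling `parse_default_mode()` with no argument returns "RAG", matching the default in the hunk. The check is case-sensitive, so a value such as "rag" is rejected.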
Author	SHA1	Message	Date
Iván Martínez	b7ee43788d	Update README.md	2024-11-13 20:29:56 +01:00
meng-hui	940bdd49af	fix: 503 when private gpt gets ollama service (#2104 ) When running private gpt with external ollama API, ollama service returns 503 on startup because ollama service (traefik) might not be ready. - Add healthcheck to ollama service to test for connection to external ollama - private-gpt-ollama service depends on ollama being service_healthy Co-authored-by: Koh Meng Hui <kohmh@duck.com>	2024-10-17 12:44:28 +02:00
Javier Martinez	5851b02378	feat: update llama-index + dependencies (#2092 ) * chore: update libraries * fix: mypy * chore: more updates * fix: mypy/black * chore: fix docker warnings * fix: mypy * fix: black	2024-09-26 16:29:52 +02:00
Dmitri Qiu	5fbb402477	fix: Sanitize null bytes before ingestion (#2090 ) * Sanitize null bytes before ingestion * Added comments	2024-09-25 12:00:03 +02:00
J	fa3c30661d	fix: Add default mode option to settings (#2078 ) * Add default mode option to settings * Revise default_mode to Literal (enum) and add to settings.yaml * Revise to pass make check/test * Default mode: RAG --------- Co-authored-by: Jason <jason@sowinsight.solutions>	2024-09-24 08:33:02 +02:00
Liam Dowd	f9182b3a86	feat: Adding MistralAI mode (#2065 ) * Adding MistralAI mode * Update embedding_component.py * Update ui.py * Update settings.py * Update embedding_component.py * Update settings.py * Update settings.py * Update settings-mistral.yaml * Update llm_component.py * Update settings-mistral.yaml * Update settings.py * Update settings.py * Update ui.py * Update embedding_component.py * Delete settings-mistral.yaml --------- Co-authored-by: SkiingIsFun123 <101684827+SkiingIsFun123@users.noreply.github.com> Co-authored-by: Javier Martinez <javiermartinezalvarez98@gmail.com>	2024-09-24 08:31:30 +02:00
Javier Martinez	8c12c6830b	fix: docker permissions (#2059 ) * fix: missing depends_on * chore: update copy permissions * chore: update entrypoint * Revert "chore: update entrypoint" This reverts commit `f73a36af2f`. * Revert "chore: update copy permissions" This reverts commit `fabc3f66bb`. * style: fix docker warning * fix: multiples fixes * fix: user permissions writing local_data folder	2024-09-24 08:30:58 +02:00
Javier Martinez	77461b96cf	feat: add retry connection to ollama (#2084 ) * feat: add retry connection to ollama When Ollama is running in the docker-compose, traefik is not ready sometimes to route the request, and it fails * fix: mypy	2024-09-16 16:43:05 +02:00
Trivikram Kamat	42628596b2	ci: bump actions/checkout to v4 (#2077 )	2024-09-09 08:53:13 +02:00
Artur Martins	7603b3627d	fix: Rectify ffmpy poetry config; update version from 0.3.2 to 0.4.0 (#2062 ) * Fix: Rectify ffmpy 0.3.2 poetry config * keep optional set to false for ffmpy * Updating ffmpy to version 0.4.0 * Remove comment about a fix	2024-08-21 10:39:58 +02:00
Javier Martinez	89477ea9d3	fix: naming image and ollama-cpu (#2056 )	2024-08-12 08:23:16 +02:00
github-actions[bot]	22904ca8ad	chore(main): release 0.6.2 (#2049 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-08-08 18:16:41 +02:00
Javier Martinez	7fefe408b4	fix: auto-update version (#2052 )	2024-08-08 16:50:42 +02:00
Javier Martinez	b1acf9dc2c	fix: publish image name (#2043 )	2024-08-07 17:39:32 +02:00
Javier Martinez	4ca6d0cb55	fix: add numpy issue to troubleshooting (#2048 ) * docs: add numpy issue to troubleshooting * fix: troubleshooting link ...	2024-08-07 12:16:03 +02:00
Javier Martinez	b16abbefe4	fix: update matplotlib to 3.9.1-post1 to fix win install * chore: block matplotlib to fix installation in window machines * chore: remove workaround, just update poetry.lock * fix: update matplotlib to last version	2024-08-07 11:26:42 +02:00
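
Note on #2084 and #2104 above: both commits address requests to Ollama failing at startup because traefik is not yet ready to route them. The general idea of retrying a connection can be sketched as below. This is an illustration using only the standard library, not the retry-async API and not the project's implementation; `call_with_retry` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call fn, retrying on ConnectionError up to `attempts` times in total."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except ConnectionError:
            # Out of attempts: re-raise the last connection error.
            if attempt == attempts:
                raise
            # Wait before trying again, e.g. while a proxy finishes starting.
            time.sleep(delay)
    raise AssertionError("unreachable")
```

A fixed delay keeps the sketch short. The healthcheck approach from #2104 is complementary: it delays the dependent service until Ollama is reachable, rather than retrying inside the client.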