Compare commits


16 Commits
v0.6.1 ... main

Author SHA1 Message Date
Iván Martínez
b7ee43788d
Update README.md 2024-11-13 20:29:56 +01:00
meng-hui
940bdd49af
fix: 503 when private gpt gets ollama service (#2104)
When running private-gpt with an external Ollama API, the ollama service
returns 503 on startup because the ollama (traefik) service might not be
ready yet.

- Add a healthcheck to the ollama service that tests the connection to the
external Ollama instance
- Make the private-gpt-ollama service depend on ollama being
service_healthy (a sketch of the probe follows this entry)

Co-authored-by: Koh Meng Hui <kohmh@duck.com>
2024-10-17 12:44:28 +02:00
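The compose-level fix appears in the docker-compose.yaml diff further down: a wget-based healthcheck on the ollama traefik gateway plus a `service_healthy` condition on `private-gpt-ollama`. Purely as an illustration, the sketch below performs the same reachability probe from Python; it assumes `httpx` is installed and that the gateway listens at `http://ollama:11434`, and it is not part of the actual change.

```python
# Illustrative stand-in for the compose healthcheck, which runs
# `wget -q --spider http://ollama:11434` inside the traefik container.
import sys

import httpx

OLLAMA_URL = "http://ollama:11434"  # assumed gateway address from docker-compose


def ollama_is_ready(url: str = OLLAMA_URL, timeout: float = 5.0) -> bool:
    """Return True if the Ollama gateway answers at all, False otherwise."""
    try:
        return httpx.get(url, timeout=timeout).status_code < 500
    except httpx.HTTPError:
        return False


if __name__ == "__main__":
    # A non-zero exit keeps a dependent service waiting, mirroring service_healthy.
    sys.exit(0 if ollama_is_ready() else 1)
```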
Javier Martinez
5851b02378
feat: update llama-index + dependencies (#2092)
* chore: update libraries

* fix: mypy

* chore: more updates

* fix: mypy/black

* chore: fix docker warnings

* fix: mypy

* fix: black
2024-09-26 16:29:52 +02:00
Dmitri Qiu
5fbb402477
fix: Sanitize null bytes before ingestion (#2090)
* Sanitize null bytes before ingestion

* Added comments
2024-09-25 12:00:03 +02:00
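The actual change lives in the ingestion helper diff below; in essence it replaces "\u0000" in each parsed document's text before the data reaches Postgres, whose text columns reject NUL bytes. A minimal, self-contained sketch of that idea (the `Doc` class is a stand-in, not the project's llama-index `Document`):

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for the llama-index Document handled by the real helper."""

    text: str


def sanitize(docs: list[Doc]) -> list[Doc]:
    """Strip NUL bytes, which Postgres text columns cannot store."""
    for doc in docs:
        doc.text = doc.text.replace("\u0000", "")
    return docs


print(sanitize([Doc("hello\u0000world")])[0].text)  # -> helloworld
```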
J
fa3c30661d
fix: Add default mode option to settings (#2078)
* Add default mode option to settings

* Revise default_mode to Literal (enum) and add to settings.yaml

* Revise to pass make check/test

* Default mode: RAG

---------

Co-authored-by: Jason <jason@sowinsight.solutions>
2024-09-24 08:33:02 +02:00
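The new option surfaces in the settings.py and settings.yaml diffs later in this compare: a `Literal`-constrained pydantic field on the UI settings that defaults to "RAG". A minimal sketch of that shape (the class name here is illustrative, not the project's):

```python
from typing import Literal

from pydantic import BaseModel, Field


class UISettingsSketch(BaseModel):
    """Illustrative cut-down version of the UI settings model."""

    default_mode: Literal["RAG", "Search", "Basic", "Summarize"] = Field(
        "RAG",
        description="The mode the UI starts in.",
    )


print(UISettingsSketch().default_mode)  # -> RAG
```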
Liam Dowd
f9182b3a86
feat: Adding MistralAI mode (#2065)
* Adding MistralAI mode

* Update embedding_component.py

* Update ui.py

* Update settings.py

* Update embedding_component.py

* Update settings.py

* Update settings.py

* Update settings-mistral.yaml

* Update llm_component.py

* Update settings-mistral.yaml

* Update settings.py

* Update settings.py

* Update ui.py

* Update embedding_component.py

* Delete settings-mistral.yaml

---------

Co-authored-by: SkiingIsFun123 <101684827+SkiingIsFun123@users.noreply.github.com>
Co-authored-by: Javier Martinez <javiermartinezalvarez98@gmail.com>
2024-09-24 08:31:30 +02:00
Javier Martinez
8c12c6830b
fix: docker permissions (#2059)
* fix: missing depends_on

* chore: update copy permissions

* chore: update entrypoint

* Revert "chore: update entrypoint"

This reverts commit f73a36af2f.

* Revert "chore: update copy permissions"

This reverts commit fabc3f66bb.

* style: fix docker warning

* fix: multiple fixes

* fix: user permissions writing local_data folder
2024-09-24 08:30:58 +02:00
Javier Martinez
77461b96cf
feat: add retry connection to ollama (#2084)
* feat: add retry connection to ollama

When Ollama is running in the docker-compose setup, traefik is sometimes not yet ready to route the request, so the first connection attempt fails (a sketch of the retry idea follows this entry)

* fix: mypy
2024-09-16 16:43:05 +02:00
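In the repository the fix wraps `check_connection` in a retry decorator built on the `retry_async` package, with 5 tries and 3–10 s of jitter (see the ollama utils and `utils/retry.py` diffs further down). The standalone sketch below illustrates the same behaviour without that dependency; the function and parameter names are illustrative only.

```python
# Illustrative retry loop: keep probing Ollama until it answers or we give up,
# sleeping a random 3-10 s between attempts so we don't hammer the gateway.
import logging
import random
import time
from collections.abc import Callable

logger = logging.getLogger(__name__)


def check_with_retry(
    check: Callable[[], bool],
    tries: int = 5,
    jitter: tuple[float, float] = (3.0, 10.0),
) -> bool:
    """Call `check()` until it returns True or `tries` attempts are exhausted."""
    for attempt in range(1, tries + 1):
        try:
            if check():
                return True
        except ConnectionError as exc:
            logger.warning("Ollama not reachable (attempt %d/%d): %s", attempt, tries, exc)
        if attempt < tries:
            time.sleep(random.uniform(*jitter))
    return False
```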
Trivikram Kamat
42628596b2
ci: bump actions/checkout to v4 (#2077) 2024-09-09 08:53:13 +02:00
Artur Martins
7603b3627d
fix: Rectify ffmpy poetry config; update version from 0.3.2 to 0.4.0 (#2062)
* Fix: Rectify ffmpy 0.3.2 poetry config

* keep optional set to false for ffmpy

* Updating ffmpy to version 0.4.0

* Remove comment about a fix
2024-08-21 10:39:58 +02:00
Javier Martinez
89477ea9d3
fix: naming image and ollama-cpu (#2056) 2024-08-12 08:23:16 +02:00
github-actions[bot]
22904ca8ad
chore(main): release 0.6.2 (#2049)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-08-08 18:16:41 +02:00
Javier Martinez
7fefe408b4
fix: auto-update version (#2052) 2024-08-08 16:50:42 +02:00
Javier Martinez
b1acf9dc2c
fix: publish image name (#2043) 2024-08-07 17:39:32 +02:00
Javier Martinez
4ca6d0cb55
fix: add numpy issue to troubleshooting (#2048)
* docs: add numpy issue to troubleshooting

* fix: troubleshooting link

...
2024-08-07 12:16:03 +02:00
Javier Martinez
b16abbefe4
fix: update matplotlib to 3.9.1-post1 to fix win install
* chore: block matplotlib to fix installation on Windows machines

* chore: remove workaround, just update poetry.lock

* fix: update matplotlib to last version
2024-08-07 11:26:42 +02:00
33 changed files with 3035 additions and 2464 deletions


@ -0,0 +1,19 @@
{
"$schema": "https://raw.githubusercontent.com/googleapis/release-please/main/schemas/config.json",
"release-type": "simple",
"version-file": "version.txt",
"extra-files": [
{
"type": "toml",
"path": "pyproject.toml",
"jsonpath": "$.tool.poetry.version"
},
{
"type": "generic",
"path": "docker-compose.yaml"
}
],
"packages": {
".": {}
}
}


@ -0,0 +1,3 @@
{
".": "0.6.2"
}


@ -7,7 +7,7 @@ on:
env:
REGISTRY: docker.io
IMAGE_NAME: ${{ github.repository }}
IMAGE_NAME: zylonai/private-gpt
platforms: linux/amd64,linux/arm64
DEFAULT_TYPE: "ollama"


@ -13,7 +13,8 @@ jobs:
release-please:
runs-on: ubuntu-latest
steps:
- uses: google-github-actions/release-please-action@v3
- uses: google-github-actions/release-please-action@v4
id: release
with:
release-type: simple
version-file: version.txt
config-file: .github/release_please/.release-please-config.json
manifest-file: .github/release_please/.release-please-manifest.json


@ -14,7 +14,7 @@ jobs:
setup:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: ./.github/workflows/actions/install_dependencies
checks:
@ -28,7 +28,7 @@ jobs:
- ruff
- mypy
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: ./.github/workflows/actions/install_dependencies
- name: run ${{ matrix.quality-command }}
run: make ${{ matrix.quality-command }}
@ -38,7 +38,7 @@ jobs:
runs-on: ubuntu-latest
name: test
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: ./.github/workflows/actions/install_dependencies
- name: run test
run: make test-coverage


@ -1,5 +1,15 @@
# Changelog
## [0.6.2](https://github.com/zylon-ai/private-gpt/compare/v0.6.1...v0.6.2) (2024-08-08)
### Bug Fixes
* add numpy issue to troubleshooting ([#2048](https://github.com/zylon-ai/private-gpt/issues/2048)) ([4ca6d0c](https://github.com/zylon-ai/private-gpt/commit/4ca6d0cb556be7a598f7d3e3b00d2a29214ee1e8))
* auto-update version ([#2052](https://github.com/zylon-ai/private-gpt/issues/2052)) ([7fefe40](https://github.com/zylon-ai/private-gpt/commit/7fefe408b4267684c6e3c1a43c5dc2b73ec61fe4))
* publish image name ([#2043](https://github.com/zylon-ai/private-gpt/issues/2043)) ([b1acf9d](https://github.com/zylon-ai/private-gpt/commit/b1acf9dc2cbca2047cd0087f13254ff5cda6e570))
* update matplotlib to 3.9.1-post1 to fix win install ([b16abbe](https://github.com/zylon-ai/private-gpt/commit/b16abbefe49527ac038d235659854b98345d5387))
## [0.6.1](https://github.com/zylon-ai/private-gpt/compare/v0.6.0...v0.6.1) (2024-08-05)


@ -1,6 +1,6 @@
### IMPORTANT, THIS IMAGE CAN ONLY BE RUN IN LINUX DOCKER
### You will run into a segfault in mac
FROM python:3.11.6-slim-bookworm as base
FROM python:3.11.6-slim-bookworm AS base
# Install poetry
RUN pip install pipx
@ -20,14 +20,14 @@ RUN apt update && apt install -y \
# https://python-poetry.org/docs/configuration/#virtualenvsin-project
ENV POETRY_VIRTUALENVS_IN_PROJECT=true
FROM base as dependencies
FROM base AS dependencies
WORKDIR /home/worker/app
COPY pyproject.toml poetry.lock ./
ARG POETRY_EXTRAS="ui embeddings-huggingface llms-llama-cpp vector-stores-qdrant"
RUN poetry install --no-root --extras "${POETRY_EXTRAS}"
FROM base as app
FROM base AS app
ENV PYTHONUNBUFFERED=1
ENV PORT=8080


@ -1,4 +1,4 @@
FROM python:3.11.6-slim-bookworm as base
FROM python:3.11.6-slim-bookworm AS base
# Install poetry
RUN pip install pipx
@ -10,14 +10,14 @@ ENV PATH=".venv/bin/:$PATH"
# https://python-poetry.org/docs/configuration/#virtualenvsin-project
ENV POETRY_VIRTUALENVS_IN_PROJECT=true
FROM base as dependencies
FROM base AS dependencies
WORKDIR /home/worker/app
COPY pyproject.toml poetry.lock ./
ARG POETRY_EXTRAS="ui vector-stores-qdrant llms-ollama embeddings-ollama"
RUN poetry install --no-root --extras "${POETRY_EXTRAS}"
FROM base as app
FROM base AS app
ENV PYTHONUNBUFFERED=1
ENV PORT=8080
ENV APP_ENV=prod


@ -1,4 +1,6 @@
# 🔒 PrivateGPT 📑
# PrivateGPT
<a href="https://trendshift.io/repositories/2601" target="_blank"><img src="https://trendshift.io/api/badge/repositories/2601" alt="imartinez%2FprivateGPT | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
[![Tests](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml/badge.svg)](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml?query=branch%3Amain)
[![Website](https://img.shields.io/website?up_message=check%20it&down_message=down&url=https%3A%2F%2Fdocs.privategpt.dev%2F&label=Documentation)](https://docs.privategpt.dev/)


@ -7,12 +7,13 @@ services:
# Private-GPT service for the Ollama CPU and GPU modes
# This service builds from an external Dockerfile and runs the Ollama mode.
private-gpt-ollama:
image: ${PGPT_IMAGE:-zylonai/private-gpt}${PGPT_TAG:-0.6.1}-ollama
image: ${PGPT_IMAGE:-zylonai/private-gpt}:${PGPT_TAG:-0.6.2}-ollama # x-release-please-version
user: root
build:
context: .
dockerfile: Dockerfile.ollama
volumes:
- ./local_data/:/home/worker/app/local_data
- ./local_data:/home/worker/app/local_data
ports:
- "8001:8001"
environment:
@ -27,11 +28,15 @@ services:
- ollama-cpu
- ollama-cuda
- ollama-api
depends_on:
ollama:
condition: service_healthy
# Private-GPT service for the local mode
# This service builds from a local Dockerfile and runs the application in local mode.
private-gpt-llamacpp-cpu:
image: ${PGPT_IMAGE:-zylonai/private-gpt}${PGPT_TAG:-0.6.1}-llamacpp-cpu
image: ${PGPT_IMAGE:-zylonai/private-gpt}:${PGPT_TAG:-0.6.2}-llamacpp-cpu # x-release-please-version
user: root
build:
context: .
dockerfile: Dockerfile.llamacpp-cpu
@ -44,7 +49,7 @@ services:
environment:
PORT: 8001
PGPT_PROFILES: local
HF_TOKEN: ${HF_TOKEN}
HF_TOKEN: ${HF_TOKEN:-}
profiles:
- llamacpp-cpu
@ -56,9 +61,14 @@ services:
# This will route requests to the Ollama service based on the profile.
ollama:
image: traefik:v2.10
healthcheck:
test: ["CMD", "sh", "-c", "wget -q --spider http://ollama:11434 || exit 1"]
interval: 10s
retries: 3
start_period: 5s
timeout: 5s
ports:
- "11435:11434"
- "8081:8080"
- "8080:8080"
command:
- "--providers.file.filename=/etc/router.yml"
- "--log.level=ERROR"
@ -80,15 +90,19 @@ services:
# Ollama service for the CPU mode
ollama-cpu:
image: ollama/ollama:latest
ports:
- "11434:11434"
volumes:
- ./models:/root/.ollama
profiles:
- ""
- ollama
- ollama-cpu
# Ollama service for the CUDA mode
ollama-cuda:
image: ollama/ollama:latest
ports:
- "11434:11434"
volumes:
- ./models:/root/.ollama
deploy:
@ -99,4 +113,4 @@ services:
count: 1
capabilities: [gpu]
profiles:
- ollama-cuda
- ollama-cuda


@ -307,11 +307,12 @@ If you have all required dependencies properly configured running the
following powershell command should succeed.
```powershell
$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
```
If your installation was correct, you should see a message similar to the following next
time you start the server `BLAS = 1`.
time you start the server `BLAS = 1`. If there is some issue, please refer to the
[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
```console
llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
@ -339,11 +340,12 @@ Some tips:
After that running the following command in the repository will install llama.cpp with GPU support:
```bash
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
```
If your installation was correct, you should see a message similar to the following next
time you start the server `BLAS = 1`.
time you start the server `BLAS = 1`. If there is some issue, please refer to the
[troubleshooting](/installation/getting-started/troubleshooting#building-llama-cpp-with-nvidia-gpu-support) section.
```
llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)


@ -46,4 +46,19 @@ huggingface:
embedding:
embed_dim: 384
```
</Callout>
</Callout>
# Building Llama-cpp with NVIDIA GPU support
## Out-of-memory error
If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue:
1. **Set the next environment:**
```bash
TOKENIZERS_PARALLELISM=true
```
2. **Run PrivateGPT:**
```bash
poetry run python -m privategpt
```
Give thanks to [MarioRossiGithub](https://github.com/MarioRossiGithub) for providing the following solution.

poetry.lock (generated, 5072 lines): diff suppressed because it is too large.


@ -144,6 +144,23 @@ class EmbeddingComponent:
api_key=settings.gemini.api_key,
model_name=settings.gemini.embedding_model,
)
case "mistralai":
try:
from llama_index.embeddings.mistralai import ( # type: ignore
MistralAIEmbedding,
)
except ImportError as e:
raise ImportError(
"Mistral dependencies not found, install with `poetry install --extras embeddings-mistral`"
) from e
api_key = settings.openai.api_key
model = settings.openai.embedding_model
self.embedding_model = MistralAIEmbedding(
api_key=api_key,
model=model,
)
case "mock":
# Not a random number, is the dimensionality used by
# the default embedding model


@ -403,7 +403,7 @@ class PipelineIngestComponent(BaseIngestComponentWithIndex):
self.transformations,
show_progress=self.show_progress,
)
self.node_q.put(("process", file_name, documents, nodes))
self.node_q.put(("process", file_name, documents, list(nodes)))
finally:
self.doc_semaphore.release()
self.doc_q.task_done() # unblock Q joins


@ -92,7 +92,13 @@ class IngestionHelper:
return string_reader.load_data([file_data.read_text()])
logger.debug("Specific reader found for extension=%s", extension)
return reader_cls().load_data(file_data)
documents = reader_cls().load_data(file_data)
# Sanitize NUL bytes in text which can't be stored in Postgres
for i in range(len(documents)):
documents[i].text = documents[i].text.replace("\u0000", "")
return documents
@staticmethod
def _exclude_metadata(documents: list[Document]) -> None:


@ -120,7 +120,6 @@ class LLMComponent:
api_version="",
temperature=settings.llm.temperature,
context_window=settings.llm.context_window,
max_new_tokens=settings.llm.max_new_tokens,
messages_to_prompt=prompt_style.messages_to_prompt,
completion_to_prompt=prompt_style.completion_to_prompt,
tokenizer=settings.llm.tokenizer,
@ -184,10 +183,10 @@ class LLMComponent:
return wrapper
Ollama.chat = add_keep_alive(Ollama.chat)
Ollama.stream_chat = add_keep_alive(Ollama.stream_chat)
Ollama.complete = add_keep_alive(Ollama.complete)
Ollama.stream_complete = add_keep_alive(Ollama.stream_complete)
Ollama.chat = add_keep_alive(Ollama.chat) # type: ignore
Ollama.stream_chat = add_keep_alive(Ollama.stream_chat) # type: ignore
Ollama.complete = add_keep_alive(Ollama.complete) # type: ignore
Ollama.stream_complete = add_keep_alive(Ollama.stream_complete) # type: ignore
self.llm = llm


@ -40,7 +40,8 @@ class AbstractPromptStyle(abc.ABC):
logger.debug("Got for messages='%s' the prompt='%s'", messages, prompt)
return prompt
def completion_to_prompt(self, completion: str) -> str:
def completion_to_prompt(self, prompt: str) -> str:
completion = prompt # Fix: Llama-index parameter has to be named as prompt
prompt = self._completion_to_prompt(completion)
logger.debug("Got for completion='%s' the prompt='%s'", completion, prompt)
return prompt
@ -285,8 +286,9 @@ class ChatMLPromptStyle(AbstractPromptStyle):
def get_prompt_style(
prompt_style: Literal["default", "llama2", "llama3", "tag", "mistral", "chatml"]
| None
prompt_style: (
Literal["default", "llama2", "llama3", "tag", "mistral", "chatml"] | None
)
) -> AbstractPromptStyle:
"""Get the prompt style to use from the given string.


@ -38,10 +38,10 @@ class NodeStoreComponent:
case "postgres":
try:
from llama_index.core.storage.docstore.postgres_docstore import (
from llama_index.storage.docstore.postgres import ( # type: ignore
PostgresDocumentStore,
)
from llama_index.core.storage.index_store.postgres_index_store import (
from llama_index.storage.index_store.postgres import ( # type: ignore
PostgresIndexStore,
)
except ImportError:
@ -55,6 +55,7 @@ class NodeStoreComponent:
self.index_store = PostgresIndexStore.from_params(
**settings.postgres.model_dump(exclude_none=True)
)
self.doc_store = PostgresDocumentStore.from_params(
**settings.postgres.model_dump(exclude_none=True)
)


@ -1,14 +1,17 @@
from collections.abc import Generator
from typing import Any
from collections.abc import Generator, Sequence
from typing import TYPE_CHECKING, Any
from llama_index.core.schema import BaseNode, MetadataMode
from llama_index.core.vector_stores.utils import node_to_metadata_dict
from llama_index.vector_stores.chroma import ChromaVectorStore # type: ignore
if TYPE_CHECKING:
from collections.abc import Mapping
def chunk_list(
lst: list[BaseNode], max_chunk_size: int
) -> Generator[list[BaseNode], None, None]:
lst: Sequence[BaseNode], max_chunk_size: int
) -> Generator[Sequence[BaseNode], None, None]:
"""Yield successive max_chunk_size-sized chunks from lst.
Args:
@ -60,7 +63,7 @@ class BatchedChromaVectorStore(ChromaVectorStore): # type: ignore
)
self.chroma_client = chroma_client
def add(self, nodes: list[BaseNode], **add_kwargs: Any) -> list[str]:
def add(self, nodes: Sequence[BaseNode], **add_kwargs: Any) -> list[str]:
"""Add nodes to index, batching the insertion to avoid issues.
Args:
@ -78,8 +81,8 @@ class BatchedChromaVectorStore(ChromaVectorStore): # type: ignore
all_ids = []
for node_chunk in node_chunks:
embeddings = []
metadatas = []
embeddings: list[Sequence[float]] = []
metadatas: list[Mapping[str, Any]] = []
ids = []
documents = []
for node in node_chunk:


@ -1,4 +1,5 @@
from dataclasses import dataclass
from typing import TYPE_CHECKING
from injector import inject, singleton
from llama_index.core.chat_engine import ContextChatEngine, SimpleChatEngine
@ -26,6 +27,9 @@ from private_gpt.open_ai.extensions.context_filter import ContextFilter
from private_gpt.server.chunks.chunks_service import Chunk
from private_gpt.settings.settings import Settings
if TYPE_CHECKING:
from llama_index.core.postprocessor.types import BaseNodePostprocessor
class Completion(BaseModel):
response: str
@ -114,12 +118,15 @@ class ChatService:
context_filter=context_filter,
similarity_top_k=self.settings.rag.similarity_top_k,
)
node_postprocessors = [
node_postprocessors: list[BaseNodePostprocessor] = [
MetadataReplacementPostProcessor(target_metadata_key="window"),
SimilarityPostprocessor(
similarity_cutoff=settings.rag.similarity_value
),
]
if settings.rag.similarity_value:
node_postprocessors.append(
SimilarityPostprocessor(
similarity_cutoff=settings.rag.similarity_value
)
)
if settings.rag.rerank.enabled:
rerank_postprocessor = SentenceTransformerRerank(


@ -90,9 +90,9 @@ class SummarizeService:
# Add context documents to summarize
if use_context:
# 1. Recover all ref docs
ref_docs: dict[
str, RefDocInfo
] | None = self.storage_context.docstore.get_all_ref_doc_info()
ref_docs: dict[str, RefDocInfo] | None = (
self.storage_context.docstore.get_all_ref_doc_info()
)
if ref_docs is None:
raise ValueError("No documents have been ingested yet.")


@ -136,19 +136,19 @@ class LLMSettings(BaseModel):
0.1,
description="The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual.",
)
prompt_style: Literal[
"default", "llama2", "llama3", "tag", "mistral", "chatml"
] = Field(
"llama2",
description=(
"The prompt style to use for the chat engine. "
"If `default` - use the default prompt style from the llama_index. It should look like `role: message`.\n"
"If `llama2` - use the llama2 prompt style from the llama_index. Based on `<s>`, `[INST]` and `<<SYS>>`.\n"
"If `llama3` - use the llama3 prompt style from the llama_index."
"If `tag` - use the `tag` prompt style. It should look like `<|role|>: message`. \n"
"If `mistral` - use the `mistral prompt style. It shoudl look like <s>[INST] {System Prompt} [/INST]</s>[INST] { UserInstructions } [/INST]"
"`llama2` is the historic behaviour. `default` might work better with your custom models."
),
prompt_style: Literal["default", "llama2", "llama3", "tag", "mistral", "chatml"] = (
Field(
"llama2",
description=(
"The prompt style to use for the chat engine. "
"If `default` - use the default prompt style from the llama_index. It should look like `role: message`.\n"
"If `llama2` - use the llama2 prompt style from the llama_index. Based on `<s>`, `[INST]` and `<<SYS>>`.\n"
"If `llama3` - use the llama3 prompt style from the llama_index."
"If `tag` - use the `tag` prompt style. It should look like `<|role|>: message`. \n"
"If `mistral` - use the `mistral prompt style. It shoudl look like <s>[INST] {System Prompt} [/INST]</s>[INST] { UserInstructions } [/INST]"
"`llama2` is the historic behaviour. `default` might work better with your custom models."
),
)
)
@ -197,7 +197,14 @@ class HuggingFaceSettings(BaseModel):
class EmbeddingSettings(BaseModel):
mode: Literal[
"huggingface", "openai", "azopenai", "sagemaker", "ollama", "mock", "gemini"
"huggingface",
"openai",
"azopenai",
"sagemaker",
"ollama",
"mock",
"gemini",
"mistralai",
]
ingest_mode: Literal["simple", "batch", "parallel", "pipeline"] = Field(
"simple",
@ -350,6 +357,10 @@ class AzureOpenAISettings(BaseModel):
class UISettings(BaseModel):
enabled: bool
path: str
default_mode: Literal["RAG", "Search", "Basic", "Summarize"] = Field(
"RAG",
description="The default mode.",
)
default_chat_system_prompt: str = Field(
None,
description="The default system prompt to use for the chat mode.",


@ -1,4 +1,5 @@
"""This file should be imported if and only if you want to run the UI locally."""
import base64
import logging
import time
@ -99,8 +100,11 @@ class PrivateGptUi:
self._selected_filename = None
# Initialize system prompt based on default mode
self.mode = MODES[0]
self._system_prompt = self._get_default_system_prompt(self.mode)
default_mode_map = {mode.value: mode for mode in Modes}
self._default_mode = default_mode_map.get(
settings().ui.default_mode, Modes.RAG_MODE
)
self._system_prompt = self._get_default_system_prompt(self._default_mode)
def _chat(
self, message: str, history: list[list[str]], mode: Modes, *_: Any
@ -390,7 +394,7 @@ class PrivateGptUi:
with gr.Row(equal_height=False):
with gr.Column(scale=3):
default_mode = MODES[0]
default_mode = self._default_mode
mode = gr.Radio(
[mode.value for mode in MODES],
label="Mode",


@ -3,10 +3,13 @@ from collections import deque
from collections.abc import Iterator, Mapping
from typing import Any
from httpx import ConnectError
from tqdm import tqdm # type: ignore
from private_gpt.utils.retry import retry
try:
from ollama import Client # type: ignore
from ollama import Client, ResponseError # type: ignore
except ImportError as e:
raise ImportError(
"Ollama dependencies not found, install with `poetry install --extras llms-ollama or embeddings-ollama`"
@ -14,13 +17,25 @@ except ImportError as e:
logger = logging.getLogger(__name__)
_MAX_RETRIES = 5
_JITTER = (3.0, 10.0)
@retry(
is_async=False,
exceptions=(ConnectError, ResponseError),
tries=_MAX_RETRIES,
jitter=_JITTER,
logger=logger,
)
def check_connection(client: Client) -> bool:
try:
client.list()
return True
except (ConnectError, ResponseError) as e:
raise e
except Exception as e:
logger.error(f"Failed to connect to Ollama: {e!s}")
logger.error(f"Failed to connect to Ollama: {type(e).__name__}: {e!s}")
return False


@ -0,0 +1,31 @@
import logging
from collections.abc import Callable
from typing import Any
from retry_async import retry as retry_untyped # type: ignore
retry_logger = logging.getLogger(__name__)
def retry(
exceptions: Any = Exception,
*,
is_async: bool = False,
tries: int = -1,
delay: float = 0,
max_delay: float | None = None,
backoff: float = 1,
jitter: float | tuple[float, float] = 0,
logger: logging.Logger = retry_logger,
) -> Callable[..., Any]:
wrapped = retry_untyped(
exceptions=exceptions,
is_async=is_async,
tries=tries,
delay=delay,
max_delay=max_delay,
backoff=backoff,
jitter=jitter,
logger=logger,
)
return wrapped # type: ignore


@ -1,88 +1,81 @@
[tool.poetry]
name = "private-gpt"
version = "0.6.0"
version = "0.6.2"
description = "Private GPT"
authors = ["Zylon <hi@zylon.ai>"]
[tool.poetry.dependencies]
python = ">=3.11,<3.12"
# PrivateGPT
fastapi = { extras = ["all"], version = "^0.111.0" }
python-multipart = "^0.0.9"
injector = "^0.21.0"
pyyaml = "^6.0.1"
fastapi = { extras = ["all"], version = "^0.115.0" }
python-multipart = "^0.0.10"
injector = "^0.22.0"
pyyaml = "^6.0.2"
watchdog = "^4.0.1"
transformers = "^4.42.3"
transformers = "^4.44.2"
docx2txt = "^0.8"
cryptography = "^3.1"
# LlamaIndex core libs
llama-index-core = "^0.10.52"
llama-index-readers-file = "^0.1.27"
llama-index-core = ">=0.11.2,<0.12.0"
llama-index-readers-file = "*"
# Optional LlamaIndex integration libs
llama-index-llms-llama-cpp = {version = "^0.1.4", optional = true}
llama-index-llms-openai = {version = "^0.1.25", optional = true}
llama-index-llms-openai-like = {version ="^0.1.3", optional = true}
llama-index-llms-ollama = {version ="^0.2.2", optional = true}
llama-index-llms-azure-openai = {version ="^0.1.8", optional = true}
llama-index-llms-gemini = {version ="^0.1.11", optional = true}
llama-index-embeddings-ollama = {version ="^0.1.2", optional = true}
llama-index-embeddings-huggingface = {version ="^0.2.2", optional = true}
llama-index-embeddings-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-azure-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-gemini = {version ="^0.1.8", optional = true}
llama-index-vector-stores-qdrant = {version ="^0.2.10", optional = true}
llama-index-vector-stores-milvus = {version ="^0.1.20", optional = true}
llama-index-vector-stores-chroma = {version ="^0.1.10", optional = true}
llama-index-vector-stores-postgres = {version ="^0.1.11", optional = true}
llama-index-vector-stores-clickhouse = {version ="^0.1.3", optional = true}
llama-index-storage-docstore-postgres = {version ="^0.1.3", optional = true}
llama-index-storage-index-store-postgres = {version ="^0.1.4", optional = true}
llama-index-llms-llama-cpp = {version = "*", optional = true}
llama-index-llms-openai = {version ="*", optional = true}
llama-index-llms-openai-like = {version ="*", optional = true}
llama-index-llms-ollama = {version ="*", optional = true}
llama-index-llms-azure-openai = {version ="*", optional = true}
llama-index-llms-gemini = {version ="*", optional = true}
llama-index-embeddings-ollama = {version ="*", optional = true}
llama-index-embeddings-huggingface = {version ="*", optional = true}
llama-index-embeddings-openai = {version ="*", optional = true}
llama-index-embeddings-azure-openai = {version ="*", optional = true}
llama-index-embeddings-gemini = {version ="*", optional = true}
llama-index-embeddings-mistralai = {version ="*", optional = true}
llama-index-vector-stores-qdrant = {version ="*", optional = true}
llama-index-vector-stores-milvus = {version ="*", optional = true}
llama-index-vector-stores-chroma = {version ="*", optional = true}
llama-index-vector-stores-postgres = {version ="*", optional = true}
llama-index-vector-stores-clickhouse = {version ="*", optional = true}
llama-index-storage-docstore-postgres = {version ="*", optional = true}
llama-index-storage-index-store-postgres = {version ="*", optional = true}
# Postgres
psycopg2-binary = {version ="^2.9.9", optional = true}
asyncpg = {version="^0.29.0", optional = true}
# ClickHouse
clickhouse-connect = {version = "^0.7.15", optional = true}
clickhouse-connect = {version = "^0.7.19", optional = true}
# Optional Sagemaker dependency
boto3 = {version ="^1.34.139", optional = true}
# Optional Qdrant client
qdrant-client = {version ="^1.9.0", optional = true}
boto3 = {version ="^1.35.26", optional = true}
# Optional Reranker dependencies
torch = {version ="^2.3.1", optional = true}
sentence-transformers = {version ="^3.0.1", optional = true}
torch = {version ="^2.4.1", optional = true}
sentence-transformers = {version ="^3.1.1", optional = true}
# Optional UI
gradio = {version ="^4.37.2", optional = true}
# Fix: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/16289#issuecomment-2255106490
ffmpy = {git = "https://github.com/EuDs63/ffmpy.git", rev = "333a19ee4d21f32537c0508aa1942ef1aa7afe24", optional = true}
# Optional Google Gemini dependency
google-generativeai = {version ="^0.5.4", optional = true}
# Optional Ollama client
ollama = {version ="^0.3.0", optional = true}
gradio = {version ="^4.44.0", optional = true}
ffmpy = {version ="^0.4.0", optional = true}
# Optional HF Transformers
einops = {version = "^0.8.0", optional = true}
retry-async = "^0.1.4"
[tool.poetry.extras]
ui = ["gradio", "ffmpy"]
llms-llama-cpp = ["llama-index-llms-llama-cpp"]
llms-openai = ["llama-index-llms-openai"]
llms-openai-like = ["llama-index-llms-openai-like"]
llms-ollama = ["llama-index-llms-ollama", "ollama"]
llms-ollama = ["llama-index-llms-ollama"]
llms-sagemaker = ["boto3"]
llms-azopenai = ["llama-index-llms-azure-openai"]
llms-gemini = ["llama-index-llms-gemini", "google-generativeai"]
embeddings-ollama = ["llama-index-embeddings-ollama", "ollama"]
llms-gemini = ["llama-index-llms-gemini"]
embeddings-ollama = ["llama-index-embeddings-ollama"]
embeddings-huggingface = ["llama-index-embeddings-huggingface", "einops"]
embeddings-openai = ["llama-index-embeddings-openai"]
embeddings-sagemaker = ["boto3"]
embeddings-azopenai = ["llama-index-embeddings-azure-openai"]
embeddings-gemini = ["llama-index-embeddings-gemini"]
embeddings-mistral = ["llama-index-embeddings-mistralai"]
vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-clickhouse = ["llama-index-vector-stores-clickhouse", "clickhouse_connect"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
@ -92,14 +85,14 @@ storage-nodestore-postgres = ["llama-index-storage-docstore-postgres","llama-ind
rerank-sentence-transformers = ["torch", "sentence-transformers"]
[tool.poetry.group.dev.dependencies]
black = "^22"
mypy = "^1.2"
pre-commit = "^2"
pytest = "^7"
pytest-cov = "^3"
black = "^24"
mypy = "^1.11"
pre-commit = "^3"
pytest = "^8"
pytest-cov = "^5"
ruff = "^0"
pytest-asyncio = "^0.21.1"
types-pyyaml = "^6.0.12.12"
pytest-asyncio = "^0.24.0"
types-pyyaml = "^6.0.12.20240917"
[build-system]
requires = ["poetry-core>=1.0.0"]


@ -25,21 +25,23 @@ data:
ui:
enabled: true
path: /
# "RAG", "Search", "Basic", or "Summarize"
default_mode: "RAG"
default_chat_system_prompt: >
You are a helpful, respectful and honest assistant.
You are a helpful, respectful and honest assistant.
Always answer as helpfully as possible and follow ALL given instructions.
Do not speculate or make up information.
Do not reference any given instructions or context.
default_query_system_prompt: >
You can only answer questions about the provided context.
If you know the answer but it is not based in the provided context, don't provide
You can only answer questions about the provided context.
If you know the answer but it is not based in the provided context, don't provide
the answer, just state the answer is not in the context provided.
default_summarization_system_prompt: >
Provide a comprehensive summary of the provided context information.
Provide a comprehensive summary of the provided context information.
The summary should cover all the key points and main ideas presented in
the original text, while also condensing the information into a concise
the original text, while also condensing the information into a concise
and easy-to-understand format. Please ensure that the summary includes
relevant details and examples that support the main ideas, while avoiding
relevant details and examples that support the main ideas, while avoiding
any unnecessary information or repetition.
delete_file_button_enabled: true
delete_all_files_button_enabled: true


@ -5,7 +5,7 @@ from private_gpt.launcher import create_app
from tests.fixtures.mock_injector import MockInjector
@pytest.fixture()
@pytest.fixture
def test_client(request: pytest.FixtureRequest, injector: MockInjector) -> TestClient:
if request is not None and hasattr(request, "param"):
injector.bind_settings(request.param or {})


@ -19,6 +19,6 @@ class IngestHelper:
return ingest_result
@pytest.fixture()
@pytest.fixture
def ingest_helper(test_client: TestClient) -> IngestHelper:
return IngestHelper(test_client)


@ -37,6 +37,6 @@ class MockInjector:
return self.test_injector.get(interface)
@pytest.fixture()
@pytest.fixture
def injector() -> MockInjector:
return MockInjector()


@ -6,7 +6,7 @@ import pytest
from fastapi.testclient import TestClient
@pytest.fixture()
@pytest.fixture
def file_path() -> str:
return "test.txt"


@ -1 +1 @@
0.6.1
0.6.2