Mirror of https://github.com/imartinez/privateGPT.git (synced 2025-10-11 09:43:31 +00:00)

Compare commits: feature/te... → v0.5.0 (28 commits)
Commits (SHA1):

- 94ef38cbba
- 8a836e4651
- f0b174c097
- bac818add5
- ea153fb92f
- b3b0140e24
- 83adc12a8e
- f83abff8bc
- 087cb0b7b7
- 774e256052
- 6f6c785dac
- c2d694852b
- 7d2de5c96f
- 348df781b5
- 572518143a
- 134fc54d7d
- 1efac6a3fe
- 258d02d87c
- 63de7e4930
- 68b3a34b03
- d17c34e81a
- 84ad16af80
- 821bca32e9
- 02dc83e8e9
- 410bf7a71f
- 290b9fb084
- 1b03b369c0
- 45f05711eb
CHANGELOG.md (43)
@@ -1,5 +1,48 @@
# Changelog

## [0.5.0](https://github.com/zylon-ai/private-gpt/compare/v0.4.0...v0.5.0) (2024-04-02)

### Features

* **code:** improve concat of strings in ui ([#1785](https://github.com/zylon-ai/private-gpt/issues/1785)) ([bac818a](https://github.com/zylon-ai/private-gpt/commit/bac818add51b104cda925b8f1f7b51448e935ca1))
* **docker:** set default Docker to use Ollama ([#1812](https://github.com/zylon-ai/private-gpt/issues/1812)) ([f83abff](https://github.com/zylon-ai/private-gpt/commit/f83abff8bc955a6952c92cc7bcb8985fcec93afa))
* **docs:** Add guide Llama-CPP Linux AMD GPU support ([#1782](https://github.com/zylon-ai/private-gpt/issues/1782)) ([8a836e4](https://github.com/zylon-ai/private-gpt/commit/8a836e4651543f099c59e2bf497ab8c55a7cd2e5))
* **docs:** Feature/upgrade docs ([#1741](https://github.com/zylon-ai/private-gpt/issues/1741)) ([5725181](https://github.com/zylon-ai/private-gpt/commit/572518143ac46532382db70bed6f73b5082302c1))
* **docs:** upgrade fern ([#1596](https://github.com/zylon-ai/private-gpt/issues/1596)) ([84ad16a](https://github.com/zylon-ai/private-gpt/commit/84ad16af80191597a953248ce66e963180e8ddec))
* **ingest:** Created a faster ingestion mode - pipeline ([#1750](https://github.com/zylon-ai/private-gpt/issues/1750)) ([134fc54](https://github.com/zylon-ai/private-gpt/commit/134fc54d7d636be91680dc531f5cbe2c5892ac56))
* **llm - embed:** Add support for Azure OpenAI ([#1698](https://github.com/zylon-ai/private-gpt/issues/1698)) ([1efac6a](https://github.com/zylon-ai/private-gpt/commit/1efac6a3fe19e4d62325e2c2915cd84ea277f04f))
* **llm:** adds serveral settings for llamacpp and ollama ([#1703](https://github.com/zylon-ai/private-gpt/issues/1703)) ([02dc83e](https://github.com/zylon-ai/private-gpt/commit/02dc83e8e9f7ada181ff813f25051bbdff7b7c6b))
* **llm:** Ollama LLM-Embeddings decouple + longer keep_alive settings ([#1800](https://github.com/zylon-ai/private-gpt/issues/1800)) ([b3b0140](https://github.com/zylon-ai/private-gpt/commit/b3b0140e244e7a313bfaf4ef10eb0f7e4192710e))
* **llm:** Ollama timeout setting ([#1773](https://github.com/zylon-ai/private-gpt/issues/1773)) ([6f6c785](https://github.com/zylon-ai/private-gpt/commit/6f6c785dac2bbad37d0b67fda215784298514d39))
* **local:** tiktoken cache within repo for offline ([#1467](https://github.com/zylon-ai/private-gpt/issues/1467)) ([821bca3](https://github.com/zylon-ai/private-gpt/commit/821bca32e9ee7c909fd6488445ff6a04463bf91b))
* **nodestore:** add Postgres for the doc and index store ([#1706](https://github.com/zylon-ai/private-gpt/issues/1706)) ([68b3a34](https://github.com/zylon-ai/private-gpt/commit/68b3a34b032a08ca073a687d2058f926032495b3))
* **rag:** expose similarity_top_k and similarity_score to settings ([#1771](https://github.com/zylon-ai/private-gpt/issues/1771)) ([087cb0b](https://github.com/zylon-ai/private-gpt/commit/087cb0b7b74c3eb80f4f60b47b3a021c81272ae1))
* **RAG:** Introduce SentenceTransformer Reranker ([#1810](https://github.com/zylon-ai/private-gpt/issues/1810)) ([83adc12](https://github.com/zylon-ai/private-gpt/commit/83adc12a8ef0fa0c13a0dec084fa596445fc9075))
* **scripts:** Wipe qdrant and obtain db Stats command ([#1783](https://github.com/zylon-ai/private-gpt/issues/1783)) ([ea153fb](https://github.com/zylon-ai/private-gpt/commit/ea153fb92f1f61f64c0d04fff0048d4d00b6f8d0))
* **ui:** Add Model Information to ChatInterface label ([f0b174c](https://github.com/zylon-ai/private-gpt/commit/f0b174c097c2d5e52deae8ef88de30a0d9013a38))
* **ui:** add sources check to not repeat identical sources ([#1705](https://github.com/zylon-ai/private-gpt/issues/1705)) ([290b9fb](https://github.com/zylon-ai/private-gpt/commit/290b9fb084632216300e89bdadbfeb0380724b12))
* **UI:** Faster startup and document listing ([#1763](https://github.com/zylon-ai/private-gpt/issues/1763)) ([348df78](https://github.com/zylon-ai/private-gpt/commit/348df781b51606b2f9810bcd46f850e54192fd16))
* **ui:** maintain score order when curating sources ([#1643](https://github.com/zylon-ai/private-gpt/issues/1643)) ([410bf7a](https://github.com/zylon-ai/private-gpt/commit/410bf7a71f17e77c4aec723ab80c233b53765964))
* unify settings for vector and nodestore connections to PostgreSQL ([#1730](https://github.com/zylon-ai/private-gpt/issues/1730)) ([63de7e4](https://github.com/zylon-ai/private-gpt/commit/63de7e4930ac90dd87620225112a22ffcbbb31ee))
* wipe per storage type ([#1772](https://github.com/zylon-ai/private-gpt/issues/1772)) ([c2d6948](https://github.com/zylon-ai/private-gpt/commit/c2d694852b4696834962a42fde047b728722ad74))

### Bug Fixes

* **docs:** Minor documentation amendment ([#1739](https://github.com/zylon-ai/private-gpt/issues/1739)) ([258d02d](https://github.com/zylon-ai/private-gpt/commit/258d02d87c5cb81d6c3a6f06aa69339b670dffa9))
* Fixed docker-compose ([#1758](https://github.com/zylon-ai/private-gpt/issues/1758)) ([774e256](https://github.com/zylon-ai/private-gpt/commit/774e2560520dc31146561d09a2eb464c68593871))
* **ingest:** update script label ([#1770](https://github.com/zylon-ai/private-gpt/issues/1770)) ([7d2de5c](https://github.com/zylon-ai/private-gpt/commit/7d2de5c96fd42e339b26269b3155791311ef1d08))
* **settings:** set default tokenizer to avoid running make setup fail ([#1709](https://github.com/zylon-ai/private-gpt/issues/1709)) ([d17c34e](https://github.com/zylon-ai/private-gpt/commit/d17c34e81a84518086b93605b15032e2482377f7))

## [0.4.0](https://github.com/imartinez/privateGPT/compare/v0.3.0...v0.4.0) (2024-03-06)

### Features

* Upgrade to LlamaIndex to 0.10 ([#1663](https://github.com/imartinez/privateGPT/issues/1663)) ([45f0571](https://github.com/imartinez/privateGPT/commit/45f05711eb71ffccdedb26f37e680ced55795d44))
* **Vector:** support pgvector ([#1624](https://github.com/imartinez/privateGPT/issues/1624)) ([cd40e39](https://github.com/imartinez/privateGPT/commit/cd40e3982b780b548b9eea6438c759f1c22743a8))

## [0.3.0](https://github.com/imartinez/privateGPT/compare/v0.2.0...v0.3.0) (2024-02-16)
@@ -14,7 +14,7 @@ FROM base as dependencies
WORKDIR /home/worker/app
COPY pyproject.toml poetry.lock ./

RUN poetry install --extras "ui vector-stores-qdrant"
RUN poetry install --extras "ui vector-stores-qdrant llms-ollama embeddings-ollama"

FROM base as app
Makefile (3)
@@ -51,6 +51,9 @@ api-docs:
ingest:
	@poetry run python scripts/ingest_folder.py $(call args)

stats:
	poetry run python scripts/utils.py stats

wipe:
	poetry run python scripts/utils.py wipe
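With these targets in place, the new utility commands added in this release can be run from the repository root; both simply wrap `scripts/utils.py` as shown above:

```bash
# Show statistics about the ingested documents
make stats

# Wipe the configured storage (per the "wipe per storage type" feature)
make wipe
```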
@@ -1,14 +1,16 @@
services:
  private-gpt:
    build:
      dockerfile: Dockerfile.local
      dockerfile: Dockerfile.external
    volumes:
      - ./local_data/:/home/worker/app/local_data
      - ./models/:/home/worker/app/models
    ports:
      - 8001:8080
    environment:
      PORT: 8080
      PGPT_PROFILES: docker
      PGPT_MODE: local
      PGPT_MODE: ollama
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ./models:/root/.ollama
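As a rough usage sketch (assuming this is the project's default compose file and the service names shown above), the stack could be started and the models pulled inside the Ollama container like this; the model names follow the defaults mentioned later in the docs:

```bash
# Build the PrivateGPT image and start both services in the background
docker compose up -d --build

# Pull the default models inside the ollama service container
docker compose exec ollama ollama pull mistral
docker compose exec ollama ollama pull nomic-embed-text
```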
@@ -58,10 +58,14 @@ navigation:
    contents:
      - page: Vector Stores
        path: ./docs/pages/manual/vectordb.mdx
      - page: Node Stores
        path: ./docs/pages/manual/nodestore.mdx
  - section: Advanced Setup
    contents:
      - page: LLM Backends
        path: ./docs/pages/manual/llms.mdx
      - page: Reranking
        path: ./docs/pages/manual/reranker.mdx
  - section: User Interface
    contents:
      - page: User interface (Gradio) Manual
@@ -8,14 +8,14 @@ The clients are kept up to date automatically, so we encourage you to use the la

<Cards>
  <Card
    title="Node.js/TypeScript"
    title="Node.js/TypeScript - WIP"
    icon="fa-brands fa-node"
    href="https://github.com/imartinez/privateGPT-typescript"
  />
  <Card
    title="Python"
    title="Python - Ready!"
    icon="fa-brands fa-python"
    href="https://github.com/imartinez/privateGPT-python"
    href="https://github.com/imartinez/pgpt_python"
  />
  <br />
</Cards>

@@ -24,12 +24,12 @@ The clients are kept up to date automatically, so we encourage you to use the la

<Cards>
  <Card
    title="Java"
    title="Java - WIP"
    icon="fa-brands fa-java"
    href="https://github.com/imartinez/privateGPT-java"
  />
  <Card
    title="Go"
    title="Go - WIP"
    icon="fa-brands fa-golang"
    href="https://github.com/imartinez/privateGPT-go"
  />
@@ -40,20 +40,21 @@ In order to run PrivateGPT in a fully local setup, you will need to run the LLM,

### Vector stores
The vector stores supported (Qdrant, ChromaDB and Postgres) run locally by default.

### Embeddings
For local embeddings you need to install the 'embeddings-huggingface' extra dependencies. It will use Huggingface Embeddings.
For local Embeddings there are two options:

* (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama greatly simplifies the installation of local LLMs.
* You can use the 'embeddings-huggingface' option in PrivateGPT, which will use HuggingFace.

Note: Ollama will support Embeddings in the short term for easier installation, but it doesn't as of today.

In order for local embeddings to work, you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
In order for HuggingFace Embeddings to work (the second option), you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
```bash
poetry run python scripts/setup
```

### LLM
For local LLM there are two options:

* (Recommended) You can use the 'ollama' option in PrivateGPT, which will connect to your local Ollama instance. Ollama greatly simplifies the installation of local LLMs.
* You can use the 'llms-llama-cpp' option in PrivateGPT, which will use LlamaCPP. It works great on Mac with Metal most of the time (it leverages the Metal GPU), but it can be tricky in certain Linux and Windows distributions, depending on the GPU. In the installation document you'll find guides and troubleshooting.

In order for local LLM to work (the second option), you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
In order for a LlamaCPP-powered LLM to work (the second option), you need to download the LLM model to the `models` folder. You can do so by running the `setup` script:
```bash
poetry run python scripts/setup
```
@@ -30,8 +30,8 @@ pyenv local 3.11

PrivateGPT allows you to customize the setup, from fully local to cloud based, by deciding which modules to use.
Here are the different options available:

- LLM: "llama-cpp", "ollama", "sagemaker", "openai", "openailike"
- Embeddings: "huggingface", "openai", "sagemaker"
- LLM: "llama-cpp", "ollama", "sagemaker", "openai", "openailike", "azopenai"
- Embeddings: "huggingface", "openai", "sagemaker", "azopenai"
- Vector stores: "qdrant", "chroma", "postgres"
- UI: whether or not to enable UI (Gradio) or just go with the API
@@ -44,15 +44,17 @@ poetry install --extras "<extra1> <extra2>..."

Where `<extra>` can be any of the following:

- ui: adds support for UI using Gradio
- llms-ollama: adds support for Ollama LLM, the easiest way to get a local LLM running
- llms-ollama: adds support for Ollama LLM, the easiest way to get a local LLM running, requires Ollama running locally
- llms-llama-cpp: adds support for local LLM using LlamaCPP - expect a messy installation process on some platforms
- llms-sagemaker: adds support for Amazon Sagemaker LLM, requires Sagemaker inference endpoints
- llms-nvidia-tensorrt: adds support for Nvidia TensorRT LLM
- llms-openai: adds support for OpenAI LLM, requires OpenAI API key
- llms-openai-like: adds support for 3rd party LLM providers that are compatible with OpenAI's API
- llms-azopenai: adds support for Azure OpenAI LLM, requires Azure OpenAI inference endpoints
- embeddings-ollama: adds support for Ollama Embeddings, requires Ollama running locally
- embeddings-huggingface: adds support for local Embeddings using HuggingFace
- embeddings-sagemaker: adds support for Amazon Sagemaker Embeddings, requires Sagemaker inference endpoints
- embeddings-openai: adds support for OpenAI Embeddings, requires OpenAI API key
- embeddings-azopenai: adds support for Azure OpenAI Embeddings, requires Azure OpenAI inference endpoints
- vector-stores-qdrant: adds support for Qdrant vector store
- vector-stores-chroma: adds support for Chroma DB vector store
- vector-stores-postgres: adds support for Postgres vector store
@@ -79,21 +81,29 @@ set PGPT_PROFILES=ollama
make run
```

### Local, Ollama-powered setup
### Local, Ollama-powered setup - RECOMMENDED

The easiest way to run PrivateGPT fully locally is to depend on Ollama for the LLM. Ollama provides a local LLM that is easy to install and use.
**The easiest way to run PrivateGPT fully locally** is to depend on Ollama for the LLM. Ollama provides local LLMs and Embeddings that are super easy to install and use, abstracting away the complexity of GPU support. It's the recommended setup for local development.

Go to [ollama.ai](https://ollama.ai/) and follow the instructions to install Ollama on your machine.

Once done, you can install PrivateGPT dependencies with the following command:
After the installation, make sure the Ollama desktop app is closed.

Install the models to be used; the default settings-ollama.yaml is configured to use the `mistral 7b` LLM (~4GB) and `nomic-embed-text` Embeddings (~275MB). Therefore:

```bash
poetry install --extras "ui llms-ollama embeddings-huggingface vector-stores-qdrant"
ollama pull mistral
ollama pull nomic-embed-text
```

We are installing the "embeddings-huggingface" dependency to support local embeddings, because Ollama doesn't support embeddings just yet. But they're working on it!
In order for local embeddings to work, you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
Now, start the Ollama service (it will start a local inference server, serving both the LLM and the Embeddings):
```bash
poetry run python scripts/setup
ollama serve
```
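Before moving on, you can optionally check that the Ollama server is reachable. This is only a sanity-check sketch and assumes Ollama is listening on its default port 11434:

```bash
# Should return the list of locally pulled models (mistral, nomic-embed-text, ...)
curl http://localhost:11434/api/tags
```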
Once done, on a different terminal, you can install PrivateGPT with the following command:
```bash
poetry install --extras "ui llms-ollama embeddings-ollama vector-stores-qdrant"
```

Once installed, you can run PrivateGPT. Make sure you have a working Ollama running locally before running the following command.

@@ -102,7 +112,7 @@ Once installed, you can run PrivateGPT. Make sure you have a working Ollama runn
PGPT_PROFILES=ollama make run
```

PrivateGPT will use the already existing `settings-ollama.yaml` settings file, which is already configured to use Ollama LLM, local Embeddings, and Qdrant. Review it and adapt it to your needs (different LLM model, different Ollama port, etc.)
PrivateGPT will use the already existing `settings-ollama.yaml` settings file, which is already configured to use Ollama LLM and Embeddings, and Qdrant. Review it and adapt it to your needs (different models, different Ollama port, etc.)

The UI will be available at http://localhost:8001

@@ -114,7 +124,7 @@ You need to have access to sagemaker inference endpoints for the LLM and / or th

Edit the `settings-sagemaker.yaml` file to include the correct Sagemaker endpoints.

Then, install PrivateGPT dependencies with the following command:
Then, install PrivateGPT with the following command:
```bash
poetry install --extras "ui llms-sagemaker embeddings-sagemaker vector-stores-qdrant"
```

@@ -129,75 +139,6 @@ PrivateGPT will use the already existing `settings-sagemaker.yaml` settings file

The UI will be available at http://localhost:8001

### Local, TensorRT-powered setup

To get the most out of NVIDIA GPUs, you can set up a fully local PrivateGPT using TensorRT as its LLM provider. For more information about Nvidia TensorRT, check the [official documentation](https://github.com/NVIDIA/TensorRT-LLM).

Follow these steps to set up a local TensorRT-powered PrivateGPT:

- Nvidia Cuda 12.2 or higher is currently required to run TensorRT-LLM.

- Install tensorrt_llm via pip as explained [here](https://pypi.org/project/tensorrt-llm/)

```bash
pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com tensorrt-llm
```

- For this example we will use Llama2. The Llama2 model files need to be created via scripts following the instructions [here](https://github.com/NVIDIA/trt-llm-rag-windows/blob/release/1.0/README.md#building-trt-engine).
  The following files will be created from following the steps in the link:

  * `Llama_float16_tp1_rank0.engine`: The main output of the build script, containing the executable graph of operations with the model weights embedded.

  * `config.jsonp`: Includes detailed information about the model, like its general structure and precision, as well as information about which plug-ins were incorporated into the engine.

  * `model.cache`: Caches some of the timing and optimization information from model compilation, making successive builds quicker.

- Create a folder inside `models` called `tensorrt`, and move all of the files mentioned above to that directory.

Once done, you can install PrivateGPT dependencies with the following command:
```bash
poetry install --extras "ui llms-nvidia-tensorrt embeddings-huggingface vector-stores-qdrant"
```

We are installing the "embeddings-huggingface" dependency to support local embeddings, because TensorRT only covers the LLM.
In order for local embeddings to work, you need to download the embeddings model to the `models` folder. You can do so by running the `setup` script:
```bash
poetry run python scripts/setup
```

Once installed, you can run PrivateGPT.

```bash
PGPT_PROFILES=tensorrt make run
```

PrivateGPT will use the already existing `settings-tensorrt.yaml` settings file, which is already configured to use Nvidia TensorRT LLM, local Embeddings, and Qdrant. Review it and adapt it to your needs (different LLM model, etc.)

The UI will be available at http://localhost:8001
### Local, Llama-CPP powered setup

If you want to run PrivateGPT fully locally without relying on Ollama, you can run the following command to install its dependencies:

```bash
poetry install --extras "ui llms-llama-cpp embeddings-huggingface vector-stores-qdrant"
```

In order for local LLM and embeddings to work, you need to download the models to the `models` folder. You can do so by running the `setup` script:
```bash
poetry run python scripts/setup
```

Once installed, you can run PrivateGPT with the following command:

```bash
PGPT_PROFILES=local make run
```

PrivateGPT will load the already existing `settings-local.yaml` file, which is already configured to use LlamaCPP LLM, HuggingFace embeddings and Qdrant.

The UI will be available at http://localhost:8001

### Non-Private, OpenAI-powered test setup

If you want to test PrivateGPT with OpenAI's LLM and Embeddings -taking into account your data is going to OpenAI!- you can run the following command:

@@ -206,7 +147,7 @@ You need an OPENAI API key to run this setup.

Edit the `settings-openai.yaml` file to include the correct API KEY. Never commit it! It's a secret! As an alternative to editing `settings-openai.yaml`, you can just set the env var OPENAI_API_KEY.

Then, install PrivateGPT dependencies with the following command:
Then, install PrivateGPT with the following command:
```bash
poetry install --extras "ui llms-openai embeddings-openai vector-stores-qdrant"
```

@@ -221,9 +162,32 @@ PrivateGPT will use the already existing `settings-openai.yaml` settings file, w

The UI will be available at http://localhost:8001
### Non-Private, Azure OpenAI-powered test setup

If you want to test PrivateGPT with Azure OpenAI's LLM and Embeddings -taking into account your data is going to Azure OpenAI!- you can run the following command:

You need to have access to Azure OpenAI inference endpoints for the LLM and / or the embeddings, and have Azure OpenAI credentials properly configured.

Edit the `settings-azopenai.yaml` file to include the correct Azure OpenAI endpoints.

Then, install PrivateGPT with the following command:
```bash
poetry install --extras "ui llms-azopenai embeddings-azopenai vector-stores-qdrant"
```

Once installed, you can run PrivateGPT.

```bash
PGPT_PROFILES=azopenai make run
```

PrivateGPT will use the already existing `settings-azopenai.yaml` settings file, which is already configured to use Azure OpenAI LLM and Embeddings endpoints, and Qdrant.

The UI will be available at http://localhost:8001

### Local, Llama-CPP powered setup

If you want to run PrivateGPT fully locally without relying on Ollama, you can run the following command to install its dependencies:
If you want to run PrivateGPT fully locally without relying on Ollama, you can run the following command:

```bash
poetry install --extras "ui llms-llama-cpp embeddings-huggingface vector-stores-qdrant"
@@ -336,6 +300,40 @@ llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, co
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |
```

##### Llama-CPP Linux AMD GPU support

Linux GPU support is done through ROCm.
Some tips:
* Install ROCm from the [quick-start install guide](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/tutorial/quick-start.html)
* [Install PyTorch for ROCm](https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/install-pytorch.html)
```bash
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.0/torch-2.1.1%2Brocm6.0-cp311-cp311-linux_x86_64.whl
poetry run pip install --force-reinstall --no-cache-dir torch-2.1.1+rocm6.0-cp311-cp311-linux_x86_64.whl
```
* Install bitsandbytes for ROCm
```bash
PYTORCH_ROCM_ARCH="gfx900,gfx906,gfx908,gfx90a,gfx1030,gfx1100,gfx1101,gfx940,gfx941,gfx942"
BITSANDBYTES_VERSION=62353b0200b8557026c176e74ac48b84b953a854
git clone https://github.com/arlo-phoenix/bitsandbytes-rocm-5.6
cd bitsandbytes-rocm-5.6
git checkout ${BITSANDBYTES_VERSION}
make hip ROCM_TARGET=${PYTORCH_ROCM_ARCH} ROCM_HOME=/opt/rocm/
pip install . --extra-index-url https://download.pytorch.org/whl/nightly
```

After that, running the following command in the repository will install llama.cpp with GPU support:
```bash
LLAMA_CPP_PYTHON_VERSION=0.2.56
DAMDGPU_TARGETS="gfx900;gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1101;gfx940;gfx941;gfx942"
CMAKE_ARGS="-DLLAMA_HIPBLAS=ON -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DAMDGPU_TARGETS=${DAMDGPU_TARGETS}" poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==${LLAMA_CPP_PYTHON_VERSION}
```

If your installation was correct, you should see a message similar to the following the next time you start the server: `BLAS = 1`.

```
AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
```

##### Llama-CPP Known issues and Troubleshooting

Execution of LLMs locally still has a lot of sharp edges, especially when running on non-Linux platforms.
@@ -62,6 +62,7 @@ The following ingestion mode exist:
* `simple`: historic behavior, ingest one document at a time, sequentially
* `batch`: read, parse, and embed multiple documents using batches (batch read, and then batch parse, and then batch embed)
* `parallel`: read, parse, and embed multiple documents in parallel. This is the fastest ingestion mode for local setup.
* `pipeline`: Alternative to parallel.

To change the ingestion mode, you can use the `embedding.ingest_mode` configuration value. The default value is `simple`.

To configure the number of workers used for parallel or batched ingestion, you can use
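For illustration, selecting the new pipeline mode in `settings.yaml` might look like the sketch below; only the `embedding.ingest_mode` key is taken from the text above, so check the settings reference for the exact worker-count key:

```yaml
embedding:
  ingest_mode: pipeline  # one of: simple, batch, parallel, pipeline
```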
@@ -98,6 +98,43 @@ to run an OpenAI compatible server. Then, you can run PrivateGPT using the `sett

`PGPT_PROFILES=vllm make run`

### Using Azure OpenAI

If you cannot run a local model (because you don't have a GPU, for example) or for testing purposes, you may decide to run PrivateGPT using Azure OpenAI as the LLM and Embeddings model.

In order to do so, create a profile `settings-azopenai.yaml` with the following contents:

```yaml
llm:
  mode: azopenai

embedding:
  mode: azopenai

azopenai:
  api_key: <your_azopenai_api_key>  # You could skip this configuration and use the AZ_OPENAI_API_KEY env var instead
  azure_endpoint: <your_azopenai_endpoint>  # You could skip this configuration and use the AZ_OPENAI_ENDPOINT env var instead
  api_version: <api_version>  # The API version to use. Default is "2023_05_15"
  embedding_deployment_name: <your_embedding_deployment_name>  # You could skip this configuration and use the AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME env var instead
  embedding_model: <openai_embeddings_to_use>  # Optional model to use. Default is "text-embedding-ada-002"
  llm_deployment_name: <your_model_deployment_name>  # You could skip this configuration and use the AZ_OPENAI_LLM_DEPLOYMENT_NAME env var instead
  llm_model: <openai_model_to_use>  # Optional model to use. Default is "gpt-35-turbo"
```
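As the comments above note, the secrets can also be supplied through environment variables instead of being written into the profile. A sketch (values are placeholders):

```bash
export AZ_OPENAI_API_KEY="<your_azopenai_api_key>"
export AZ_OPENAI_ENDPOINT="<your_azopenai_endpoint>"
export AZ_OPENAI_LLM_DEPLOYMENT_NAME="<your_model_deployment_name>"
export AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME="<your_embedding_deployment_name>"
```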
And run PrivateGPT loading that profile you just created:

`PGPT_PROFILES=azopenai make run`

or

`PGPT_PROFILES=azopenai poetry run python -m private_gpt`

When the server is started it will print a log *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
You'll notice the speed and quality of response is higher, given you are using Azure OpenAI's servers for the heavy computations.

### Using AWS Sagemaker

For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using Sagemaker.
fern/docs/pages/manual/nodestore.mdx (new file, 66)
@@ -0,0 +1,66 @@
## NodeStores
PrivateGPT supports **Simple** and [Postgres](https://www.postgresql.org/) providers. Simple being the default.

In order to select one or the other, set the `nodestore.database` property in the `settings.yaml` file to `simple` or `postgres`.

```yaml
nodestore:
  database: simple
```

### Simple Document Store

Setting up the simple document store: persist data with in-memory and disk storage.

Enabling the simple document store is an excellent choice for small projects or proofs of concept where you need to persist data while maintaining minimal setup complexity. To get started, set the `nodestore.database` property in your `settings.yaml` file as follows:

```yaml
nodestore:
  database: simple
```
The beauty of the simple document store is its flexibility and ease of implementation. It provides a solid foundation for managing and retrieving data without the need for complex setup or configuration. The combination of in-memory processing and disk persistence ensures that you can efficiently handle small to medium-sized datasets while maintaining data consistency across runs.

### Postgres Document Store

To enable Postgres, set the `nodestore.database` property in the `settings.yaml` file to `postgres` and install the `storage-nodestore-postgres` extra. Note: vector embeddings storage in Postgres is configured separately.

```bash
poetry install --extras storage-nodestore-postgres
```
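If you don't already have a Postgres server to point PrivateGPT at, a throwaway instance is enough for local testing. A sketch using Docker (the container name, password and version tag are illustrative; reuse the same credentials in the configuration below):

```bash
# Start a disposable local Postgres for testing the nodestore
docker run -d --name pgpt-postgres \
  -e POSTGRES_PASSWORD=admin \
  -p 5432:5432 \
  postgres:15
```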
The available configuration options are:

| Field | Description |
|---------------|-----------------------------------------------------------|
| **host** | The server hosting the Postgres database. Default is `localhost` |
| **port** | The port on which the Postgres database is accessible. Default is `5432` |
| **database** | The specific database to connect to. Default is `postgres` |
| **user** | The username for database access. Default is `postgres` |
| **password** | The password for database access. (Required) |
| **schema_name** | The database schema to use. Default is `private_gpt` |

For example:
```yaml
nodestore:
  database: postgres

postgres:
  host: localhost
  port: 5432
  database: postgres
  user: postgres
  password: <PASSWORD>
  schema_name: private_gpt
```

Given the above configuration, two PostgreSQL tables will be created upon successful connection: one for storing metadata related to the index and another for the document data itself.

```
postgres=# \dt private_gpt.*
                 List of relations
   Schema    |      Name       | Type  |  Owner
-------------+-----------------+-------+----------
 private_gpt | data_docstore   | table | postgres
 private_gpt | data_indexstore | table | postgres

postgres=#
```
fern/docs/pages/manual/reranker.mdx (new file, 36)
@@ -0,0 +1,36 @@
## Enhancing Response Quality with Reranking

PrivateGPT offers a reranking feature aimed at optimizing response generation by filtering out irrelevant documents, potentially leading to faster response times and enhanced relevance of answers generated by the LLM.

### Enabling Reranking

Document reranking can significantly improve the efficiency and quality of the responses by pre-selecting the most relevant documents before generating an answer. To leverage this feature, ensure that it is enabled in the RAG settings and consider adjusting the parameters to best fit your use case.

#### Additional Requirements

Before enabling reranking, you must install additional dependencies:

```bash
poetry install --extras rerank-sentence-transformers
```

This command installs dependencies for the cross-encoder reranker from sentence-transformers, which is currently the only reranking method supported by PrivateGPT.

#### Configuration

To enable and configure reranking, adjust the `rag` section within the `settings.yaml` file. Here are the key settings to consider:

- `similarity_top_k`: Determines the number of documents to initially retrieve and consider for reranking. This value should be larger than `top_n`.
- `rerank`:
  - `enabled`: Set to `true` to activate the reranking feature.
  - `top_n`: Specifies the number of documents to use in the final answer generation process, chosen from the top-ranked documents provided by `similarity_top_k`.

Example configuration snippet:

```yaml
rag:
  similarity_top_k: 10 # Number of documents to retrieve and consider for reranking
  rerank:
    enabled: true
    top_n: 3 # Number of top-ranked documents to use for generating the answer
```
@@ -1,7 +1,7 @@
## Vectorstores
PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/) and [PGVector](https://github.com/pgvector/pgvector) as vectorstore providers. Qdrant being the default.

In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma` or `pgvector`.
In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma` or `postgres`.

```yaml
vectorstore:
@@ -50,14 +50,15 @@ poetry install --extras chroma
By default `chroma` will use a disk-based database stored in local_data_path / "chroma_db" (with local_data_path defined in settings.yaml).

### PGVector
To use the PGVector store, a [PostgreSQL](https://www.postgresql.org/) database with the PGVector extension is required.

To enable PGVector, set the `vectorstore.database` property in the `settings.yaml` file to `pgvector` and install the `pgvector` extra.
To enable PGVector, set the `vectorstore.database` property in the `settings.yaml` file to `postgres` and install the `vector-stores-postgres` extra.

```bash
poetry install --extras pgvector
poetry install --extras vector-stores-postgres
```
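Depending on how your Postgres server was provisioned, the pgvector extension may also need to be enabled in the target database before first use. A sketch, run as a superuser (connection details are illustrative):

```bash
# Enable the pgvector extension in the database PrivateGPT will connect to
psql -h localhost -U postgres -d postgres -c "CREATE EXTENSION IF NOT EXISTS vector;"
```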
PGVector settings can be configured by setting values to the `pgvector` property in the `settings.yaml` file.
PGVector settings can be configured by setting values to the `postgres` property in the `settings.yaml` file.

The available configuration options are:

| Field | Description |
@@ -67,19 +68,36 @@ The available configuration options are:
| **database** | The specific database to connect to. Default is `postgres` |
| **user** | The username for database access. Default is `postgres` |
| **password** | The password for database access. (Required) |
| **embed_dim** | The dimensionality of the embedding model (Required) |
| **schema_name** | The database schema to use. Default is `private_gpt` |
| **table_name** | The database table to use. Default is `embeddings` |

For example:
```yaml
pgvector:
vectorstore:
  database: postgres

postgres:
  host: localhost
  port: 5432
  database: postgres
  user: postgres
  password: <PASSWORD>
  embed_dim: 384 # 384 is for BAAI/bge-small-en-v1.5
  schema_name: private_gpt
  table_name: embeddings
```

The following table will be created in the database:
```
postgres=# \d private_gpt.data_embeddings
                              Table "private_gpt.data_embeddings"
  Column   |       Type        | Collation | Nullable |                         Default
-----------+-------------------+-----------+----------+---------------------------------------------------------
 id        | bigint            |           | not null | nextval('private_gpt.data_embeddings_id_seq'::regclass)
 text      | character varying |           | not null |
 metadata_ | json              |           |          |
 node_id   | character varying |           |          |
 embedding | vector(768)       |           |          |
Indexes:
    "data_embeddings_pkey" PRIMARY KEY, btree (id)

postgres=#
```
The dimensions of the embeddings columns will be set based on the `embedding.embed_dim` value. If the embedding model changes, this table may need to be dropped and recreated to avoid a dimension mismatch.
@@ -1,4 +1,4 @@
{
  "organization": "privategpt",
  "version": "0.15.3"
  "version": "0.19.10"
}
poetry.lock (506, generated)
@@ -1,4 +1,4 @@
|
||||
# This file is automatically @generated by Poetry 1.7.1 and should not be changed by hand.
|
||||
# This file is automatically @generated by Poetry 1.8.2 and should not be changed by hand.
|
||||
|
||||
[[package]]
|
||||
name = "aiofiles"
|
||||
@@ -98,7 +98,6 @@ files = [
|
||||
|
||||
[package.dependencies]
|
||||
aiosignal = ">=1.1.2"
|
||||
async-timeout = {version = ">=4.0,<5.0", markers = "python_version < \"3.11\""}
|
||||
attrs = ">=17.3.0"
|
||||
frozenlist = ">=1.1.1"
|
||||
multidict = ">=4.5,<7.0"
|
||||
@@ -139,7 +138,6 @@ numpy = "*"
|
||||
packaging = "*"
|
||||
pandas = ">=0.25"
|
||||
toolz = "*"
|
||||
typing-extensions = {version = ">=4.0.1", markers = "python_version < \"3.11\""}
|
||||
|
||||
[package.extras]
|
||||
dev = ["anywidget", "geopandas", "hatch", "ipython", "m2r", "mypy", "pandas-stubs", "pyarrow (>=11)", "pytest", "pytest-cov", "ruff (>=0.1.3)", "types-jsonschema", "types-setuptools", "vega-datasets", "vegafusion[embed] (>=1.4.0)", "vl-convert-python (>=1.1.0)"]
|
||||
@@ -168,7 +166,6 @@ files = [
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
exceptiongroup = {version = "*", markers = "python_version < \"3.11\""}
|
||||
idna = ">=2.8"
|
||||
sniffio = ">=1.1"
|
||||
|
||||
@@ -188,9 +185,6 @@ files = [
|
||||
{file = "asgiref-3.7.2.tar.gz", hash = "sha256:9e0ce3aa93a819ba5b45120216b23878cf6e8525eb3848653452b4192b92afed"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
typing-extensions = {version = ">=4", markers = "python_version < \"3.11\""}
|
||||
|
||||
[package.extras]
|
||||
tests = ["mypy (>=0.800)", "pytest", "pytest-asyncio"]
|
||||
|
||||
@@ -198,7 +192,7 @@ tests = ["mypy (>=0.800)", "pytest", "pytest-asyncio"]
|
||||
name = "async-timeout"
|
||||
version = "4.0.3"
|
||||
description = "Timeout context manager for asyncio programs"
|
||||
optional = false
|
||||
optional = true
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "async-timeout-4.0.3.tar.gz", hash = "sha256:4640d96be84d82d02ed59ea2b7105a0f7b33abe8703703cd0ab0bf87c427522f"},
|
||||
@@ -280,6 +274,42 @@ docs = ["furo", "myst-parser", "sphinx", "sphinx-notfound-page", "sphinxcontrib-
|
||||
tests = ["attrs[tests-no-zope]", "zope-interface"]
|
||||
tests-no-zope = ["cloudpickle", "hypothesis", "mypy (>=1.1.1)", "pympler", "pytest (>=4.3.0)", "pytest-mypy-plugins", "pytest-xdist[psutil]"]
|
||||
|
||||
[[package]]
|
||||
name = "azure-core"
|
||||
version = "1.30.1"
|
||||
description = "Microsoft Azure Core Library for Python"
|
||||
optional = true
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "azure-core-1.30.1.tar.gz", hash = "sha256:26273a254131f84269e8ea4464f3560c731f29c0c1f69ac99010845f239c1a8f"},
|
||||
{file = "azure_core-1.30.1-py3-none-any.whl", hash = "sha256:7c5ee397e48f281ec4dd773d67a0a47a0962ed6fa833036057f9ea067f688e74"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
requests = ">=2.21.0"
|
||||
six = ">=1.11.0"
|
||||
typing-extensions = ">=4.6.0"
|
||||
|
||||
[package.extras]
|
||||
aio = ["aiohttp (>=3.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "azure-identity"
|
||||
version = "1.15.0"
|
||||
description = "Microsoft Azure Identity Library for Python"
|
||||
optional = true
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "azure-identity-1.15.0.tar.gz", hash = "sha256:4c28fc246b7f9265610eb5261d65931183d019a23d4b0e99357facb2e6c227c8"},
|
||||
{file = "azure_identity-1.15.0-py3-none-any.whl", hash = "sha256:a14b1f01c7036f11f148f22cd8c16e05035293d714458d6b44ddf534d93eb912"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
azure-core = ">=1.23.0,<2.0.0"
|
||||
cryptography = ">=2.5"
|
||||
msal = ">=1.24.0,<2.0.0"
|
||||
msal-extensions = ">=0.3.0,<2.0.0"
|
||||
|
||||
[[package]]
|
||||
name = "backoff"
|
||||
version = "2.2.1"
|
||||
@@ -378,7 +408,6 @@ click = ">=8.0.0"
|
||||
mypy-extensions = ">=0.4.3"
|
||||
pathspec = ">=0.9.0"
|
||||
platformdirs = ">=2"
|
||||
tomli = {version = ">=1.1.0", markers = "python_full_version < \"3.11.0a7\""}
|
||||
|
||||
[package.extras]
|
||||
colorama = ["colorama (>=0.4.3)"]
|
||||
@@ -453,7 +482,6 @@ files = [
|
||||
colorama = {version = "*", markers = "os_name == \"nt\""}
|
||||
packaging = ">=19.0"
|
||||
pyproject_hooks = "*"
|
||||
tomli = {version = ">=1.1.0", markers = "python_version < \"3.11\""}
|
||||
|
||||
[package.extras]
|
||||
docs = ["furo (>=2023.08.17)", "sphinx (>=7.0,<8.0)", "sphinx-argparse-cli (>=1.5)", "sphinx-autodoc-typehints (>=1.10)", "sphinx-issues (>=3.0.0)"]
|
||||
@@ -483,6 +511,70 @@ files = [
|
||||
{file = "certifi-2023.11.17.tar.gz", hash = "sha256:9b469f3a900bf28dc19b8cfbf8019bf47f7fdd1a65a1d4ffb98fc14166beb4d1"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "cffi"
|
||||
version = "1.16.0"
|
||||
description = "Foreign Function Interface for Python calling C code."
|
||||
optional = true
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "cffi-1.16.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:6b3d6606d369fc1da4fd8c357d026317fbb9c9b75d36dc16e90e84c26854b088"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ac0f5edd2360eea2f1daa9e26a41db02dd4b0451b48f7c318e217ee092a213e9"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:7e61e3e4fa664a8588aa25c883eab612a188c725755afff6289454d6362b9673"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a72e8961a86d19bdb45851d8f1f08b041ea37d2bd8d4fd19903bc3083d80c896"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5b50bf3f55561dac5438f8e70bfcdfd74543fd60df5fa5f62d94e5867deca684"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:7651c50c8c5ef7bdb41108b7b8c5a83013bfaa8a935590c5d74627c047a583c7"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e4108df7fe9b707191e55f33efbcb2d81928e10cea45527879a4749cbe472614"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:32c68ef735dbe5857c810328cb2481e24722a59a2003018885514d4c09af9743"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:673739cb539f8cdaa07d92d02efa93c9ccf87e345b9a0b556e3ecc666718468d"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-win32.whl", hash = "sha256:9f90389693731ff1f659e55c7d1640e2ec43ff725cc61b04b2f9c6d8d017df6a"},
|
||||
{file = "cffi-1.16.0-cp310-cp310-win_amd64.whl", hash = "sha256:e6024675e67af929088fda399b2094574609396b1decb609c55fa58b028a32a1"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b84834d0cf97e7d27dd5b7f3aca7b6e9263c56308ab9dc8aae9784abb774d404"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:1b8ebc27c014c59692bb2664c7d13ce7a6e9a629be20e54e7271fa696ff2b417"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ee07e47c12890ef248766a6e55bd38ebfb2bb8edd4142d56db91b21ea68b7627"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d8a9d3ebe49f084ad71f9269834ceccbf398253c9fac910c4fd7053ff1386936"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e70f54f1796669ef691ca07d046cd81a29cb4deb1e5f942003f401c0c4a2695d"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5bf44d66cdf9e893637896c7faa22298baebcd18d1ddb6d2626a6e39793a1d56"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7b78010e7b97fef4bee1e896df8a4bbb6712b7f05b7ef630f9d1da00f6444d2e"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:c6a164aa47843fb1b01e941d385aab7215563bb8816d80ff3a363a9f8448a8dc"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:e09f3ff613345df5e8c3667da1d918f9149bd623cd9070c983c013792a9a62eb"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-win32.whl", hash = "sha256:2c56b361916f390cd758a57f2e16233eb4f64bcbeee88a4881ea90fca14dc6ab"},
|
||||
{file = "cffi-1.16.0-cp311-cp311-win_amd64.whl", hash = "sha256:db8e577c19c0fda0beb7e0d4e09e0ba74b1e4c092e0e40bfa12fe05b6f6d75ba"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:fa3a0128b152627161ce47201262d3140edb5a5c3da88d73a1b790a959126956"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:68e7c44931cc171c54ccb702482e9fc723192e88d25a0e133edd7aff8fcd1f6e"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:abd808f9c129ba2beda4cfc53bde801e5bcf9d6e0f22f095e45327c038bfe68e"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:88e2b3c14bdb32e440be531ade29d3c50a1a59cd4e51b1dd8b0865c54ea5d2e2"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:fcc8eb6d5902bb1cf6dc4f187ee3ea80a1eba0a89aba40a5cb20a5087d961357"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b7be2d771cdba2942e13215c4e340bfd76398e9227ad10402a8767ab1865d2e6"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e715596e683d2ce000574bae5d07bd522c781a822866c20495e52520564f0969"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:2d92b25dbf6cae33f65005baf472d2c245c050b1ce709cc4588cdcdd5495b520"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-win32.whl", hash = "sha256:b2ca4e77f9f47c55c194982e10f058db063937845bb2b7a86c84a6cfe0aefa8b"},
|
||||
{file = "cffi-1.16.0-cp312-cp312-win_amd64.whl", hash = "sha256:68678abf380b42ce21a5f2abde8efee05c114c2fdb2e9eef2efdb0257fba1235"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:0c9ef6ff37e974b73c25eecc13952c55bceed9112be2d9d938ded8e856138bcc"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a09582f178759ee8128d9270cd1344154fd473bb77d94ce0aeb2a93ebf0feaf0"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e760191dd42581e023a68b758769e2da259b5d52e3103c6060ddc02c9edb8d7b"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:80876338e19c951fdfed6198e70bc88f1c9758b94578d5a7c4c91a87af3cf31c"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a6a14b17d7e17fa0d207ac08642c8820f84f25ce17a442fd15e27ea18d67c59b"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6602bc8dc6f3a9e02b6c22c4fc1e47aa50f8f8e6d3f78a5e16ac33ef5fefa324"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-win32.whl", hash = "sha256:131fd094d1065b19540c3d72594260f118b231090295d8c34e19a7bbcf2e860a"},
|
||||
{file = "cffi-1.16.0-cp38-cp38-win_amd64.whl", hash = "sha256:31d13b0f99e0836b7ff893d37af07366ebc90b678b6664c955b54561fc36ef36"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:582215a0e9adbe0e379761260553ba11c58943e4bbe9c36430c4ca6ac74b15ed"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:b29ebffcf550f9da55bec9e02ad430c992a87e5f512cd63388abb76f1036d8d2"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:dc9b18bf40cc75f66f40a7379f6a9513244fe33c0e8aa72e2d56b0196a7ef872"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9cb4a35b3642fc5c005a6755a5d17c6c8b6bcb6981baf81cea8bfbc8903e8ba8"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b86851a328eedc692acf81fb05444bdf1891747c25af7529e39ddafaf68a4f3f"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:c0f31130ebc2d37cdd8e44605fb5fa7ad59049298b3f745c74fa74c62fbfcfc4"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8f8e709127c6c77446a8c0a8c8bf3c8ee706a06cd44b1e827c3e6a2ee6b8c098"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:748dcd1e3d3d7cd5443ef03ce8685043294ad6bd7c02a38d1bd367cfd968e000"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:8895613bcc094d4a1b2dbe179d88d7fb4a15cee43c052e8885783fac397d91fe"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-win32.whl", hash = "sha256:ed86a35631f7bfbb28e108dd96773b9d5a6ce4811cf6ea468bb6a359b256b1e4"},
|
||||
{file = "cffi-1.16.0-cp39-cp39-win_amd64.whl", hash = "sha256:3686dffb02459559c74dd3d81748269ffb0eb027c39a6fc99502de37d501faa8"},
|
||||
{file = "cffi-1.16.0.tar.gz", hash = "sha256:bcb3ef43e58665bbda2fb198698fcae6776483e0c4a631aa5647806c25e02cc0"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
pycparser = "*"
|
||||
|
||||
[[package]]
|
||||
name = "cfgv"
|
||||
version = "3.4.0"
|
||||
@@ -837,12 +929,63 @@ files = [
|
||||
{file = "coverage-7.3.3.tar.gz", hash = "sha256:df04c64e58df96b4427db8d0559e95e2df3138c9916c96f9f6a4dd220db2fdb7"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
tomli = {version = "*", optional = true, markers = "python_full_version <= \"3.11.0a6\" and extra == \"toml\""}
|
||||
|
||||
[package.extras]
|
||||
toml = ["tomli"]
|
||||
|
||||
[[package]]
|
||||
name = "cryptography"
|
||||
version = "42.0.5"
|
||||
description = "cryptography is a package which provides cryptographic recipes and primitives to Python developers."
|
||||
optional = true
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "cryptography-42.0.5-cp37-abi3-macosx_10_12_universal2.whl", hash = "sha256:a30596bae9403a342c978fb47d9b0ee277699fa53bbafad14706af51fe543d16"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-macosx_10_12_x86_64.whl", hash = "sha256:b7ffe927ee6531c78f81aa17e684e2ff617daeba7f189f911065b2ea2d526dec"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2424ff4c4ac7f6b8177b53c17ed5d8fa74ae5955656867f5a8affaca36a27abb"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:329906dcc7b20ff3cad13c069a78124ed8247adcac44b10bea1130e36caae0b4"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:b03c2ae5d2f0fc05f9a2c0c997e1bc18c8229f392234e8a0194f202169ccd278"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:f8837fe1d6ac4a8052a9a8ddab256bc006242696f03368a4009be7ee3075cdb7"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:0270572b8bd2c833c3981724b8ee9747b3ec96f699a9665470018594301439ee"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:b8cac287fafc4ad485b8a9b67d0ee80c66bf3574f655d3b97ef2e1082360faf1"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:16a48c23a62a2f4a285699dba2e4ff2d1cff3115b9df052cdd976a18856d8e3d"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:2bce03af1ce5a5567ab89bd90d11e7bbdff56b8af3acbbec1faded8f44cb06da"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-win32.whl", hash = "sha256:b6cd2203306b63e41acdf39aa93b86fb566049aeb6dc489b70e34bcd07adca74"},
|
||||
{file = "cryptography-42.0.5-cp37-abi3-win_amd64.whl", hash = "sha256:98d8dc6d012b82287f2c3d26ce1d2dd130ec200c8679b6213b3c73c08b2b7940"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-macosx_10_12_universal2.whl", hash = "sha256:5e6275c09d2badf57aea3afa80d975444f4be8d3bc58f7f80d2a484c6f9485c8"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e4985a790f921508f36f81831817cbc03b102d643b5fcb81cd33df3fa291a1a1"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7cde5f38e614f55e28d831754e8a3bacf9ace5d1566235e39d91b35502d6936e"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:7367d7b2eca6513681127ebad53b2582911d1736dc2ffc19f2c3ae49997496bc"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:cd2030f6650c089aeb304cf093f3244d34745ce0cfcc39f20c6fbfe030102e2a"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:a2913c5375154b6ef2e91c10b5720ea6e21007412f6437504ffea2109b5a33d7"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:c41fb5e6a5fe9ebcd58ca3abfeb51dffb5d83d6775405305bfa8715b76521922"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:3eaafe47ec0d0ffcc9349e1708be2aaea4c6dd4978d76bf6eb0cb2c13636c6fc"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:1b95b98b0d2af784078fa69f637135e3c317091b615cd0905f8b8a087e86fa30"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-win32.whl", hash = "sha256:1f71c10d1e88467126f0efd484bd44bca5e14c664ec2ede64c32f20875c0d413"},
|
||||
{file = "cryptography-42.0.5-cp39-abi3-win_amd64.whl", hash = "sha256:a011a644f6d7d03736214d38832e030d8268bcff4a41f728e6030325fea3e400"},
|
||||
{file = "cryptography-42.0.5-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:9481ffe3cf013b71b2428b905c4f7a9a4f76ec03065b05ff499bb5682a8d9ad8"},
|
||||
{file = "cryptography-42.0.5-pp310-pypy310_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:ba334e6e4b1d92442b75ddacc615c5476d4ad55cc29b15d590cc6b86efa487e2"},
|
||||
{file = "cryptography-42.0.5-pp310-pypy310_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:ba3e4a42397c25b7ff88cdec6e2a16c2be18720f317506ee25210f6d31925f9c"},
|
||||
{file = "cryptography-42.0.5-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:111a0d8553afcf8eb02a4fea6ca4f59d48ddb34497aa8706a6cf536f1a5ec576"},
|
||||
{file = "cryptography-42.0.5-pp39-pypy39_pp73-macosx_10_12_x86_64.whl", hash = "sha256:cd65d75953847815962c84a4654a84850b2bb4aed3f26fadcc1c13892e1e29f6"},
|
||||
{file = "cryptography-42.0.5-pp39-pypy39_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:e807b3188f9eb0eaa7bbb579b462c5ace579f1cedb28107ce8b48a9f7ad3679e"},
|
||||
{file = "cryptography-42.0.5-pp39-pypy39_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:f12764b8fffc7a123f641d7d049d382b73f96a34117e0b637b80643169cec8ac"},
|
||||
{file = "cryptography-42.0.5-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:37dd623507659e08be98eec89323469e8c7b4c1407c85112634ae3dbdb926fdd"},
|
||||
{file = "cryptography-42.0.5.tar.gz", hash = "sha256:6fe07eec95dfd477eb9530aef5bead34fec819b3aaf6c5bd6d20565da607bfe1"},
|
||||
]

[package.dependencies]
cffi = {version = ">=1.12", markers = "platform_python_implementation != \"PyPy\""}

[package.extras]
docs = ["sphinx (>=5.3.0)", "sphinx-rtd-theme (>=1.1.1)"]
docstest = ["pyenchant (>=1.6.11)", "readme-renderer", "sphinxcontrib-spelling (>=4.0.1)"]
nox = ["nox"]
pep8test = ["check-sdist", "click", "mypy", "ruff"]
sdist = ["build"]
ssh = ["bcrypt (>=3.1.5)"]
test = ["certifi", "pretend", "pytest (>=6.2.0)", "pytest-benchmark", "pytest-cov", "pytest-xdist"]
test-randomorder = ["pytest-randomly"]

[[package]]
name = "cycler"
version = "0.12.1"
@@ -968,20 +1111,6 @@ files = [
|
||||
dnspython = ">=2.0.0"
|
||||
idna = ">=2.0.0"
|
||||
|
||||
[[package]]
|
||||
name = "exceptiongroup"
|
||||
version = "1.2.0"
|
||||
description = "Backport of PEP 654 (exception groups)"
|
||||
optional = false
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "exceptiongroup-1.2.0-py3-none-any.whl", hash = "sha256:4bfd3996ac73b41e9b9628b04e079f193850720ea5945fc96a08633c66912f14"},
|
||||
{file = "exceptiongroup-1.2.0.tar.gz", hash = "sha256:91f5c769735f051a4290d52edd0858999b57e5876e9f85937691bd4c9fa3ed68"},
|
||||
]
|
||||
|
||||
[package.extras]
|
||||
test = ["pytest (>=6)"]
|
||||
|
||||
[[package]]
|
||||
name = "fastapi"
|
||||
version = "0.110.0"
|
||||
@@ -2064,13 +2193,13 @@ test = ["httpx (>=0.24.1)", "pytest (>=7.4.0)", "scipy (>=1.10)"]
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-core"
|
||||
version = "0.10.13"
|
||||
version = "0.10.14.post1"
|
||||
description = "Interface between LLMs and your data"
|
||||
optional = false
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
files = [
|
||||
{file = "llama_index_core-0.10.13-py3-none-any.whl", hash = "sha256:40c76fc02be7cd948a333ca541f2ff38cf02774e1c960674e2b68c61943bac90"},
|
||||
{file = "llama_index_core-0.10.13.tar.gz", hash = "sha256:826fded00767923fba8aca94f46c32b259e8879f517016ab7a3801b1b37187a1"},
|
||||
{file = "llama_index_core-0.10.14.post1-py3-none-any.whl", hash = "sha256:7b12ebebe023e8f5e50c0fcff4af7a67e4842b2e1ca6a84b09442394d2689de6"},
|
||||
{file = "llama_index_core-0.10.14.post1.tar.gz", hash = "sha256:adb931fced7bff092b26599e7f89952c171bf2994872906b5712ecc8107d4727"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -2105,6 +2234,22 @@ local-models = ["optimum[onnxruntime] (>=1.13.2,<2.0.0)", "sentencepiece (>=0.1.
|
||||
postgres = ["asyncpg (>=0.28.0,<0.29.0)", "pgvector (>=0.1.0,<0.2.0)", "psycopg2-binary (>=2.9.9,<3.0.0)"]
|
||||
query-tools = ["guidance (>=0.0.64,<0.0.65)", "jsonpath-ng (>=1.6.0,<2.0.0)", "lm-format-enforcer (>=0.4.3,<0.5.0)", "rank-bm25 (>=0.2.2,<0.3.0)", "scikit-learn", "spacy (>=3.7.1,<4.0.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-embeddings-azure-openai"
|
||||
version = "0.1.6"
|
||||
description = "llama-index embeddings azure openai integration"
|
||||
optional = true
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
files = [
|
||||
{file = "llama_index_embeddings_azure_openai-0.1.6-py3-none-any.whl", hash = "sha256:a84a6d7d67296690e5d20070ce5d9920ec56b0d339338d276eae2a7b2f822b9e"},
|
||||
{file = "llama_index_embeddings_azure_openai-0.1.6.tar.gz", hash = "sha256:05092b1b31bd0f45257d161f1e5a17261c60e688f4c6a4fe316557349ac2aebc"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
llama-index-core = ">=0.10.11.post1,<0.11.0"
|
||||
llama-index-embeddings-openai = ">=0.1.3,<0.2.0"
|
||||
llama-index-llms-azure-openai = ">=0.1.3,<0.2.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-embeddings-huggingface"
|
||||
version = "0.1.4"
|
||||
@@ -2122,6 +2267,20 @@ llama-index-core = ">=0.10.1,<0.11.0"
|
||||
torch = ">=2.1.2,<3.0.0"
|
||||
transformers = ">=4.37.0,<5.0.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-embeddings-ollama"
|
||||
version = "0.1.2"
|
||||
description = "llama-index embeddings ollama integration"
|
||||
optional = true
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
files = [
|
||||
{file = "llama_index_embeddings_ollama-0.1.2-py3-none-any.whl", hash = "sha256:ac7afabfa1134059af351b021e05e256bf86dd15e5176ffa5ab0305bcf03b33f"},
|
||||
{file = "llama_index_embeddings_ollama-0.1.2.tar.gz", hash = "sha256:a9e0809bddd2e4ad888f249519edc7e3d339c74e4e03fc5a40c3060dc41d47a9"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
llama-index-core = ">=0.10.1,<0.11.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-embeddings-openai"
|
||||
version = "0.1.6"
|
||||
@@ -2136,6 +2295,23 @@ files = [
|
||||
[package.dependencies]
|
||||
llama-index-core = ">=0.10.1,<0.11.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-llms-azure-openai"
|
||||
version = "0.1.5"
|
||||
description = "llama-index llms azure openai integration"
|
||||
optional = true
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
files = [
|
||||
{file = "llama_index_llms_azure_openai-0.1.5-py3-none-any.whl", hash = "sha256:180805a7114198155aad7cc3abdf599142c59242d366b11ee8a9150de35b7773"},
|
||||
{file = "llama_index_llms_azure_openai-0.1.5.tar.gz", hash = "sha256:5a1c3d1a6a4fe4d03acb50b61594e6775dc86a431738afa291f3708029299a92"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
azure-identity = ">=1.15.0,<2.0.0"
|
||||
httpx = "*"
|
||||
llama-index-core = ">=0.10.11.post1,<0.11.0"
|
||||
llama-index-llms-openai = ">=0.1.1,<0.2.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-llms-llama-cpp"
|
||||
version = "0.1.3"
|
||||
@@ -2151,22 +2327,6 @@ files = [
|
||||
llama-cpp-python = ">=0.2.32,<0.3.0"
|
||||
llama-index-core = ">=0.10.1,<0.11.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-llms-nvidia-tensorrt"
|
||||
version = "0.1.4"
|
||||
description = "llama-index llms nvidia tensorrt integration"
|
||||
optional = true
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
files = [
|
||||
{file = "llama_index_llms_nvidia_tensorrt-0.1.4-py3-none-any.whl", hash = "sha256:146b249de86317985d57d1acb89e5af1ef1564462899e6711f1ec97b3ba9ce7c"},
|
||||
{file = "llama_index_llms_nvidia_tensorrt-0.1.4.tar.gz", hash = "sha256:7edddbe1ad2bc8f9fc2812853b800c8ad2b610931b870d49ad7d5be920e6dbfc"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
llama-index-core = ">=0.10.1,<0.11.0"
|
||||
torch = ">=2.1.2,<3.0.0"
|
||||
transformers = ">=4.37.0,<5.0.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-llms-ollama"
|
||||
version = "0.1.2"
|
||||
@@ -2229,6 +2389,34 @@ llama-index-core = ">=0.10.1,<0.11.0"
|
||||
pymupdf = ">=1.23.21,<2.0.0"
|
||||
pypdf = ">=4.0.1,<5.0.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-storage-docstore-postgres"
|
||||
version = "0.1.2"
|
||||
description = "llama-index docstore postgres integration"
|
||||
optional = true
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
files = [
|
||||
{file = "llama_index_storage_docstore_postgres-0.1.2-py3-none-any.whl", hash = "sha256:54c9534d26a641af85857452ce09279eddec27ca14c3a50c4481e95f394daa08"},
|
||||
{file = "llama_index_storage_docstore_postgres-0.1.2.tar.gz", hash = "sha256:40f5ebd9b461023110343c478caf9ef96c30317dd077e8b156460dff1568dba7"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
llama-index-core = ">=0.10.1,<0.11.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-storage-index-store-postgres"
|
||||
version = "0.1.2"
|
||||
description = "llama-index index_store postgres integration"
|
||||
optional = true
|
||||
python-versions = ">=3.8.1,<4.0"
|
||||
files = [
|
||||
{file = "llama_index_storage_index_store_postgres-0.1.2-py3-none-any.whl", hash = "sha256:8728c9cc5ce9312cf364e1cb1b65e0aba24321e20a16463d8f27f5a883b51b72"},
|
||||
{file = "llama_index_storage_index_store_postgres-0.1.2.tar.gz", hash = "sha256:6a6af1ea6110b2b34de87acaf97c9615bbb738eb504fe89482fb6b973b07eb47"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
llama-index-core = ">=0.10.1,<0.11.0"
|
||||
|
||||
[[package]]
|
||||
name = "llama-index-vector-stores-chroma"
|
||||
version = "0.1.4"
|
||||
@@ -2571,6 +2759,44 @@ docs = ["sphinx"]
|
||||
gmpy = ["gmpy2 (>=2.1.0a4)"]
|
||||
tests = ["pytest (>=4.6)"]
|
||||
|
||||
[[package]]
|
||||
name = "msal"
|
||||
version = "1.27.0"
|
||||
description = "The Microsoft Authentication Library (MSAL) for Python library enables your app to access the Microsoft Cloud by supporting authentication of users with Microsoft Azure Active Directory accounts (AAD) and Microsoft Accounts (MSA) using industry standard OAuth2 and OpenID Connect."
|
||||
optional = true
|
||||
python-versions = ">=2.7"
|
||||
files = [
|
||||
{file = "msal-1.27.0-py2.py3-none-any.whl", hash = "sha256:572d07149b83e7343a85a3bcef8e581167b4ac76befcbbb6eef0c0e19643cdc0"},
|
||||
{file = "msal-1.27.0.tar.gz", hash = "sha256:3109503c038ba6b307152b0e8d34f98113f2e7a78986e28d0baf5b5303afda52"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
cryptography = ">=0.6,<45"
|
||||
PyJWT = {version = ">=1.0.0,<3", extras = ["crypto"]}
|
||||
requests = ">=2.0.0,<3"
|
||||
|
||||
[package.extras]
|
||||
broker = ["pymsalruntime (>=0.13.2,<0.15)"]
|
||||
|
||||
[[package]]
|
||||
name = "msal-extensions"
|
||||
version = "1.1.0"
|
||||
description = "Microsoft Authentication Library extensions (MSAL EX) provides a persistence API that can save your data on disk, encrypted on Windows, macOS and Linux. Concurrent data access will be coordinated by a file lock mechanism."
|
||||
optional = true
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "msal-extensions-1.1.0.tar.gz", hash = "sha256:6ab357867062db7b253d0bd2df6d411c7891a0ee7308d54d1e4317c1d1c54252"},
|
||||
{file = "msal_extensions-1.1.0-py3-none-any.whl", hash = "sha256:01be9711b4c0b1a151450068eeb2c4f0997df3bba085ac299de3a66f585e382f"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
msal = ">=0.4.1,<2.0.0"
|
||||
packaging = "*"
|
||||
portalocker = [
|
||||
{version = ">=1.0,<3", markers = "platform_system != \"Windows\""},
|
||||
{version = ">=1.6,<3", markers = "platform_system == \"Windows\""},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "multidict"
|
||||
version = "6.0.4"
|
||||
@@ -2692,7 +2918,6 @@ files = [
|
||||
|
||||
[package.dependencies]
|
||||
mypy-extensions = ">=1.0.0"
|
||||
tomli = {version = ">=1.1.0", markers = "python_version < \"3.11\""}
|
||||
typing-extensions = ">=4.1.0"
|
||||
|
||||
[package.extras]
|
||||
@@ -2948,6 +3173,7 @@ optional = true
|
||||
python-versions = ">=3"
|
||||
files = [
|
||||
{file = "nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux1_x86_64.whl", hash = "sha256:64335a8088e2b9d196ae8665430bc6a2b7e6ef2eb877a9c735c804bd4ff6467c"},
|
||||
{file = "nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux2014_aarch64.whl", hash = "sha256:211a63e7b30a9d62f1a853e19928fbb1a750e3f17a13a3d1f98ff0ced19478dd"},
|
||||
{file = "nvidia_nvjitlink_cu12-12.3.101-py3-none-win_amd64.whl", hash = "sha256:1b2e317e437433753530792f13eece58f0aec21a2b05903be7bffe58a606cbd1"},
|
||||
]
|
||||
|
||||
@@ -3325,10 +3551,7 @@ files = [
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
numpy = [
|
||||
{version = ">=1.22.4,<2", markers = "python_version < \"3.11\""},
|
||||
{version = ">=1.23.2,<2", markers = "python_version == \"3.11\""},
|
||||
]
|
||||
numpy = {version = ">=1.23.2,<2", markers = "python_version == \"3.11\""}
|
||||
python-dateutil = ">=2.8.2"
|
||||
pytz = ">=2020.1"
|
||||
tzdata = ">=2022.1"
|
||||
@@ -3711,6 +3934,17 @@ files = [
|
||||
[package.dependencies]
|
||||
pyasn1 = ">=0.4.6,<0.6.0"
|
||||
|
||||
[[package]]
|
||||
name = "pycparser"
|
||||
version = "2.21"
|
||||
description = "C parser in Python"
|
||||
optional = true
|
||||
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
|
||||
files = [
|
||||
{file = "pycparser-2.21-py2.py3-none-any.whl", hash = "sha256:8ee45429555515e1f6b185e78100aea234072576aa43ab53aefcae078162fca9"},
|
||||
{file = "pycparser-2.21.tar.gz", hash = "sha256:e644fdec12f7872f86c58ff790da456218b10f863970249516d60a5eaca77206"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "pydantic"
|
||||
version = "2.5.2"
|
||||
@@ -3905,6 +4139,26 @@ files = [
|
||||
plugins = ["importlib-metadata"]
|
||||
windows-terminal = ["colorama (>=0.4.6)"]
|
||||
|
||||
[[package]]
|
||||
name = "pyjwt"
|
||||
version = "2.8.0"
|
||||
description = "JSON Web Token implementation in Python"
|
||||
optional = true
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "PyJWT-2.8.0-py3-none-any.whl", hash = "sha256:59127c392cc44c2da5bb3192169a91f429924e17aff6534d70fdc02ab3e04320"},
|
||||
{file = "PyJWT-2.8.0.tar.gz", hash = "sha256:57e28d156e3d5c10088e0c68abb90bfac3df82b40a71bd0daa20c65ccd5c23de"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
cryptography = {version = ">=3.4.0", optional = true, markers = "extra == \"crypto\""}
|
||||
|
||||
[package.extras]
|
||||
crypto = ["cryptography (>=3.4.0)"]
|
||||
dev = ["coverage[toml] (==5.0.4)", "cryptography (>=3.4.0)", "pre-commit", "pytest (>=6.0.0,<7.0.0)", "sphinx (>=4.5.0,<5.0.0)", "sphinx-rtd-theme", "zope.interface"]
|
||||
docs = ["sphinx (>=4.5.0,<5.0.0)", "sphinx-rtd-theme", "zope.interface"]
|
||||
tests = ["coverage[toml] (==5.0.4)", "pytest (>=6.0.0,<7.0.0)"]
|
||||
|
||||
[[package]]
|
||||
name = "pymupdf"
|
||||
version = "1.23.25"
|
||||
@@ -4016,9 +4270,6 @@ files = [
|
||||
{file = "pyproject_hooks-1.0.0.tar.gz", hash = "sha256:f271b298b97f5955d53fb12b72c1fb1948c22c1a6b70b315c54cedaca0264ef5"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
tomli = {version = ">=1.1.0", markers = "python_version < \"3.11\""}
|
||||
|
||||
[[package]]
|
||||
name = "pyreadline3"
|
||||
version = "3.4.1"
|
||||
@@ -4043,11 +4294,9 @@ files = [
|
||||
|
||||
[package.dependencies]
|
||||
colorama = {version = "*", markers = "sys_platform == \"win32\""}
|
||||
exceptiongroup = {version = ">=1.0.0rc8", markers = "python_version < \"3.11\""}
|
||||
iniconfig = "*"
|
||||
packaging = "*"
|
||||
pluggy = ">=0.12,<2.0"
|
||||
tomli = {version = ">=1.0.0", markers = "python_version < \"3.11\""}
|
||||
|
||||
[package.extras]
|
||||
testing = ["argcomplete", "attrs (>=19.2.0)", "hypothesis (>=3.56)", "mock", "nose", "pygments (>=2.7.2)", "requests", "setuptools", "xmlschema"]
|
||||
@@ -4189,6 +4438,7 @@ files = [
|
||||
{file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"},
|
||||
{file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"},
|
||||
{file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"},
|
||||
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"},
|
||||
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6c22bec3fbe2524cde73d7ada88f6566758a8f7227bfbf93a408a9d86bcc12a0"},
|
||||
{file = "PyYAML-6.0.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:8d4e9c88387b0f5c7d5f281e55304de64cf7f9c0021a3525bd3b1c542da3b0e4"},
|
||||
{file = "PyYAML-6.0.1-cp312-cp312-win32.whl", hash = "sha256:d483d2cdf104e7c9fa60c544d92981f12ad66a457afae824d146093b8c294c54"},
|
||||
@@ -4699,6 +4949,90 @@ tensorflow = ["safetensors[numpy]", "tensorflow (>=2.11.0)"]
|
||||
testing = ["h5py (>=3.7.0)", "huggingface_hub (>=0.12.1)", "hypothesis (>=6.70.2)", "pytest (>=7.2.0)", "pytest-benchmark (>=4.0.0)", "safetensors[numpy]", "setuptools_rust (>=1.5.2)"]
|
||||
torch = ["safetensors[numpy]", "torch (>=1.10)"]
|
||||
|
||||
[[package]]
|
||||
name = "scikit-learn"
|
||||
version = "1.4.1.post1"
|
||||
description = "A set of python modules for machine learning and data mining"
|
||||
optional = true
|
||||
python-versions = ">=3.9"
|
||||
files = [
|
||||
{file = "scikit-learn-1.4.1.post1.tar.gz", hash = "sha256:93d3d496ff1965470f9977d05e5ec3376fb1e63b10e4fda5e39d23c2d8969a30"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:c540aaf44729ab5cd4bd5e394f2b375e65ceaea9cdd8c195788e70433d91bbc5"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:4310bff71aa98b45b46cd26fa641309deb73a5d1c0461d181587ad4f30ea3c36"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9f43dd527dabff5521af2786a2f8de5ba381e182ec7292663508901cf6ceaf6e"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c02e27d65b0c7dc32f2c5eb601aaf5530b7a02bfbe92438188624524878336f2"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp310-cp310-win_amd64.whl", hash = "sha256:629e09f772ad42f657ca60a1a52342eef786218dd20cf1369a3b8d085e55ef8f"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:6145dfd9605b0b50ae72cdf72b61a2acd87501369a763b0d73d004710ebb76b5"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:1afed6951bc9d2053c6ee9a518a466cbc9b07c6a3f9d43bfe734192b6125d508"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ce03506ccf5f96b7e9030fea7eb148999b254c44c10182ac55857bc9b5d4815f"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4ba516fcdc73d60e7f48cbb0bccb9acbdb21807de3651531208aac73c758e3ab"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp311-cp311-win_amd64.whl", hash = "sha256:78cd27b4669513b50db4f683ef41ea35b5dddc797bd2bbd990d49897fd1c8a46"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:a1e289f33f613cefe6707dead50db31930530dc386b6ccff176c786335a7b01c"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:0df87de9ce1c0140f2818beef310fb2e2afdc1e66fc9ad587965577f17733649"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:712c1c69c45b58ef21635360b3d0a680ff7d83ac95b6f9b82cf9294070cda710"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1754b0c2409d6ed5a3380512d0adcf182a01363c669033a2b55cca429ed86a81"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp312-cp312-win_amd64.whl", hash = "sha256:1d491ef66e37f4e812db7e6c8286520c2c3fc61b34bf5e59b67b4ce528de93af"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:aa0029b78ef59af22cfbd833e8ace8526e4df90212db7ceccbea582ebb5d6794"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:14e4c88436ac96bf69eb6d746ac76a574c314a23c6961b7d344b38877f20fee1"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d7cd3a77c32879311f2aa93466d3c288c955ef71d191503cf0677c3340ae8ae0"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2a3ee19211ded1a52ee37b0a7b373a8bfc66f95353af058a210b692bd4cda0dd"},
|
||||
{file = "scikit_learn-1.4.1.post1-cp39-cp39-win_amd64.whl", hash = "sha256:234b6bda70fdcae9e4abbbe028582ce99c280458665a155eed0b820599377d25"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
joblib = ">=1.2.0"
|
||||
numpy = ">=1.19.5,<2.0"
|
||||
scipy = ">=1.6.0"
|
||||
threadpoolctl = ">=2.0.0"
|
||||
|
||||
[package.extras]
|
||||
benchmark = ["matplotlib (>=3.3.4)", "memory-profiler (>=0.57.0)", "pandas (>=1.1.5)"]
|
||||
docs = ["Pillow (>=7.1.2)", "matplotlib (>=3.3.4)", "memory-profiler (>=0.57.0)", "numpydoc (>=1.2.0)", "pandas (>=1.1.5)", "plotly (>=5.14.0)", "pooch (>=1.6.0)", "scikit-image (>=0.17.2)", "seaborn (>=0.9.0)", "sphinx (>=6.0.0)", "sphinx-copybutton (>=0.5.2)", "sphinx-gallery (>=0.15.0)", "sphinx-prompt (>=1.3.0)", "sphinxext-opengraph (>=0.4.2)"]
|
||||
examples = ["matplotlib (>=3.3.4)", "pandas (>=1.1.5)", "plotly (>=5.14.0)", "pooch (>=1.6.0)", "scikit-image (>=0.17.2)", "seaborn (>=0.9.0)"]
|
||||
tests = ["black (>=23.3.0)", "matplotlib (>=3.3.4)", "mypy (>=1.3)", "numpydoc (>=1.2.0)", "pandas (>=1.1.5)", "polars (>=0.19.12)", "pooch (>=1.6.0)", "pyamg (>=4.0.0)", "pyarrow (>=12.0.0)", "pytest (>=7.1.2)", "pytest-cov (>=2.9.0)", "ruff (>=0.0.272)", "scikit-image (>=0.17.2)"]
|
||||
|
||||
[[package]]
|
||||
name = "scipy"
|
||||
version = "1.12.0"
|
||||
description = "Fundamental algorithms for scientific computing in Python"
|
||||
optional = true
|
||||
python-versions = ">=3.9"
|
||||
files = [
|
||||
{file = "scipy-1.12.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:78e4402e140879387187f7f25d91cc592b3501a2e51dfb320f48dfb73565f10b"},
|
||||
{file = "scipy-1.12.0-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:f5f00ebaf8de24d14b8449981a2842d404152774c1a1d880c901bf454cb8e2a1"},
|
||||
{file = "scipy-1.12.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e53958531a7c695ff66c2e7bb7b79560ffdc562e2051644c5576c39ff8efb563"},
|
||||
{file = "scipy-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5e32847e08da8d895ce09d108a494d9eb78974cf6de23063f93306a3e419960c"},
|
||||
{file = "scipy-1.12.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:4c1020cad92772bf44b8e4cdabc1df5d87376cb219742549ef69fc9fd86282dd"},
|
||||
{file = "scipy-1.12.0-cp310-cp310-win_amd64.whl", hash = "sha256:75ea2a144096b5e39402e2ff53a36fecfd3b960d786b7efd3c180e29c39e53f2"},
|
||||
{file = "scipy-1.12.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:408c68423f9de16cb9e602528be4ce0d6312b05001f3de61fe9ec8b1263cad08"},
|
||||
{file = "scipy-1.12.0-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:5adfad5dbf0163397beb4aca679187d24aec085343755fcdbdeb32b3679f254c"},
|
||||
{file = "scipy-1.12.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c3003652496f6e7c387b1cf63f4bb720951cfa18907e998ea551e6de51a04467"},
|
||||
{file = "scipy-1.12.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8b8066bce124ee5531d12a74b617d9ac0ea59245246410e19bca549656d9a40a"},
|
||||
{file = "scipy-1.12.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:8bee4993817e204d761dba10dbab0774ba5a8612e57e81319ea04d84945375ba"},
|
||||
{file = "scipy-1.12.0-cp311-cp311-win_amd64.whl", hash = "sha256:a24024d45ce9a675c1fb8494e8e5244efea1c7a09c60beb1eeb80373d0fecc70"},
|
||||
{file = "scipy-1.12.0-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:e7e76cc48638228212c747ada851ef355c2bb5e7f939e10952bc504c11f4e372"},
|
||||
{file = "scipy-1.12.0-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:f7ce148dffcd64ade37b2df9315541f9adad6efcaa86866ee7dd5db0c8f041c3"},
|
||||
{file = "scipy-1.12.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9c39f92041f490422924dfdb782527a4abddf4707616e07b021de33467f917bc"},
|
||||
{file = "scipy-1.12.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a7ebda398f86e56178c2fa94cad15bf457a218a54a35c2a7b4490b9f9cb2676c"},
|
||||
{file = "scipy-1.12.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:95e5c750d55cf518c398a8240571b0e0782c2d5a703250872f36eaf737751338"},
|
||||
{file = "scipy-1.12.0-cp312-cp312-win_amd64.whl", hash = "sha256:e646d8571804a304e1da01040d21577685ce8e2db08ac58e543eaca063453e1c"},
|
||||
{file = "scipy-1.12.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:913d6e7956c3a671de3b05ccb66b11bc293f56bfdef040583a7221d9e22a2e35"},
|
||||
{file = "scipy-1.12.0-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:bba1b0c7256ad75401c73e4b3cf09d1f176e9bd4248f0d3112170fb2ec4db067"},
|
||||
{file = "scipy-1.12.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:730badef9b827b368f351eacae2e82da414e13cf8bd5051b4bdfd720271a5371"},
|
||||
{file = "scipy-1.12.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6546dc2c11a9df6926afcbdd8a3edec28566e4e785b915e849348c6dd9f3f490"},
|
||||
{file = "scipy-1.12.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:196ebad3a4882081f62a5bf4aeb7326aa34b110e533aab23e4374fcccb0890dc"},
|
||||
{file = "scipy-1.12.0-cp39-cp39-win_amd64.whl", hash = "sha256:b360f1b6b2f742781299514e99ff560d1fe9bd1bff2712894b52abe528d1fd1e"},
|
||||
{file = "scipy-1.12.0.tar.gz", hash = "sha256:4bf5abab8a36d20193c698b0f1fc282c1d083c94723902c447e5d2f1780936a3"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
numpy = ">=1.22.4,<1.29.0"
|
||||
|
||||
[package.extras]
|
||||
dev = ["click", "cython-lint (>=0.12.2)", "doit (>=0.36.0)", "mypy", "pycodestyle", "pydevtool", "rich-click", "ruff", "types-psutil", "typing_extensions"]
|
||||
doc = ["jupytext", "matplotlib (>2)", "myst-nb", "numpydoc", "pooch", "pydata-sphinx-theme (==0.9.0)", "sphinx (!=4.1.0)", "sphinx-design (>=0.2.0)"]
|
||||
test = ["asv", "gmpy2", "hypothesis", "mpmath", "pooch", "pytest", "pytest-cov", "pytest-timeout", "pytest-xdist", "scikit-umfpack", "threadpoolctl"]
|
||||
|
||||
[[package]]
|
||||
name = "semantic-version"
|
||||
version = "2.10.0"
|
||||
@@ -4714,6 +5048,27 @@ files = [
|
||||
dev = ["Django (>=1.11)", "check-manifest", "colorama (<=0.4.1)", "coverage", "flake8", "nose2", "readme-renderer (<25.0)", "tox", "wheel", "zest.releaser[recommended]"]
|
||||
doc = ["Sphinx", "sphinx-rtd-theme"]
|
||||
|
||||
[[package]]
|
||||
name = "sentence-transformers"
|
||||
version = "2.6.1"
|
||||
description = "Multilingual text embeddings"
|
||||
optional = true
|
||||
python-versions = ">=3.8.0"
|
||||
files = [
|
||||
{file = "sentence-transformers-2.6.1.tar.gz", hash = "sha256:633ad6b70e390ea335de8689652a5d6c21a323b79ed19519c2f392451088487f"},
|
||||
{file = "sentence_transformers-2.6.1-py3-none-any.whl", hash = "sha256:a887e17696b513f99a709ce1f37fd547f53857aebe863785ede546c303b09ea0"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
huggingface-hub = ">=0.15.1"
|
||||
numpy = "*"
|
||||
Pillow = "*"
|
||||
scikit-learn = "*"
|
||||
scipy = "*"
|
||||
torch = ">=1.11.0"
|
||||
tqdm = "*"
|
||||
transformers = ">=4.32.0,<5.0.0"
|
||||
|
||||
[[package]]
|
||||
name = "setuptools"
|
||||
version = "69.0.2"
|
||||
@@ -4906,6 +5261,17 @@ files = [
|
||||
[package.extras]
|
||||
doc = ["reno", "sphinx", "tornado (>=4.5)"]
|
||||
|
||||
[[package]]
|
||||
name = "threadpoolctl"
|
||||
version = "3.4.0"
|
||||
description = "threadpoolctl"
|
||||
optional = true
|
||||
python-versions = ">=3.8"
|
||||
files = [
|
||||
{file = "threadpoolctl-3.4.0-py3-none-any.whl", hash = "sha256:8f4c689a65b23e5ed825c8436a92b818aac005e0f3715f6a1664d7c7ee29d262"},
|
||||
{file = "threadpoolctl-3.4.0.tar.gz", hash = "sha256:f11b491a03661d6dd7ef692dd422ab34185d982466c49c8f98c8f716b5c93196"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tiktoken"
|
||||
version = "0.5.2"
|
||||
@@ -5085,17 +5451,6 @@ dev = ["tokenizers[testing]"]
|
||||
docs = ["setuptools_rust", "sphinx", "sphinx_rtd_theme"]
|
||||
testing = ["black (==22.3)", "datasets", "numpy", "pytest", "requests"]
|
||||
|
||||
[[package]]
|
||||
name = "tomli"
|
||||
version = "2.0.1"
|
||||
description = "A lil' TOML parser"
|
||||
optional = false
|
||||
python-versions = ">=3.7"
|
||||
files = [
|
||||
{file = "tomli-2.0.1-py3-none-any.whl", hash = "sha256:939de3e7a6161af0c887ef91b7d41a53e7c5a1ca976325f429cb46ea9bc30ecc"},
|
||||
{file = "tomli-2.0.1.tar.gz", hash = "sha256:de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f"},
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tomlkit"
|
||||
version = "0.12.0"
|
||||
@@ -5193,13 +5548,13 @@ telegram = ["requests"]
|
||||
|
||||
[[package]]
|
||||
name = "transformers"
|
||||
version = "4.38.1"
|
||||
version = "4.38.2"
|
||||
description = "State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow"
|
||||
optional = false
|
||||
python-versions = ">=3.8.0"
|
||||
files = [
|
||||
{file = "transformers-4.38.1-py3-none-any.whl", hash = "sha256:a7a9265fb060183e9d975cbbadc4d531b10281589c43f6d07563f86322728973"},
|
||||
{file = "transformers-4.38.1.tar.gz", hash = "sha256:86dc84ccbe36123647e84cbd50fc31618c109a41e6be92514b064ab55bf1304c"},
|
||||
{file = "transformers-4.38.2-py3-none-any.whl", hash = "sha256:c4029cb9f01b3dd335e52f364c52d2b37c65b4c78e02e6a08b1919c5c928573e"},
|
||||
{file = "transformers-4.38.2.tar.gz", hash = "sha256:c5fc7ad682b8a50a48b2a4c05d4ea2de5567adb1bdd00053619dbe5960857dd5"},
|
||||
]
|
||||
|
||||
[package.dependencies]
|
||||
@@ -5464,7 +5819,6 @@ h11 = ">=0.8"
|
||||
httptools = {version = ">=0.5.0", optional = true, markers = "extra == \"standard\""}
|
||||
python-dotenv = {version = ">=0.13", optional = true, markers = "extra == \"standard\""}
|
||||
pyyaml = {version = ">=5.1", optional = true, markers = "extra == \"standard\""}
|
||||
typing-extensions = {version = ">=4.0", markers = "python_version < \"3.11\""}
|
||||
uvloop = {version = ">=0.14.0,<0.15.0 || >0.15.0,<0.15.1 || >0.15.1", optional = true, markers = "(sys_platform != \"win32\" and sys_platform != \"cygwin\") and platform_python_implementation != \"PyPy\" and extra == \"standard\""}
|
||||
watchfiles = {version = ">=0.13", optional = true, markers = "extra == \"standard\""}
|
||||
websockets = {version = ">=10.4", optional = true, markers = "extra == \"standard\""}
|
||||
@@ -5957,15 +6311,19 @@ docs = ["furo", "jaraco.packaging (>=9.3)", "jaraco.tidelift (>=1.4)", "rst.link
|
||||
testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=2.2)", "pytest-ignore-flaky", "pytest-mypy (>=0.9.1)", "pytest-ruff"]

[extras]
embeddings-azopenai = ["llama-index-embeddings-azure-openai"]
embeddings-huggingface = ["llama-index-embeddings-huggingface"]
embeddings-ollama = ["llama-index-embeddings-ollama"]
embeddings-openai = ["llama-index-embeddings-openai"]
embeddings-sagemaker = ["boto3"]
llms-azopenai = ["llama-index-llms-azure-openai"]
llms-llama-cpp = ["llama-index-llms-llama-cpp"]
llms-nvidia-tensorrt = ["llama-index-llms-nvidia-tensorrt"]
llms-ollama = ["llama-index-llms-ollama"]
llms-openai = ["llama-index-llms-openai"]
llms-openai-like = ["llama-index-llms-openai-like"]
llms-sagemaker = ["boto3"]
rerank-sentence-transformers = ["sentence-transformers", "torch"]
storage-nodestore-postgres = ["asyncpg", "llama-index-storage-docstore-postgres", "llama-index-storage-index-store-postgres", "psycopg2-binary"]
ui = ["gradio"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
@@ -5973,5 +6331,5 @@ vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]

[metadata]
lock-version = "2.0"
python-versions = ">=3.10,<3.12"
content-hash = "39f0ac666402807cde29f763c14dfb6b2fc9862c0cd31de398c67a1fedbb4b12"
python-versions = ">=3.11,<3.12"
content-hash = "0b3665bd11a604609249ff0267e4e5cf009881d16a84f9774fc54d45a1373e09"
@@ -1,4 +1,5 @@
"""private-gpt."""

import logging
import os

@@ -21,3 +22,6 @@ os.environ["GRADIO_ANALYTICS_ENABLED"] = "False"
# Disable chromaDB telemetry
# It is already disabled, see PR#1144
# os.environ["ANONYMIZED_TELEMETRY"] = "False"

# adding tiktoken cache path within repo to be able to run in offline environment.
os.environ["TIKTOKEN_CACHE_DIR"] = "tiktoken_cache"
@@ -57,6 +57,39 @@ class EmbeddingComponent:

openai_settings = settings.openai.api_key
self.embedding_model = OpenAIEmbedding(api_key=openai_settings)
case "ollama":
try:
from llama_index.embeddings.ollama import ( # type: ignore
OllamaEmbedding,
)
except ImportError as e:
raise ImportError(
"Local dependencies not found, install with `poetry install --extras embeddings-ollama`"
) from e

ollama_settings = settings.ollama
self.embedding_model = OllamaEmbedding(
model_name=ollama_settings.embedding_model,
base_url=ollama_settings.embedding_api_base,
)
case "azopenai":
try:
from llama_index.embeddings.azure_openai import ( # type: ignore
AzureOpenAIEmbedding,
)
except ImportError as e:
raise ImportError(
"Azure OpenAI dependencies not found, install with `poetry install --extras embeddings-azopenai`"
) from e

azopenai_settings = settings.azopenai
self.embedding_model = AzureOpenAIEmbedding(
model=azopenai_settings.embedding_model,
deployment_name=azopenai_settings.embedding_deployment_name,
api_key=azopenai_settings.api_key,
azure_endpoint=azopenai_settings.azure_endpoint,
api_version=azopenai_settings.api_version,
)
case "mock":
|
||||
# Not a random number, is the dimensionality used by
|
||||
# the default embedding model
|
||||
|
@@ -6,6 +6,7 @@ import multiprocessing.pool
import os
import threading
from pathlib import Path
from queue import Queue
from typing import Any

from llama_index.core.data_structs import IndexDict
@@ -13,12 +14,13 @@ from llama_index.core.embeddings.utils import EmbedType
from llama_index.core.indices import VectorStoreIndex, load_index_from_storage
from llama_index.core.indices.base import BaseIndex
from llama_index.core.ingestion import run_transformations
from llama_index.core.schema import Document, TransformComponent
from llama_index.core.schema import BaseNode, Document, TransformComponent
from llama_index.core.storage import StorageContext

from private_gpt.components.ingest.ingest_helper import IngestionHelper
from private_gpt.paths import local_data_path
from private_gpt.settings.settings import Settings
from private_gpt.utils.eta import eta

logger = logging.getLogger(__name__)

@@ -314,6 +316,170 @@ class ParallelizedIngestComponent(BaseIngestComponentWithIndex):
self._file_to_documents_work_pool.terminate()


class PipelineIngestComponent(BaseIngestComponentWithIndex):
"""Pipeline ingestion - keeping the embedding worker pool as busy as possible.

This class implements a threaded ingestion pipeline, which comprises two threads
and two queues. The primary thread is responsible for reading and parsing files
into documents. These documents are then placed into a queue, which is
distributed to a pool of worker processes for embedding computation. After
embedding, the documents are transferred to another queue where they are
accumulated until a threshold is reached. Upon reaching this threshold, the
accumulated documents are flushed to the document store, index, and vector
store.

Exception handling ensures robustness against erroneous files. However, in the
pipelined design, one error can lead to the discarding of multiple files. Any
discarded files will be reported.
"""

NODE_FLUSH_COUNT = 5000 # Save the index every # nodes.

def __init__(
self,
storage_context: StorageContext,
embed_model: EmbedType,
transformations: list[TransformComponent],
count_workers: int,
*args: Any,
**kwargs: Any,
) -> None:
super().__init__(storage_context, embed_model, transformations, *args, **kwargs)
self.count_workers = count_workers
assert (
len(self.transformations) >= 2
), "Embeddings must be in the transformations"
assert count_workers > 0, "count_workers must be > 0"
self.count_workers = count_workers
# We are doing our own multiprocessing.
# To avoid colliding with HuggingFace's own multiprocessing, we disable it here.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
|
||||
|
||||
# doc_q stores parsed files as Document chunks.
|
||||
# Using a shallow queue causes the filesystem parser to block
|
||||
# when it reaches capacity. This ensures it doesn't outpace the
|
||||
# computationally intensive embeddings phase, avoiding unnecessary
|
||||
# memory consumption. The semaphore is used to bound the async worker
|
||||
# embedding computations to cause the doc Q to fill and block.
|
||||
self.doc_semaphore = multiprocessing.Semaphore(
|
||||
self.count_workers
|
||||
) # limit the doc queue to # items.
|
||||
self.doc_q: Queue[tuple[str, str | None, list[Document] | None]] = Queue(20)
|
||||
# node_q stores documents parsed into nodes (embeddings).
|
||||
# Larger queue size so we don't block the embedding workers during a slow
|
||||
# index update.
|
||||
self.node_q: Queue[
|
||||
tuple[str, str | None, list[Document] | None, list[BaseNode] | None]
|
||||
] = Queue(40)
|
||||
threading.Thread(target=self._doc_to_node, daemon=True).start()
|
||||
threading.Thread(target=self._write_nodes, daemon=True).start()
|
||||
|
||||
def _doc_to_node(self) -> None:
|
||||
# Parse documents into nodes
|
||||
with multiprocessing.pool.ThreadPool(processes=self.count_workers) as pool:
|
||||
while True:
|
||||
try:
|
||||
cmd, file_name, documents = self.doc_q.get(
|
||||
block=True
|
||||
) # Documents for a file
|
||||
if cmd == "process":
|
||||
# Push CPU/GPU embedding work to the worker pool
|
||||
# Acquire semaphore to control access to worker pool
|
||||
self.doc_semaphore.acquire()
|
||||
pool.apply_async(
|
||||
self._doc_to_node_worker, (file_name, documents)
|
||||
)
|
||||
elif cmd == "quit":
|
||||
break
|
||||
finally:
|
||||
if cmd != "process":
|
||||
self.doc_q.task_done() # unblock Q joins
|
||||
|
||||
def _doc_to_node_worker(self, file_name: str, documents: list[Document]) -> None:
|
||||
# CPU/GPU intensive work in its own process
|
||||
try:
|
||||
nodes = run_transformations(
|
||||
documents, # type: ignore[arg-type]
|
||||
self.transformations,
|
||||
show_progress=self.show_progress,
|
||||
)
|
||||
self.node_q.put(("process", file_name, documents, nodes))
|
||||
finally:
|
||||
self.doc_semaphore.release()
|
||||
self.doc_q.task_done() # unblock Q joins
|
||||
|
||||
def _save_docs(
|
||||
self, files: list[str], documents: list[Document], nodes: list[BaseNode]
|
||||
) -> None:
|
||||
try:
|
||||
logger.info(
|
||||
f"Saving {len(files)} files ({len(documents)} documents / {len(nodes)} nodes)"
|
||||
)
|
||||
self._index.insert_nodes(nodes)
|
||||
for document in documents:
|
||||
self._index.docstore.set_document_hash(
|
||||
document.get_doc_id(), document.hash
|
||||
)
|
||||
self._save_index()
|
||||
except Exception:
|
||||
# Tell the user so they can investigate these files
|
||||
logger.exception(f"Processing files {files}")
|
||||
finally:
|
||||
# Clearing work, even on exception, maintains a clean state.
|
||||
nodes.clear()
|
||||
documents.clear()
|
||||
files.clear()
|
||||
|
||||
def _write_nodes(self) -> None:
|
||||
# Save nodes to index. I/O intensive.
|
||||
node_stack: list[BaseNode] = []
|
||||
doc_stack: list[Document] = []
|
||||
file_stack: list[str] = []
|
||||
while True:
|
||||
try:
|
||||
cmd, file_name, documents, nodes = self.node_q.get(block=True)
|
||||
if cmd in ("flush", "quit"):
|
||||
if file_stack:
|
||||
self._save_docs(file_stack, doc_stack, node_stack)
|
||||
if cmd == "quit":
|
||||
break
|
||||
elif cmd == "process":
|
||||
node_stack.extend(nodes) # type: ignore[arg-type]
|
||||
doc_stack.extend(documents) # type: ignore[arg-type]
|
||||
file_stack.append(file_name) # type: ignore[arg-type]
|
||||
# Constant saving is heavy on I/O - accumulate to a threshold
|
||||
if len(node_stack) >= self.NODE_FLUSH_COUNT:
|
||||
self._save_docs(file_stack, doc_stack, node_stack)
|
||||
finally:
|
||||
self.node_q.task_done()
|
||||
|
||||
def _flush(self) -> None:
|
||||
self.doc_q.put(("flush", None, None))
|
||||
self.doc_q.join()
|
||||
self.node_q.put(("flush", None, None, None))
|
||||
self.node_q.join()
|
||||
|
||||
def ingest(self, file_name: str, file_data: Path) -> list[Document]:
|
||||
documents = IngestionHelper.transform_file_into_documents(file_name, file_data)
|
||||
self.doc_q.put(("process", file_name, documents))
|
||||
self._flush()
|
||||
return documents
|
||||
|
||||
def bulk_ingest(self, files: list[tuple[str, Path]]) -> list[Document]:
|
||||
docs = []
|
||||
for file_name, file_data in eta(files):
|
||||
try:
|
||||
documents = IngestionHelper.transform_file_into_documents(
|
||||
file_name, file_data
|
||||
)
|
||||
self.doc_q.put(("process", file_name, documents))
|
||||
docs.extend(documents)
|
||||
except Exception:
|
||||
logger.exception(f"Skipping {file_data.name}")
|
||||
self._flush()
|
||||
return docs
|
||||
|
||||
|
||||
def get_ingestion_component(
|
||||
storage_context: StorageContext,
|
||||
embed_model: EmbedType,
|
||||
@@ -336,6 +502,13 @@ def get_ingestion_component(
|
||||
transformations=transformations,
|
||||
count_workers=settings.embedding.count_workers,
|
||||
)
|
||||
elif ingest_mode == "pipeline":
|
||||
return PipelineIngestComponent(
|
||||
storage_context=storage_context,
|
||||
embed_model=embed_model,
|
||||
transformations=transformations,
|
||||
count_workers=settings.embedding.count_workers,
|
||||
)
|
||||
else:
|
||||
return SimpleIngestComponent(
|
||||
storage_context=storage_context,
|
||||
|
@@ -1,4 +1,6 @@
import logging
from collections.abc import Callable
from typing import Any

from injector import inject, singleton
from llama_index.core.llms import LLM, MockLLM
@@ -39,16 +41,23 @@ class LLMComponent:
) from e

prompt_style = get_prompt_style(settings.llamacpp.prompt_style)

settings_kwargs = {
"tfs_z": settings.llamacpp.tfs_z, # ollama and llama-cpp
"top_k": settings.llamacpp.top_k, # ollama and llama-cpp
"top_p": settings.llamacpp.top_p, # ollama and llama-cpp
"repeat_penalty": settings.llamacpp.repeat_penalty, # ollama llama-cpp
"n_gpu_layers": -1,
"offload_kqv": True,
}
self.llm = LlamaCPP(
model_path=str(models_path / settings.llamacpp.llm_hf_model_file),
temperature=0.1,
temperature=settings.llm.temperature,
max_new_tokens=settings.llm.max_new_tokens,
context_window=settings.llm.context_window,
generate_kwargs={},
callback_manager=LlamaIndexSettings.callback_manager,
# All to GPU
model_kwargs={"n_gpu_layers": -1, "offload_kqv": True},
model_kwargs=settings_kwargs,
# transform inputs into Llama2 format
messages_to_prompt=prompt_style.messages_to_prompt,
completion_to_prompt=prompt_style.completion_to_prompt,
@@ -108,25 +117,59 @@ class LLMComponent:
) from e

ollama_settings = settings.ollama

settings_kwargs = {
"tfs_z": ollama_settings.tfs_z, # ollama and llama-cpp
"num_predict": ollama_settings.num_predict, # ollama only
"top_k": ollama_settings.top_k, # ollama and llama-cpp
"top_p": ollama_settings.top_p, # ollama and llama-cpp
"repeat_last_n": ollama_settings.repeat_last_n, # ollama
"repeat_penalty": ollama_settings.repeat_penalty, # ollama llama-cpp
}

self.llm = Ollama(
model=ollama_settings.model, base_url=ollama_settings.api_base
model=ollama_settings.llm_model,
base_url=ollama_settings.api_base,
temperature=settings.llm.temperature,
context_window=settings.llm.context_window,
additional_kwargs=settings_kwargs,
request_timeout=ollama_settings.request_timeout,
)
case "tensorrt":

if (
ollama_settings.keep_alive
!= ollama_settings.model_fields["keep_alive"].default
):
# Modify Ollama methods to use the "keep_alive" field.
def add_keep_alive(func: Callable[..., Any]) -> Callable[..., Any]:
def wrapper(*args: Any, **kwargs: Any) -> Any:
kwargs["keep_alive"] = ollama_settings.keep_alive
return func(*args, **kwargs)

return wrapper

Ollama.chat = add_keep_alive(Ollama.chat)
Ollama.stream_chat = add_keep_alive(Ollama.stream_chat)
Ollama.complete = add_keep_alive(Ollama.complete)
Ollama.stream_complete = add_keep_alive(Ollama.stream_complete)

case "azopenai":
try:
from llama_index.llms.nvidia_tensorrt import ( # type: ignore
LocalTensorRTLLM,
from llama_index.llms.azure_openai import ( # type: ignore
AzureOpenAI,
)
except ImportError as e:
raise ImportError(
"Nvidia TensorRTLLM dependencies not found, install with `poetry install --extras llms-nvidia-tensorrt`"
"Azure OpenAI dependencies not found, install with `poetry install --extras llms-azopenai`"
) from e

prompt_style = get_prompt_style(settings.tensorrt.prompt_style)
self.llm = LocalTensorRTLLM(
model_path=settings.tensorrt.model_path,
engine_name=settings.tensorrt.engine_name,
tokenizer_dir=settings.llm.tokenizer,
completion_to_prompt=prompt_style.completion_to_prompt,
azopenai_settings = settings.azopenai
self.llm = AzureOpenAI(
model=azopenai_settings.llm_model,
deployment_name=azopenai_settings.llm_deployment_name,
api_key=azopenai_settings.api_key,
azure_endpoint=azopenai_settings.azure_endpoint,
api_version=azopenai_settings.api_version,
)
case "mock":
self.llm = MockLLM()
@@ -6,6 +6,7 @@ from llama_index.core.storage.index_store import SimpleIndexStore
from llama_index.core.storage.index_store.types import BaseIndexStore

from private_gpt.paths import local_data_path
from private_gpt.settings.settings import Settings

logger = logging.getLogger(__name__)

@@ -16,19 +17,51 @@ class NodeStoreComponent:
doc_store: BaseDocumentStore

@inject
def __init__(self) -> None:
try:
self.index_store = SimpleIndexStore.from_persist_dir(
persist_dir=str(local_data_path)
)
except FileNotFoundError:
logger.debug("Local index store not found, creating a new one")
self.index_store = SimpleIndexStore()
def __init__(self, settings: Settings) -> None:
match settings.nodestore.database:
case "simple":
try:
self.index_store = SimpleIndexStore.from_persist_dir(
persist_dir=str(local_data_path)
)
except FileNotFoundError:
logger.debug("Local index store not found, creating a new one")
self.index_store = SimpleIndexStore()

try:
self.doc_store = SimpleDocumentStore.from_persist_dir(
persist_dir=str(local_data_path)
)
except FileNotFoundError:
logger.debug("Local document store not found, creating a new one")
self.doc_store = SimpleDocumentStore()
try:
self.doc_store = SimpleDocumentStore.from_persist_dir(
persist_dir=str(local_data_path)
)
except FileNotFoundError:
logger.debug("Local document store not found, creating a new one")
self.doc_store = SimpleDocumentStore()

case "postgres":
try:
from llama_index.core.storage.docstore.postgres_docstore import (
PostgresDocumentStore,
)
from llama_index.core.storage.index_store.postgres_index_store import (
PostgresIndexStore,
)
except ImportError:
raise ImportError(
"Postgres dependencies not found, install with `poetry install --extras storage-nodestore-postgres`"
) from None

if settings.postgres is None:
raise ValueError("Postgres index/doc store settings not found.")

self.index_store = PostgresIndexStore.from_params(
**settings.postgres.model_dump(exclude_none=True)
)
self.doc_store = PostgresDocumentStore.from_params(
**settings.postgres.model_dump(exclude_none=True)
)

case _:
# Should be unreachable
# The settings validator should have caught this
raise ValueError(
f"Database {settings.nodestore.database} not supported"
)
@@ -3,7 +3,12 @@ import typing

from injector import inject, singleton
from llama_index.core.indices.vector_store import VectorIndexRetriever, VectorStoreIndex
from llama_index.core.vector_stores.types import VectorStore
from llama_index.core.vector_stores.types import (
FilterCondition,
MetadataFilter,
MetadataFilters,
VectorStore,
)

from private_gpt.open_ai.extensions.context_filter import ContextFilter
from private_gpt.paths import local_data_path
@@ -12,33 +17,28 @@ from private_gpt.settings.settings import Settings
logger = logging.getLogger(__name__)


@typing.no_type_check
def _chromadb_doc_id_metadata_filter(
def _doc_id_metadata_filter(
context_filter: ContextFilter | None,
) -> dict | None:
if context_filter is None or context_filter.docs_ids is None:
return {} # No filter
elif len(context_filter.docs_ids) < 1:
return {"doc_id": "-"} # Effectively filtering out all docs
else:
doc_filter_items = []
if len(context_filter.docs_ids) > 1:
doc_filter = {"$or": doc_filter_items}
for doc_id in context_filter.docs_ids:
doc_filter_items.append({"doc_id": doc_id})
else:
doc_filter = {"doc_id": context_filter.docs_ids[0]}
return doc_filter
) -> MetadataFilters:
filters = MetadataFilters(filters=[], condition=FilterCondition.OR)

if context_filter is not None and context_filter.docs_ids is not None:
for doc_id in context_filter.docs_ids:
filters.filters.append(MetadataFilter(key="doc_id", value=doc_id))

return filters


@singleton
class VectorStoreComponent:
settings: Settings
vector_store: VectorStore

@inject
def __init__(self, settings: Settings) -> None:
self.settings = settings
match settings.vectorstore.database:
case "pgvector":
case "postgres":
try:
from llama_index.vector_stores.postgres import ( # type: ignore
PGVectorStore,
@@ -48,15 +48,17 @@ class VectorStoreComponent:
"Postgres dependencies not found, install with `poetry install --extras vector-stores-postgres`"
) from e

if settings.pgvector is None:
if settings.postgres is None:
raise ValueError(
"PGVectorStore settings not found. Please provide settings."
"Postgres settings not found. Please provide settings."
)

self.vector_store = typing.cast(
VectorStore,
PGVectorStore.from_params(
**settings.pgvector.model_dump(exclude_none=True)
**settings.postgres.model_dump(exclude_none=True),
table_name="embeddings",
embed_dim=settings.embedding.embed_dim,
),
)

@@ -96,7 +98,7 @@ class VectorStoreComponent:
from llama_index.vector_stores.qdrant import ( # type: ignore
QdrantVectorStore,
)
from qdrant_client import QdrantClient
from qdrant_client import QdrantClient # type: ignore
except ImportError as e:
raise ImportError(
"Qdrant dependencies not found, install with `poetry install --extras vector-stores-qdrant`"
@@ -126,20 +128,22 @@ class VectorStoreComponent:
f"Vectorstore database {settings.vectorstore.database} not supported"
)

@staticmethod
def get_retriever(
self,
index: VectorStoreIndex,
context_filter: ContextFilter | None = None,
similarity_top_k: int = 2,
) -> VectorIndexRetriever:
# This way we support qdrant (using doc_ids) and chroma (using where clause)
# This way we support qdrant (using doc_ids) and the rest (using filters)
return VectorIndexRetriever(
index=index,
similarity_top_k=similarity_top_k,
doc_ids=context_filter.docs_ids if context_filter else None,
vector_store_kwargs={
"where": _chromadb_doc_id_metadata_filter(context_filter)
},
filters=(
_doc_id_metadata_filter(context_filter)
if self.settings.vectorstore.database != "qdrant"
else None
),
)

def close(self) -> None:
@@ -1,4 +1,5 @@
|
||||
"""FastAPI app creation, logger configuration and main API routes."""
|
||||
|
||||
import logging
|
||||
|
||||
from fastapi import Depends, FastAPI, Request
|
||||
|
@@ -8,6 +8,10 @@ from llama_index.core.chat_engine.types import (
|
||||
from llama_index.core.indices import VectorStoreIndex
|
||||
from llama_index.core.indices.postprocessor import MetadataReplacementPostProcessor
|
||||
from llama_index.core.llms import ChatMessage, MessageRole
|
||||
from llama_index.core.postprocessor import (
|
||||
SentenceTransformerRerank,
|
||||
SimilarityPostprocessor,
|
||||
)
|
||||
from llama_index.core.storage import StorageContext
|
||||
from llama_index.core.types import TokenGen
|
||||
from pydantic import BaseModel
|
||||
@@ -20,6 +24,7 @@ from private_gpt.components.vector_store.vector_store_component import (
|
||||
)
|
||||
from private_gpt.open_ai.extensions.context_filter import ContextFilter
|
||||
from private_gpt.server.chunks.chunks_service import Chunk
|
||||
from private_gpt.settings.settings import Settings
|
||||
|
||||
|
||||
class Completion(BaseModel):
|
||||
@@ -68,14 +73,18 @@ class ChatEngineInput:
|
||||
|
||||
@singleton
|
||||
class ChatService:
|
||||
settings: Settings
|
||||
|
||||
@inject
|
||||
def __init__(
|
||||
self,
|
||||
settings: Settings,
|
||||
llm_component: LLMComponent,
|
||||
vector_store_component: VectorStoreComponent,
|
||||
embedding_component: EmbeddingComponent,
|
||||
node_store_component: NodeStoreComponent,
|
||||
) -> None:
|
||||
self.settings = settings
|
||||
self.llm_component = llm_component
|
||||
self.embedding_component = embedding_component
|
||||
self.vector_store_component = vector_store_component
|
||||
@@ -98,25 +107,31 @@ class ChatService:
|
||||
use_context: bool = False,
|
||||
context_filter: ContextFilter | None = None,
|
||||
) -> BaseChatEngine:
|
||||
settings = self.settings
|
||||
if use_context:
|
||||
vector_index_retriever = self.vector_store_component.get_retriever(
|
||||
index=self.index, context_filter=context_filter
|
||||
index=self.index,
|
||||
context_filter=context_filter,
|
||||
similarity_top_k=self.settings.rag.similarity_top_k,
|
||||
)
|
||||
# TODO ContextChatEngine is still not migrated by LlamaIndex to accept
|
||||
# llm directly, so we are passing legacy ServiceContext until it is fixed.
|
||||
from llama_index.core import ServiceContext
|
||||
node_postprocessors = [
|
||||
MetadataReplacementPostProcessor(target_metadata_key="window"),
|
||||
SimilarityPostprocessor(
|
||||
similarity_cutoff=settings.rag.similarity_value
|
||||
),
|
||||
]
|
||||
|
||||
if settings.rag.rerank.enabled:
|
||||
rerank_postprocessor = SentenceTransformerRerank(
|
||||
model=settings.rag.rerank.model, top_n=settings.rag.rerank.top_n
|
||||
)
|
||||
node_postprocessors.append(rerank_postprocessor)
|
||||
|
||||
return ContextChatEngine.from_defaults(
|
||||
system_prompt=system_prompt,
|
||||
retriever=vector_index_retriever,
|
||||
llm=self.llm_component.llm, # Takes no effect at the moment
|
||||
service_context=ServiceContext.from_defaults(
|
||||
llm=self.llm_component.llm,
|
||||
embed_model=self.embedding_component.embedding_model,
|
||||
),
|
||||
node_postprocessors=[
|
||||
MetadataReplacementPostProcessor(target_metadata_key="window"),
|
||||
],
|
||||
node_postprocessors=node_postprocessors,
|
||||
)
|
||||
else:
|
||||
return SimpleChatEngine.from_defaults(
|
||||
|
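For intuition, the two post-processing steps added above compose as "filter, then rerank": nodes below `similarity_value` are dropped, and if reranking is enabled the survivors are re-scored and trimmed to `top_n`. A simplified standalone sketch of that composition (plain Python, not the LlamaIndex API):

```python
# Simplified illustration of the postprocessor chain: similarity cutoff first,
# then an optional rerank step that keeps only the top_n best nodes.
def similarity_filter(nodes: list[tuple[str, float]], cutoff: float) -> list[tuple[str, float]]:
    return [node for node in nodes if node[1] >= cutoff]


def rerank(nodes: list[tuple[str, float]], top_n: int) -> list[tuple[str, float]]:
    # A real reranker re-scores pairs with a cross-encoder; sorting by the
    # existing score is enough here to show the trimming behaviour.
    return sorted(nodes, key=lambda node: node[1], reverse=True)[:top_n]


retrieved = [("chunk A", 0.82), ("chunk B", 0.31), ("chunk C", 0.67)]
print(rerank(similarity_filter(retrieved, cutoff=0.45), top_n=2))
# -> [('chunk A', 0.82), ('chunk C', 0.67)]
```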
@@ -1,7 +1,7 @@
import logging
import tempfile
from pathlib import Path
from typing import AnyStr, BinaryIO
from typing import TYPE_CHECKING, AnyStr, BinaryIO

from injector import inject, singleton
from llama_index.core.node_parser import SentenceWindowNodeParser

@@ -17,6 +17,9 @@ from private_gpt.components.vector_store.vector_store_component import (
from private_gpt.server.ingest.model import IngestedDoc
from private_gpt.settings.settings import settings

if TYPE_CHECKING:
    from llama_index.core.storage.docstore.types import RefDocInfo

logger = logging.getLogger(__name__)


@@ -86,17 +89,15 @@ class IngestService:
        return [IngestedDoc.from_document(document) for document in documents]

    def list_ingested(self) -> list[IngestedDoc]:
        ingested_docs = []
        ingested_docs: list[IngestedDoc] = []
        try:
            docstore = self.storage_context.docstore
            ingested_docs_ids: set[str] = set()
            ref_docs: dict[str, RefDocInfo] | None = docstore.get_all_ref_doc_info()

            for node in docstore.docs.values():
                if node.ref_doc_id is not None:
                    ingested_docs_ids.add(node.ref_doc_id)
            if not ref_docs:
                return ingested_docs

            for doc_id in ingested_docs_ids:
                ref_doc_info = docstore.get_ref_doc_info(ref_doc_id=doc_id)
            for doc_id, ref_doc_info in ref_docs.items():
                doc_metadata = None
                if ref_doc_info is not None and ref_doc_info.metadata is not None:
                    doc_metadata = IngestedDoc.curate_metadata(ref_doc_info.metadata)
@@ -12,6 +12,7 @@ Authorization can be done by following fastapi's guides:
* https://fastapi.tiangolo.com/tutorial/security/
* https://fastapi.tiangolo.com/tutorial/dependencies/dependencies-in-path-operation-decorators/
"""

# mypy: ignore-errors
# Disabled mypy error: All conditional function variants must have identical signatures
# We are changing the implementation of the authenticated method, based on
@@ -82,7 +82,7 @@ class DataSettings(BaseModel):

class LLMSettings(BaseModel):
    mode: Literal[
        "llamacpp", "openai", "openailike", "sagemaker", "mock", "ollama", "tensorrt"
        "llamacpp", "openai", "openailike", "azopenai", "sagemaker", "mock", "ollama"
    ]
    max_new_tokens: int = Field(
        256,
@@ -100,10 +100,18 @@ class LLMSettings(BaseModel):
        "like `HuggingFaceH4/zephyr-7b-beta`. If not set, will load a tokenizer matching "
        "gpt-3.5-turbo LLM.",
    )
    temperature: float = Field(
        0.1,
        description="The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual.",
    )


class VectorstoreSettings(BaseModel):
    database: Literal["chroma", "qdrant", "pgvector"]
    database: Literal["chroma", "qdrant", "postgres"]


class NodeStoreSettings(BaseModel):
    database: Literal["simple", "postgres"]


class LlamaCPPSettings(BaseModel):
@@ -121,20 +129,21 @@ class LlamaCPPSettings(BaseModel):
        ),
    )

class TensorRTSettings(BaseModel):
    model_path: str
    engine_name: str
    prompt_style: Literal["default", "llama2", "tag", "mistral", "chatml"] = Field(
        "llama2",
        description=(
            "The prompt style to use for the chat engine. "
            "If `default` - use the default prompt style from the llama_index. It should look like `role: message`.\n"
            "If `llama2` - use the llama2 prompt style from the llama_index. Based on `<s>`, `[INST]` and `<<SYS>>`.\n"
            "If `tag` - use the `tag` prompt style. It should look like `<|role|>: message`.\n"
            "If `mistral` - use the `mistral` prompt style. It should look like <s>[INST] {System Prompt} [/INST]</s>[INST] { UserInstructions } [/INST]"
            "`llama2` is the historic behaviour. `default` might work better with your custom models."
        ),
    tfs_z: float = Field(
        1.0,
        description="Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.",
    )
    top_k: int = Field(
        40,
        description="Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)",
    )
    top_p: float = Field(
        0.9,
        description="Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)",
    )
    repeat_penalty: float = Field(
        1.1,
        description="Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)",
    )

@@ -145,14 +154,15 @@ class HuggingFaceSettings(BaseModel):


class EmbeddingSettings(BaseModel):
    mode: Literal["huggingface", "openai", "sagemaker", "mock"]
    ingest_mode: Literal["simple", "batch", "parallel"] = Field(
    mode: Literal["huggingface", "openai", "azopenai", "sagemaker", "ollama", "mock"]
    ingest_mode: Literal["simple", "batch", "parallel", "pipeline"] = Field(
        "simple",
        description=(
            "The ingest mode to use for the embedding engine:\n"
            "If `simple` - ingest files sequentially and one by one. It is the historic behaviour.\n"
            "If `batch` - if multiple files, parse all the files in parallel, "
            "and send them in batch to the embedding model.\n"
            "If `pipeline` - the embedding engine is kept as busy as possible.\n"
            "If `parallel` - parse the files in parallel using multiple cores, and embed them in parallel.\n"
            "`parallel` is the fastest mode for local setup, as it parallelizes IO RW in the index.\n"
            "For modes that leverage parallelization, you can specify the number of "
@@ -165,11 +175,16 @@ class EmbeddingSettings(BaseModel):
            "The number of workers to use for file ingestion.\n"
            "In `batch` mode, this is the number of workers used to parse the files.\n"
            "In `parallel` mode, this is the number of workers used to parse the files and embed them.\n"
            "In `pipeline` mode, this is the number of workers that can perform embeddings.\n"
            "This is only used if `ingest_mode` is not `simple`.\n"
            "Do not go too high with this number, as it might cause memory issues. (especially in `parallel` mode)\n"
            "Do not set it higher than the number of threads of your CPU."
        ),
    )
    embed_dim: int = Field(
        384,
        description="The dimension of the embeddings stored in the Postgres database",
    )


class SagemakerSettings(BaseModel):
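For intuition, the difference between `batch` and `pipeline` is that the latter keeps handing freshly parsed nodes to embedding workers instead of waiting for all parsing to finish. A minimal producer/consumer sketch of that idea follows; it is illustrative only, and `parse_file` / `embed_batch` are hypothetical stand-ins, not PrivateGPT functions:

```python
# Illustrative sketch of the "pipeline" idea: parse files on one side and keep
# a pool of embedding workers busy on the other. Not PrivateGPT's actual code.
import queue
import threading


def parse_file(path: str) -> list[str]:
    return [f"text of {path}"]  # pretend parsing: one node per file


def embed_batch(nodes: list[str]) -> list[list[float]]:
    return [[0.0] * 3 for _ in nodes]  # pretend embedding: one vector per node


def pipeline_ingest(paths: list[str], workers: int = 2) -> list[list[float]]:
    todo: queue.Queue[list[str] | None] = queue.Queue()
    results: list[list[float]] = []
    lock = threading.Lock()

    def embed_worker() -> None:
        while True:
            nodes = todo.get()
            if nodes is None:  # poison pill: no more work
                break
            vectors = embed_batch(nodes)
            with lock:
                results.extend(vectors)

    threads = [threading.Thread(target=embed_worker) for _ in range(workers)]
    for t in threads:
        t.start()
    # Producer: hand nodes to the embedding workers as soon as each file is
    # parsed, so the embedding side never sits idle.
    for path in paths:
        todo.put(parse_file(path))
    for _ in threads:
        todo.put(None)
    for t in threads:
        t.join()
    return results


print(len(pipeline_ingest(["a.txt", "b.txt", "c.txt"])))  # -> 3
```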
@@ -194,10 +209,69 @@ class OllamaSettings(BaseModel):
        "http://localhost:11434",
        description="Base URL of Ollama API. Example: 'https://localhost:11434'.",
    )
    model: str = Field(
    embedding_api_base: str = Field(
        api_base,  # default is same as api_base, unless specified differently
        description="Base URL of Ollama embedding API. Defaults to the same value as api_base",
    )
    llm_model: str = Field(
        None,
        description="Model to use. Example: 'llama2-uncensored'.",
    )
    embedding_model: str = Field(
        None,
        description="Model to use. Example: 'nomic-embed-text'.",
    )
    keep_alive: str = Field(
        "5m",
        description="Time the model will stay loaded in memory after a request. Examples: 5m, 5h, '-1'",
    )
    tfs_z: float = Field(
        1.0,
        description="Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.",
    )
    num_predict: int = Field(
        None,
        description="Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context)",
    )
    top_k: int = Field(
        40,
        description="Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)",
    )
    top_p: float = Field(
        0.9,
        description="Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)",
    )
    repeat_last_n: int = Field(
        64,
        description="Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)",
    )
    repeat_penalty: float = Field(
        1.1,
        description="Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)",
    )
    request_timeout: float = Field(
        120.0,
        description="Time elapsed until ollama times out the request. Default is 120s. Format is float.",
    )

class AzureOpenAISettings(BaseModel):
    api_key: str
    azure_endpoint: str
    api_version: str = Field(
        "2023_05_15",
        description="The API version to use for this operation. This follows the YYYY-MM-DD format.",
    )
    embedding_deployment_name: str
    embedding_model: str = Field(
        "text-embedding-ada-002",
        description="OpenAI Model to use. Example: 'text-embedding-ada-002'.",
    )
    llm_deployment_name: str
    llm_model: str = Field(
        "gpt-35-turbo",
        description="OpenAI Model to use. Example: 'gpt-4'.",
    )


class UISettings(BaseModel):
@@ -218,7 +292,34 @@ class UISettings(BaseModel):
    )


class PGVectorSettings(BaseModel):
class RerankSettings(BaseModel):
    enabled: bool = Field(
        False,
        description="This value controls whether a reranker should be included in the RAG pipeline.",
    )
    model: str = Field(
        "cross-encoder/ms-marco-MiniLM-L-2-v2",
        description="Rerank model to use. Limited to SentenceTransformer cross-encoder models.",
    )
    top_n: int = Field(
        2,
        description="This value controls the number of documents returned by the RAG pipeline.",
    )


class RagSettings(BaseModel):
    similarity_top_k: int = Field(
        2,
        description="This value controls the number of documents returned by the RAG pipeline or considered for reranking if enabled.",
    )
    similarity_value: float = Field(
        None,
        description="If set, any documents retrieved from the RAG must meet a certain match score. Acceptable values are between 0 and 1.",
    )
    rerank: RerankSettings


class PostgresSettings(BaseModel):
    host: str = Field(
        "localhost",
        description="The server hosting the Postgres database",
@@ -239,17 +340,9 @@ class PGVectorSettings(BaseModel):
        "postgres",
        description="The database to use to connect to the Postgres database",
    )
    embed_dim: int = Field(
        384,
        description="The dimension of the embeddings stored in the Postgres database",
    )
    schema_name: str = Field(
        "public",
        description="The name of the schema in the Postgres database where the embeddings are stored",
    )
    table_name: str = Field(
        "embeddings",
        description="The name of the table in the Postgres database where the embeddings are stored",
        description="The name of the schema in the Postgres database to use",
    )


@@ -314,14 +407,16 @@ class Settings(BaseModel):
    llm: LLMSettings
    embedding: EmbeddingSettings
    llamacpp: LlamaCPPSettings
    tensorrt: TensorRTSettings
    huggingface: HuggingFaceSettings
    sagemaker: SagemakerSettings
    openai: OpenAISettings
    ollama: OllamaSettings
    azopenai: AzureOpenAISettings
    vectorstore: VectorstoreSettings
    nodestore: NodeStoreSettings
    rag: RagSettings
    qdrant: QdrantSettings | None = None
    pgvector: PGVectorSettings | None = None
    postgres: PostgresSettings | None = None


"""
@@ -1,4 +1,5 @@
"""This file should be imported only and only if you want to run the UI locally."""
"""This file should be imported if and only if you want to run the UI locally."""

import itertools
import logging
import time
@@ -44,8 +45,8 @@ class Source(BaseModel):
        frozen = True

    @staticmethod
    def curate_sources(sources: list[Chunk]) -> set["Source"]:
        curated_sources = set()
    def curate_sources(sources: list[Chunk]) -> list["Source"]:
        curated_sources = []

        for chunk in sources:
            doc_metadata = chunk.document.doc_metadata
@@ -54,7 +55,10 @@ class Source(BaseModel):
            page_label = doc_metadata.get("page_label", "-") if doc_metadata else "-"

            source = Source(file=file_name, page=page_label, text=chunk.text)
            curated_sources.add(source)
            curated_sources.append(source)
        curated_sources = list(
            dict.fromkeys(curated_sources).keys()
        )  # Unique sources only

        return curated_sources
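The switch from a `set` to `dict.fromkeys` keeps sources unique while preserving the order in which they were retrieved, which a set does not guarantee. A quick illustration of that standard-library behaviour:

```python
# dict.fromkeys de-duplicates while keeping first-seen order.
sources = ["a.pdf", "b.pdf", "a.pdf", "c.pdf"]
unique_in_order = list(dict.fromkeys(sources))
print(unique_in_order)  # -> ['a.pdf', 'b.pdf', 'c.pdf']
```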
@@ -96,10 +100,15 @@ class PrivateGptUi:
            if completion_gen.sources:
                full_response += SOURCES_SEPARATOR
                cur_sources = Source.curate_sources(completion_gen.sources)
                sources_text = "\n\n\n".join(
                    f"{index}. {source.file} (page {source.page})"
                    for index, source in enumerate(cur_sources, start=1)
                )
                sources_text = "\n\n\n"
                used_files = set()
                for index, source in enumerate(cur_sources, start=1):
                    if f"{source.file}-{source.page}" not in used_files:
                        sources_text = (
                            sources_text
                            + f"{index}. {source.file} (page {source.page}) \n\n"
                        )
                        used_files.add(f"{source.file}-{source.page}")
                full_response += sources_text
            yield full_response

@@ -409,11 +418,54 @@ class PrivateGptUi:
                inputs=system_prompt_input,
            )

            def get_model_label() -> str | None:
                """Get model label from llm mode setting YAML.

                Raises:
                    ValueError: If an invalid 'llm_mode' is encountered.

                Returns:
                    str: The corresponding model label.
                """
                # Get model label from llm mode setting YAML
                # Labels: local, openai, openailike, sagemaker, mock, ollama
                config_settings = settings()
                if config_settings is None:
                    raise ValueError("Settings are not configured.")

                # Get llm_mode from settings
                llm_mode = config_settings.llm.mode

                # Mapping of 'llm_mode' to corresponding model labels
                model_mapping = {
                    "llamacpp": config_settings.llamacpp.llm_hf_model_file,
                    "openai": config_settings.openai.model,
                    "openailike": config_settings.openai.model,
                    "sagemaker": config_settings.sagemaker.llm_endpoint_name,
                    "mock": llm_mode,
                    "ollama": config_settings.ollama.llm_model,
                }

                if llm_mode not in model_mapping:
                    print(f"Invalid 'llm mode': {llm_mode}")
                    return None

                return model_mapping[llm_mode]

            with gr.Column(scale=7, elem_id="col"):
                # Determine the model label based on the value of PGPT_PROFILES
                model_label = get_model_label()
                if model_label is not None:
                    label_text = (
                        f"LLM: {settings().llm.mode} | Model: {model_label}"
                    )
                else:
                    label_text = f"LLM: {settings().llm.mode}"

                _ = gr.ChatInterface(
                    self._chat,
                    chatbot=gr.Chatbot(
                        label=f"LLM: {settings().llm.mode}",
                        label=label_text,
                        show_copy_button=True,
                        elem_id="chatbot",
                        render=False,
122  private_gpt/utils/eta.py  Normal file
@@ -0,0 +1,122 @@
import datetime
import logging
import math
import time
from collections import deque
from typing import Any

logger = logging.getLogger(__name__)


def human_time(*args: Any, **kwargs: Any) -> str:
    def timedelta_total_seconds(timedelta: datetime.timedelta) -> float:
        return (
            timedelta.microseconds
            + 0.0
            + (timedelta.seconds + timedelta.days * 24 * 3600) * 10**6
        ) / 10**6

    secs = float(timedelta_total_seconds(datetime.timedelta(*args, **kwargs)))
    # We want (ms) precision below 2 seconds
    if secs < 2:
        return f"{secs * 1000}ms"
    units = [("y", 86400 * 365), ("d", 86400), ("h", 3600), ("m", 60), ("s", 1)]
    parts = []
    for unit, mul in units:
        if secs / mul >= 1 or mul == 1:
            if mul > 1:
                n = int(math.floor(secs / mul))
                secs -= n * mul
            else:
                # >2s we drop the (ms) component.
                n = int(secs)
            if n:
                parts.append(f"{n}{unit}")
    return " ".join(parts)


def eta(iterator: list[Any]) -> Any:
    """Report an ETA after 30s and every 60s thereafter."""
    total = len(iterator)
    _eta = ETA(total)
    _eta.needReport(30)
    for processed, data in enumerate(iterator, start=1):
        yield data
        _eta.update(processed)
        if _eta.needReport(60):
            logger.info(f"{processed}/{total} - ETA {_eta.human_time()}")


class ETA:
    """Predict how long something will take to complete."""

    def __init__(self, total: int):
        self.total: int = total  # Total expected records.
        self.rate: float = 0.0  # per second
        self._timing_data: deque[tuple[float, int]] = deque(maxlen=100)
        self.secondsLeft: float = 0.0
        self.nexttime: float = 0.0

    def human_time(self) -> str:
        if self._calc():
            return f"{human_time(seconds=self.secondsLeft)} @ {int(self.rate * 60)}/min"
        return "(computing)"

    def update(self, count: int) -> None:
        # count should be in the range 0 to self.total
        assert count > 0
        assert count <= self.total
        self._timing_data.append((time.time(), count))  # (X,Y) for pearson

    def needReport(self, whenSecs: int) -> bool:
        now = time.time()
        if now > self.nexttime:
            self.nexttime = now + whenSecs
            return True
        return False

    def _calc(self) -> bool:
        # A sample before a prediction. Need two points to compute slope!
        if len(self._timing_data) < 3:
            return False

        # http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
        # Calculate means and standard deviations.
        samples = len(self._timing_data)
        # column wise sum of the timing tuples to compute their mean.
        mean_x, mean_y = (
            sum(i) / samples for i in zip(*self._timing_data, strict=False)
        )
        std_x = math.sqrt(
            sum(pow(i[0] - mean_x, 2) for i in self._timing_data) / (samples - 1)
        )
        std_y = math.sqrt(
            sum(pow(i[1] - mean_y, 2) for i in self._timing_data) / (samples - 1)
        )

        # Calculate coefficient.
        sum_xy, sum_sq_v_x, sum_sq_v_y = 0.0, 0.0, 0
        for x, y in self._timing_data:
            x -= mean_x
            y -= mean_y
            sum_xy += x * y
            sum_sq_v_x += pow(x, 2)
            sum_sq_v_y += pow(y, 2)
        pearson_r = sum_xy / math.sqrt(sum_sq_v_x * sum_sq_v_y)

        # Calculate regression line.
        # y = mx + b where m is the slope and b is the y-intercept.
        m = self.rate = pearson_r * (std_y / std_x)
        y = self.total
        b = mean_y - m * mean_x
        x = (y - b) / m

        # Calculate fitted line (transformed/shifted regression line horizontally).
        fitted_b = self._timing_data[-1][1] - (m * self._timing_data[-1][0])
        fitted_x = (y - fitted_b) / m
        _, count = self._timing_data[-1]  # adjust last data point progress count
        adjusted_x = ((fitted_x - x) * (count / self.total)) + x
        eta_epoch = adjusted_x

        self.secondsLeft = max([eta_epoch - time.time(), 0])
        return True
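For context, a minimal usage sketch of the new helpers; the file list and log cadence below are illustrative only:

```python
# Illustrative use of private_gpt.utils.eta: wrap any list to get periodic
# "N/total - ETA ..." log lines while iterating over it.
import logging

from private_gpt.utils.eta import eta, human_time

logging.basicConfig(level=logging.INFO)

files = [f"doc_{i}.pdf" for i in range(1_000)]  # hypothetical work items
for path in eta(files):
    pass  # ingest the file here; eta() logs progress roughly once a minute

print(human_time(seconds=3725))  # -> "1h 2m 5s"
```

Internally, `_calc` fits a least-squares line count = m·time + b over the recent samples, extrapolates it to the total count, and shifts the fit through the latest sample so the estimate tracks the current rate.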
@@ -1,34 +1,47 @@
[tool.poetry]
name = "private-gpt"
version = "0.2.0"
version = "0.4.0"
description = "Private GPT"
authors = ["Zylon <hi@zylon.ai>"]

[tool.poetry.dependencies]
python = ">=3.10,<3.12"
python = ">=3.11,<3.12"
# PrivateGPT
fastapi = { extras = ["all"], version = "^0.110.0" }
python-multipart = "^0.0.9"
injector = "^0.21.0"
pyyaml = "^6.0.1"
watchdog = "^4.0.0"
transformers = "^4.38.1"
transformers = "^4.38.2"
# LlamaIndex core libs
llama-index-core = "^0.10.13"
llama-index-core = "^0.10.14"
llama-index-readers-file = "^0.1.6"
# Optional LlamaIndex integration libs
llama-index-llms-llama-cpp = {version = "^0.1.3", optional = true}
llama-index-llms-openai = {version = "^0.1.6", optional = true}
llama-index-llms-openai-like = {version ="^0.1.3", optional = true}
llama-index-llms-ollama = {version ="^0.1.2", optional = true}
llama-index-llms-azure-openai = {version ="^0.1.5", optional = true}
llama-index-embeddings-ollama = {version ="^0.1.2", optional = true}
llama-index-embeddings-huggingface = {version ="^0.1.4", optional = true}
llama-index-embeddings-openai = {version ="^0.1.6", optional = true}
llama-index-embeddings-azure-openai = {version ="^0.1.6", optional = true}
llama-index-vector-stores-qdrant = {version ="^0.1.3", optional = true}
llama-index-vector-stores-chroma = {version ="^0.1.4", optional = true}
llama-index-vector-stores-postgres = {version ="^0.1.2", optional = true}
llama-index-llms-nvidia-tensorrt = {version ="^0.1.2", optional = true}
llama-index-storage-docstore-postgres = {version ="^0.1.2", optional = true}
llama-index-storage-index-store-postgres = {version ="^0.1.2", optional = true}
# Postgres
psycopg2-binary = {version ="^2.9.9", optional = true}
asyncpg = {version="^0.29.0", optional = true}

# Optional Sagemaker dependency
boto3 = {version ="^1.34.51", optional = true}

# Optional Reranker dependencies
torch = {version ="^2.1.2", optional = true}
sentence-transformers = {version ="^2.6.1", optional = true}

# Optional UI
gradio = {version ="^4.19.2", optional = true}

@@ -39,13 +52,17 @@ llms-openai = ["llama-index-llms-openai"]
llms-openai-like = ["llama-index-llms-openai-like"]
llms-ollama = ["llama-index-llms-ollama"]
llms-sagemaker = ["boto3"]
llms-nvidia-tensorrt = ["llama-index-llms-nvidia-tensorrt"]
llms-azopenai = ["llama-index-llms-azure-openai"]
embeddings-ollama = ["llama-index-embeddings-ollama"]
embeddings-huggingface = ["llama-index-embeddings-huggingface"]
embeddings-openai = ["llama-index-embeddings-openai"]
embeddings-sagemaker = ["boto3"]
embeddings-azopenai = ["llama-index-embeddings-azure-openai"]
vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
storage-nodestore-postgres = ["llama-index-storage-docstore-postgres","llama-index-storage-index-store-postgres","psycopg2-binary","asyncpg"]
rerank-sentence-transformers = ["torch", "sentence-transformers"]

[tool.poetry.group.dev.dependencies]
black = "^22"

@@ -10,7 +10,7 @@ from private_gpt.settings.settings import settings

resume_download = True
if __name__ == '__main__':
    parser = argparse.ArgumentParser(prog='Setup: Download models from huggingface')
    parser = argparse.ArgumentParser(prog='Setup: Download models from Hugging Face')
    parser.add_argument('--resume', default=True, action=argparse.BooleanOptionalAction, help='Enable/disable the resume_download option to restart an interrupted download')
    args = parser.parse_args()
    resume_download = args.resume
165  scripts/utils.py
@@ -1,10 +1,22 @@
import argparse
import os
import shutil
from typing import Any, ClassVar

from private_gpt.paths import local_data_path
from private_gpt.settings.settings import settings


def wipe():
    path = "local_data"
def wipe_file(file: str) -> None:
    if os.path.isfile(file):
        os.remove(file)
        print(f" - Deleted {file}")


def wipe_tree(path: str) -> None:
    if not os.path.exists(path):
        print(f"Warning: Path not found {path}")
        return
    print(f"Wiping {path}...")
    all_files = os.listdir(path)

@@ -24,14 +36,149 @@ def wipe():
            continue


if __name__ == "__main__":
    commands = {
        "wipe": wipe,
class Postgres:
    tables: ClassVar[dict[str, list[str]]] = {
        "nodestore": ["data_docstore", "data_indexstore"],
        "vectorstore": ["data_embeddings"],
    }

    parser = argparse.ArgumentParser()
    parser.add_argument(
        "mode", help="select a mode to run", choices=list(commands.keys())
    def __init__(self) -> None:
        try:
            import psycopg2
        except ModuleNotFoundError:
            raise ModuleNotFoundError("Postgres dependencies not found") from None

        connection = settings().postgres.model_dump(exclude_none=True)
        self.schema = connection.pop("schema_name")
        self.conn = psycopg2.connect(**connection)

    def wipe(self, storetype: str) -> None:
        cur = self.conn.cursor()
        try:
            for table in self.tables[storetype]:
                sql = f"DROP TABLE IF EXISTS {self.schema}.{table}"
                cur.execute(sql)
                print(f"Table {self.schema}.{table} dropped.")
            self.conn.commit()
        finally:
            cur.close()

    def stats(self, store_type: str) -> None:
        template = "SELECT '{table}', COUNT(*), pg_size_pretty(pg_total_relation_size('{table}')) FROM {table}"
        sql = " UNION ALL ".join(
            template.format(table=tbl) for tbl in self.tables[store_type]
        )

        cur = self.conn.cursor()
        try:
            print(f"Storage for Postgres {store_type}.")
            print("{:<15} | {:>15} | {:>9}".format("Table", "Rows", "Size"))
            print("-" * 45)  # Print a line separator

            cur.execute(sql)
            for row in cur.fetchall():
                formatted_row_count = f"{row[1]:,}"
                print(f"{row[0]:<15} | {formatted_row_count:>15} | {row[2]:>9}")

            print()
        finally:
            cur.close()

    def __del__(self):
        if hasattr(self, "conn") and self.conn:
            self.conn.close()


class Simple:
    def wipe(self, store_type: str) -> None:
        assert store_type == "nodestore"
        from llama_index.core.storage.docstore.types import (
            DEFAULT_PERSIST_FNAME as DOCSTORE,
        )
        from llama_index.core.storage.index_store.types import (
            DEFAULT_PERSIST_FNAME as INDEXSTORE,
        )

        for store in (DOCSTORE, INDEXSTORE):
            wipe_file(str((local_data_path / store).absolute()))


class Chroma:
    def wipe(self, store_type: str) -> None:
        assert store_type == "vectorstore"
        wipe_tree(str((local_data_path / "chroma_db").absolute()))


class Qdrant:
    COLLECTION = (
        "make_this_parameterizable_per_api_call"  # ?! see vector_store_component.py
    )

    def __init__(self) -> None:
        try:
            from qdrant_client import QdrantClient  # type: ignore
        except ImportError:
            raise ImportError("Qdrant dependencies not found") from None
        self.client = QdrantClient(**settings().qdrant.model_dump(exclude_none=True))

    def wipe(self, store_type: str) -> None:
        assert store_type == "vectorstore"
        try:
            self.client.delete_collection(self.COLLECTION)
            print("Collection dropped successfully.")
        except Exception as e:
            print("Error dropping collection:", e)

    def stats(self, store_type: str) -> None:
        print(f"Storage for Qdrant {store_type}.")
        try:
            collection_data = self.client.get_collection(self.COLLECTION)
            if collection_data:
                # Collection Info
                # https://qdrant.tech/documentation/concepts/collections/
                print(f"\tPoints: {collection_data.points_count:,}")
                print(f"\tVectors: {collection_data.vectors_count:,}")
                print(f"\tIndex Vectors: {collection_data.indexed_vectors_count:,}")
                return
        except ValueError:
            pass
        print("\t- Qdrant collection not found or empty")


class Command:
    DB_HANDLERS: ClassVar[dict[str, Any]] = {
        "simple": Simple,  # node store
        "chroma": Chroma,  # vector store
        "postgres": Postgres,  # node, index and vector store
        "qdrant": Qdrant,  # vector store
    }

    def for_each_store(self, cmd: str):
        for store_type in ("nodestore", "vectorstore"):
            database = getattr(settings(), store_type).database
            handler_class = self.DB_HANDLERS.get(database)
            if handler_class is None:
                print(f"No handler found for database '{database}'")
                continue
            handler_instance = handler_class()  # Instantiate the class
            # If the DB can handle this cmd dispatch it.
            if hasattr(handler_instance, cmd) and callable(
                func := getattr(handler_instance, cmd)
            ):
                func(store_type)
            else:
                print(
                    f"Unable to execute command '{cmd}' on '{store_type}' in database '{database}'"
                )

    def execute(self, cmd: str) -> None:
        if cmd in ("wipe", "stats"):
            self.for_each_store(cmd)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("mode", help="select a mode to run", choices=["wipe", "stats"])
    args = parser.parse_args()
    commands[args.mode.lower()]()
    Command().execute(args.mode.lower())
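With this rework, running the script with `wipe` or `stats` (e.g. `python scripts/utils.py stats`) is dispatched to whichever node store and vector store the active profile configures. A tiny standalone sketch of the same getattr-based dispatch pattern, with purely hypothetical names:

```python
# Minimal illustration of dispatching a command name to a handler method
# only if that handler actually implements it.
class Handler:
    def stats(self, store: str) -> None:
        print(f"stats for {store}")


handler = Handler()
cmd = "stats"
if callable(func := getattr(handler, cmd, None)):
    func("vectorstore")  # -> "stats for vectorstore"
else:
    print(f"{cmd!r} not supported by {type(handler).__name__}")
```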
17  settings-azopenai.yaml  Normal file
@@ -0,0 +1,17 @@
server:
  env_name: ${APP_ENV:azopenai}

llm:
  mode: azopenai

embedding:
  mode: azopenai

azopenai:
  api_key: ${AZ_OPENAI_API_KEY:}
  azure_endpoint: ${AZ_OPENAI_ENDPOINT:}
  embedding_deployment_name: ${AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME:}
  llm_deployment_name: ${AZ_OPENAI_LLM_DEPLOYMENT_NAME:}
  api_version: "2023-05-15"
  embedding_model: text-embedding-ada-002
  llm_model: gpt-35-turbo
@@ -19,6 +19,17 @@ sagemaker:
  llm_endpoint_name: ${PGPT_SAGEMAKER_LLM_ENDPOINT_NAME:}
  embedding_endpoint_name: ${PGPT_SAGEMAKER_EMBEDDING_ENDPOINT_NAME:}

ollama:
  llm_model: ${PGPT_OLLAMA_LLM_MODEL:mistral}
  embedding_model: ${PGPT_OLLAMA_EMBEDDING_MODEL:nomic-embed-text}
  api_base: ${PGPT_OLLAMA_API_BASE:http://ollama:11434}
  tfs_z: ${PGPT_OLLAMA_TFS_Z:1.0}
  top_k: ${PGPT_OLLAMA_TOP_K:40}
  top_p: ${PGPT_OLLAMA_TOP_P:0.9}
  repeat_last_n: ${PGPT_OLLAMA_REPEAT_LAST_N:64}
  repeat_penalty: ${PGPT_OLLAMA_REPEAT_PENALTY:1.2}
  request_timeout: ${PGPT_OLLAMA_REQUEST_TIMEOUT:600.0}

ui:
  enabled: true
  path: /

@@ -1,3 +1,4 @@
# poetry install --extras "ui llms-llama-cpp vector-stores-qdrant embeddings-huggingface"
server:
  env_name: ${APP_ENV:local}
34  settings-ollama-pg.yaml  Normal file
@@ -0,0 +1,34 @@
# Using ollama and postgres for the vector, doc and index store. Ollama is also used for embeddings.
# To use, install these extras:
# poetry install --extras "llms-ollama ui vector-stores-postgres embeddings-ollama storage-nodestore-postgres"
server:
  env_name: ${APP_ENV:ollama}

llm:
  mode: ollama
  max_new_tokens: 512
  context_window: 3900

embedding:
  mode: ollama
  embed_dim: 768

ollama:
  llm_model: mistral
  embedding_model: nomic-embed-text
  api_base: http://localhost:11434

nodestore:
  database: postgres

vectorstore:
  database: postgres

postgres:
  host: localhost
  port: 5432
  database: postgres
  user: postgres
  password: admin
  schema_name: private_gpt
@@ -5,20 +5,26 @@ llm:
  mode: ollama
  max_new_tokens: 512
  context_window: 3900

ollama:
  model: llama2
  api_base: http://localhost:11434
  temperature: 0.1 # The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual. (Default: 0.1)

embedding:
  mode: huggingface
  mode: ollama

huggingface:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5
ollama:
  llm_model: mistral
  embedding_model: nomic-embed-text
  api_base: http://localhost:11434
  keep_alive: 5m
  # embedding_api_base: http://ollama_embedding:11434 # uncomment if your embedding model runs on another ollama
  tfs_z: 1.0 # Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.
  top_k: 40 # Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
  top_p: 0.9 # Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
  repeat_last_n: 64 # Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
  repeat_penalty: 1.2 # Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
  request_timeout: 120.0 # Time elapsed until ollama times out the request. Default is 120s. Format is float.

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant

@@ -1,25 +0,0 @@
server:
  env_name: ${APP_ENV:tensorrt}

llm:
  mode: tensorrt
  max_new_tokens: 512
  context_window: 3900

tensorrt:
  model_path: models/tensorrt
  engine_name: llama_float16_tp1_rank0.engine
  prompt_style: "llama2"

embedding:
  mode: huggingface

huggingface:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant
@@ -39,16 +39,33 @@ llm:
  # Should be matching the selected model
  max_new_tokens: 512
  context_window: 3900
  tokenizer: mistralai/Mistral-7B-Instruct-v0.2
  temperature: 0.1 # The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual. (Default: 0.1)

rag:
  similarity_top_k: 2
  # This value controls how many "top" documents the RAG returns to use in the context.
  #similarity_value: 0.45
  # This value is disabled by default. If you enable this setting, the RAG will only use articles that meet a certain percentage score.
  rerank:
    enabled: false
    model: cross-encoder/ms-marco-MiniLM-L-2-v2
    top_n: 1

llamacpp:
  prompt_style: "mistral"
  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.2-GGUF
  llm_hf_model_file: mistral-7b-instruct-v0.2.Q4_K_M.gguf
  tfs_z: 1.0 # Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting
  top_k: 40 # Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
  top_p: 1.0 # Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
  repeat_penalty: 1.1 # Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)

embedding:
  # Should be matching the value above in most cases
  mode: huggingface
  ingest_mode: simple
  embed_dim: 384 # 384 is for BAAI/bge-small-en-v1.5

huggingface:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5
@@ -56,18 +73,19 @@ huggingface:
vectorstore:
  database: qdrant

nodestore:
  database: simple

qdrant:
  path: local_data/private_gpt/qdrant

pgvector:
postgres:
  host: localhost
  port: 5432
  database: postgres
  user: postgres
  password: postgres
  embed_dim: 384 # 384 is for BAAI/bge-small-en-v1.5
  schema_name: private_gpt
  table_name: embeddings

sagemaker:
  llm_endpoint_name: huggingface-pytorch-tgi-inference-2023-09-25-19-53-32-140
@@ -78,9 +96,18 @@ openai:
  model: gpt-3.5-turbo

ollama:
  model: llama2-uncensored
  llm_model: llama2
  embedding_model: nomic-embed-text
  api_base: http://localhost:11434
  keep_alive: 5m
  # embedding_api_base: http://ollama_embedding:11434 # uncomment if your embedding model runs on another ollama
  request_timeout: 120.0

tensorrt:
  model_path: models/tensorrt
  engine_name: llama_float16_tp1_rank0.engine
  prompt_style: "llama2"
azopenai:
  api_key: ${AZ_OPENAI_API_KEY:}
  azure_endpoint: ${AZ_OPENAI_ENDPOINT:}
  embedding_deployment_name: ${AZ_OPENAI_EMBEDDING_DEPLOYMENT_NAME:}
  llm_deployment_name: ${AZ_OPENAI_LLM_DEPLOYMENT_NAME:}
  api_version: "2023-05-15"
  embedding_model: text-embedding-ada-002
  llm_model: gpt-35-turbo

@@ -5,6 +5,7 @@ NOTE: We are not testing the switch based on the config in
is currently architecture (it is hard to patch the `settings` and the app while
the tests are directly importing them).
"""

from typing import Annotated

import pytest
2  tiktoken_cache/.gitignore  vendored
@@ -0,0 +1,2 @@
*
!.gitignore

@@ -1 +1 @@
0.3.0
0.5.0