doc:refactor install document and application document

This commit is contained in:
aries_ckt
2023-08-16 23:20:08 +08:00
parent 732fd0e7e7
commit 63af66ccc1
39 changed files with 3031 additions and 458 deletions


@@ -0,0 +1,144 @@
# Installation From Source
This tutorial gives you a quick walkthrough of using DB-GPT with your own environment and data.
## Installation
To get started, install DB-GPT with the following steps.
### 1. Hardware Requirements
As our project can achieve over 85% of ChatGPT's performance, there are certain hardware requirements. Overall, however, the project can be deployed and used on consumer-grade graphics cards. The specific hardware requirements for deployment are as follows:
| GPU | VRAM Size | Performance |
|----------|-----------| ------------------------------------------- |
| RTX 4090 | 24 GB | Smooth conversation inference |
| RTX 3090 | 24 GB | Smooth conversation inference, better than V100 |
| V100 | 16 GB | Conversation inference possible, noticeable stutter |
| T4 | 16 GB | Conversation inference possible, noticeable stutter |
If your VRAM is not large enough, DB-GPT supports 8-bit and 4-bit quantization.
Here is the VRAM usage of the models we tested in some common scenarios:
| Model | Quantize | VRAM Size |
| --------- | --------- | --------- |
| vicuna-7b-v1.5 | 4-bit | 8 GB |
| vicuna-7b-v1.5 | 8-bit | 12 GB |
| vicuna-13b-v1.5 | 4-bit | 12 GB |
| vicuna-13b-v1.5 | 8-bit | 20 GB |
| llama-2-7b | 4-bit | 8 GB |
| llama-2-7b | 8-bit | 12 GB |
| llama-2-13b | 4-bit | 12 GB |
| llama-2-13b | 8-bit | 20 GB |
| llama-2-70b | 4-bit | 48 GB |
| llama-2-70b | 8-bit | 80 GB |
| baichuan-7b | 4-bit | 8 GB |
| baichuan-7b | 8-bit | 12 GB |
| baichuan-13b | 4-bit | 12 GB |
| baichuan-13b | 8-bit | 20 GB |
### 2. Install
```bash
git clone https://github.com/eosphoros-ai/DB-GPT.git
```
We use SQLite as the default database, so no database installation is needed. If you choose to connect to another database, you can follow our tutorial to install and configure it.
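If you do plan to use MySQL, these are the `.env` entries you will set later; a minimal sketch, with parameter names taken from the Env Parameter reference at the end of this document and illustrative values:
```bash
# Default: SQLite, nothing to install
LOCAL_DB_TYPE=sqlite
LOCAL_DB_PATH=data/default_sqlite.db

# Alternative: an existing MySQL instance (replace the credentials with your own)
# LOCAL_DB_TYPE=mysql
# LOCAL_DB_USER=root
# LOCAL_DB_PASSWORD=aa12345678
# LOCAL_DB_HOST=127.0.0.1
# LOCAL_DB_PORT=3306
```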
For the entire DB-GPT installation process we use a Miniconda3 virtual environment. Create the virtual environment and install the Python dependencies.
[How to install Miniconda](https://docs.conda.io/en/latest/miniconda.html)
```bash
# Python >= 3.10 is required
conda create -n dbgpt_env python=3.10
conda activate dbgpt_env
pip install -r requirements.txt
```
Before using the DB-GPT knowledge feature, download the spaCy Chinese model:
```bash
python -m spacy download zh_core_web_sm
```
Once the environment is installed, create a new folder `models` in the DB-GPT project, then put all the models downloaded from Hugging Face in this directory.
```{tip}
Make sure you have git-lfs installed:
CentOS: yum install git-lfs
Ubuntu: apt-get install git-lfs
macOS: brew install git-lfs
```
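You can verify that Git LFS is active before cloning the model repositories (these are standard git-lfs commands):
```bash
# One-time setup: register the Git LFS hooks for your user account
git lfs install
# Print the installed version to confirm it is on PATH
git lfs version
```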
```bash
cd DB-GPT
mkdir models && cd models

#### llm model
git clone https://huggingface.co/lmsys/vicuna-13b-v1.5
# or
git clone https://huggingface.co/THUDM/chatglm2-6b

#### embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
```
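Once the downloads finish, it is worth confirming that the weight files were really fetched by Git LFS rather than left as kilobyte-sized pointer files. A quick sanity check (standard shell and git-lfs commands; the directory name assumes the vicuna clone above):
```bash
# Each model directory should be several GB in size;
# a few KB means only LFS pointer files were downloaded
du -sh models/*
# List the LFS-tracked files of one repository explicitly
git -C models/vicuna-13b-v1.5 lfs ls-files
```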
The model files are large and will take a long time to download. While they download, let's configure the `.env` file, which needs to be created by copying `.env.template`.
If you want to use the OpenAI LLM service, see the [LLM Use FAQ](https://db-gpt.readthedocs.io/en/latest/getting_started/faq/llm/llm_faq.html).
```{tip}
cp .env.template .env
```
You can configure basic parameters in the `.env` file, for example setting `LLM_MODEL` to the model to be used.
([Vicuna-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5), based on Llama-2, has been released; we recommend you set `LLM_MODEL=vicuna-13b-v1.5` to try this model.)
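For example, a minimal `.env` sketch for a fully local setup (parameter names follow `.env.template`; adjust the values to the models you actually downloaded):
```bash
# LLM loaded from the local ./models directory
LLM_MODEL=vicuna-13b-v1.5
# Embedding model used by the knowledge feature
EMBEDDING_MODEL=text2vec
```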
### 3. Run
You can refer to this document to obtain the Vicuna weights: [Vicuna](https://github.com/lm-sys/FastChat/blob/main/README.md#model-weights) .
If you have difficulty with this step, you can also directly use the model from [this link](https://huggingface.co/Tribbiani/vicuna-7b) as a replacement.
In the `.env` file, set your vector store type, e.g. `VECTOR_STORE_TYPE=Chroma`. Currently Chroma and Milvus (version > 2.1) are supported.
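For example, a sketch of the Milvus variant (parameter names from the Env Parameter reference at the end of this document; the default Chroma setup needs no extra settings):
```bash
# Use Milvus (version > 2.1) instead of the default Chroma
VECTOR_STORE_TYPE=Milvus
MILVUS_URL=127.0.0.1
MILVUS_PORT=19530
```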
1. Run db-gpt server:
```bash
$ python pilot/server/dbgpt_server.py
```
Open http://localhost:5000 with your browser to see the product.
If you want to learn about dbgpt-webui, read https://github.com/csunny/DB-GPT/tree/new-page-framework/datacenter.
If you want to access an external LLM service, you need to:
1. Set the variables `LLM_MODEL=YOUR_MODEL_NAME` and `MODEL_SERVER=YOUR_MODEL_SERVER` (e.g. `http://localhost:5000`) in the `.env` file.
2. Run `dbgpt_server.py` in light mode:
```bash
$ python pilot/server/dbgpt_server.py --light
```
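Putting the two steps together, a sketch of the `.env` entries for an external model service (the MODEL_SERVER value is illustrative and follows the Env Parameter reference; replace both values with your own deployment):
```bash
# Point DB-GPT at an already-running model service
LLM_MODEL=vicuna-13b
MODEL_SERVER=http://127.0.0.1:8000
```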
### 4. Multiple GPUs
DB-GPT uses all available GPUs by default. You can set `CUDA_VISIBLE_DEVICES=0,1` in the `.env` file to use specific GPU IDs.
Optionally, you can also specify the GPU IDs to use before the start command, as shown below:
```bash
# Specify 1 gpu
CUDA_VISIBLE_DEVICES=0 python3 pilot/server/dbgpt_server.py

# Specify 4 gpus
CUDA_VISIBLE_DEVICES=3,4,5,6 python3 pilot/server/dbgpt_server.py
```
You can modify the setting `MAX_GPU_MEMORY=xxGib` in `.env` file to configure the maximum memory used by each GPU.
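As a concrete sketch, the corresponding `.env` entries (values are illustrative):
```bash
# Restrict DB-GPT to GPUs 0 and 1
CUDA_VISIBLE_DEVICES=0,1
# Cap the memory used on each visible GPU
MAX_GPU_MEMORY=16Gib
```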
### 5. Not Enough Memory
DB-GPT supports 8-bit and 4-bit quantization.
You can set `QUANTIZE_8bit=True` or `QUANTIZE_4bit=True` in the `.env` file to use quantization (8-bit quantization is enabled by default).
Llama-2-70b with 8-bit quantization can run with 80 GB of VRAM, and 4-bit quantization can run with 48 GB of VRAM.
Note: you need to install the latest dependencies according to [requirements.txt](https://github.com/eosphoros-ai/DB-GPT/blob/main/requirements.txt).
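For example, to run with 4-bit quantization instead of the 8-bit default, a `.env` sketch (flag names as documented above; whether the 8-bit flag must be disabled explicitly is an assumption here):
```bash
# Assumption: turn off the default 8-bit path before enabling 4-bit
QUANTIZE_8bit=False
QUANTIZE_4bit=True
```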


@@ -0,0 +1,87 @@
Docker Install
==================================
### Docker (Experimental)
#### 1. Building Docker image
```bash
$ bash docker/build_all_images.sh
```
Review images by listing them:
```bash
$ docker images | grep db-gpt
```
Output should look something like the following:
```
db-gpt-allinone latest e1ffd20b85ac 45 minutes ago 14.5GB
db-gpt latest e36fb0cca5d9 3 hours ago 14GB
```
You can pass some parameters to docker/build_all_images.sh.
```bash
$ bash docker/build_all_images.sh \
--base-image nvidia/cuda:11.8.0-devel-ubuntu22.04 \
--pip-index-url https://pypi.tuna.tsinghua.edu.cn/simple \
--language zh
```
You can execute the command `bash docker/build_all_images.sh --help` to see more usage.
#### 2. Run the all-in-one docker container
**Run with local model**
```bash
$ docker run --gpus "device=0" -d -p 3306:3306 \
-p 5000:5000 \
-e LOCAL_DB_HOST=127.0.0.1 \
-e LOCAL_DB_PASSWORD=aa123456 \
-e MYSQL_ROOT_PASSWORD=aa123456 \
-e LLM_MODEL=vicuna-13b \
-e LANGUAGE=zh \
-v /data/models:/app/models \
--name db-gpt-allinone \
db-gpt-allinone
```
Open http://localhost:5000 with your browser to see the product.
- `-e LLM_MODEL=vicuna-13b` means we use vicuna-13b as the LLM model; see `/pilot/configs/model_config.LLM_MODEL_CONFIG`.
- `-v /data/models:/app/models` means we mount the local model directory `/data/models` to the container directory `/app/models`; please replace it with your own model directory.
You can view the logs with:
```bash
$ docker logs db-gpt-allinone -f
```
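Note that the OpenAI variant below reuses the container name `db-gpt-allinone`, so stop and remove the first container before starting it (standard Docker commands):
```bash
docker stop db-gpt-allinone
docker rm db-gpt-allinone
```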
**Run with openai interface**
```bash
$ PROXY_API_KEY="Your api key"
$ PROXY_SERVER_URL="https://api.openai.com/v1/chat/completions"
$ docker run --gpus "device=0" -d -p 3306:3306 \
-p 5000:5000 \
-e LOCAL_DB_HOST=127.0.0.1 \
-e LOCAL_DB_PASSWORD=aa123456 \
-e MYSQL_ROOT_PASSWORD=aa123456 \
-e LLM_MODEL=proxyllm \
-e PROXY_API_KEY=$PROXY_API_KEY \
-e PROXY_SERVER_URL=$PROXY_SERVER_URL \
-e LANGUAGE=zh \
-v /data/models/text2vec-large-chinese:/app/models/text2vec-large-chinese \
--name db-gpt-allinone \
db-gpt-allinone
```
- `-e LLM_MODEL=proxyllm` means we use a proxy LLM (OpenAI interface, FastChat interface, etc.).
- `-v /data/models/text2vec-large-chinese:/app/models/text2vec-large-chinese` means we mount the local text2vec model into the docker container.
Open http://localhost:5000 with your browser to see the product.
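If the page does not come up, it can help to first confirm that the proxy endpoint and key are valid from the host. A minimal sketch using curl against the OpenAI chat completions API (the endpoint and payload follow the public OpenAI API, not anything DB-GPT-specific):
```bash
# Expect an HTTP 200 with a JSON completion if the key and URL are valid
curl -s "$PROXY_SERVER_URL" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PROXY_API_KEY" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "ping"}]}'
```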


@@ -0,0 +1,26 @@
Docker Compose
==================================
#### Run with docker compose
```bash
$ docker compose up -d
```
Output should look something like the following:
```
[+] Building 0.0s (0/0)
[+] Running 2/2
✔ Container db-gpt-db-1 Started 0.4s
✔ Container db-gpt-webserver-1 Started
```
You can view the logs with:
```bash
$ docker logs db-gpt-webserver-1 -f
```
Open http://localhost:5000 with your browser to see the product.
You can open docker-compose.yml in the project root directory to see more details.
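To stop the stack, standard Compose commands apply:
```bash
# Stop and remove the containers defined in docker-compose.yml
docker compose down
# Add -v to also remove the volumes (this deletes the stored data)
# docker compose down -v
```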


@@ -0,0 +1,122 @@
Env Parameter
==================================
```{admonition} LLM MODEL Config
LLM Model Name, see /pilot/configs/model_config.LLM_MODEL_CONFIG
* LLM_MODEL=vicuna-13b
MODEL_SERVER_ADDRESS
* MODEL_SERVER=http://127.0.0.1:8000
LIMIT_MODEL_CONCURRENCY
* LIMIT_MODEL_CONCURRENCY=5
MAX_POSITION_EMBEDDINGS
* MAX_POSITION_EMBEDDINGS=4096
QUANTIZE_QLORA
* QUANTIZE_QLORA=True
QUANTIZE_8bit
* QUANTIZE_8bit=True
```
```{admonition} LLM PROXY Settings
OPENAI Key
* PROXY_API_KEY={your-openai-sk}
* PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
From https://bard.google.com/: F12 -> Application -> __Secure-1PSID
* BARD_PROXY_API_KEY={your-bard-token}
```
```{admonition} DATABASE SETTINGS
### SQLite database (Current default database)
* LOCAL_DB_PATH=data/default_sqlite.db
* LOCAL_DB_TYPE=sqlite # Database Type default:sqlite
### MYSQL database
* LOCAL_DB_TYPE=mysql
* LOCAL_DB_USER=root
* LOCAL_DB_PASSWORD=aa12345678
* LOCAL_DB_HOST=127.0.0.1
* LOCAL_DB_PORT=3306
```
```{admonition} EMBEDDING SETTINGS
EMBEDDING MODEL Name, see /pilot/configs/model_config.LLM_MODEL_CONFIG
* EMBEDDING_MODEL=text2vec
Embedding Chunk size, default 500
* KNOWLEDGE_CHUNK_SIZE=500
Embedding Chunk Overlap, default 100
* KNOWLEDGE_CHUNK_OVERLAP=100
Embedding recall top k, default 5
* KNOWLEDGE_SEARCH_TOP_SIZE=5
Embedding recall max tokens, default 2000
* KNOWLEDGE_SEARCH_MAX_TOKEN=2000
```
```{admonition} Vector Store SETTINGS
#### Chroma
* VECTOR_STORE_TYPE=Chroma
#### MILVUS
* VECTOR_STORE_TYPE=Milvus
* MILVUS_URL=127.0.0.1
* MILVUS_PORT=19530
* MILVUS_USERNAME
* MILVUS_PASSWORD
* MILVUS_SECURE=
#### WEAVIATE
* VECTOR_STORE_TYPE=Weaviate
* WEAVIATE_URL=https://kt-region-m8hcy0wc.weaviate.network
```
```{admonition} Multi-GPU Setting
See https://developer.nvidia.com/blog/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
If CUDA_VISIBLE_DEVICES is not configured, all available gpus will be used
* CUDA_VISIBLE_DEVICES=0
Optionally, you can also specify the gpu ID to use before the starting command
* CUDA_VISIBLE_DEVICES=3,4,5,6
You can configure the maximum memory used by each GPU.
* MAX_GPU_MEMORY=16Gib
```
```{admonition} Other Setting
#### Language Settings(influence prompt language)
* LANGUAGE=en
* LANGUAGE=zh
```