mirror of https://github.com/csunny/DB-GPT.git

doc: refactor install document and application document

docs/getting_started/install/deploy/deploy.md (new file, 144 lines)

# Installation From Source

This tutorial gives you a quick walkthrough of using DB-GPT with your own environment and data.

## Installation

To get started, install DB-GPT with the following steps.

### 1. Hardware Requirements

As our project achieves over 85% of ChatGPT's performance, there are certain hardware requirements. Overall, however, the project can be deployed and used on consumer-grade graphics cards. The specific hardware requirements for deployment are as follows:

| GPU      | VRAM Size | Performance                                         |
|----------|-----------|-----------------------------------------------------|
| RTX 4090 | 24 GB     | Smooth conversation inference                       |
| RTX 3090 | 24 GB     | Smooth conversation inference, better than V100     |
| V100     | 16 GB     | Conversation inference possible, noticeable stutter |
| T4       | 16 GB     | Conversation inference possible, noticeable stutter |

If your VRAM size is not enough, DB-GPT supports 8-bit and 4-bit quantization.

Here is the VRAM usage of the models we tested in some common scenarios.

| Model           | Quantize | VRAM Size |
|-----------------|----------|-----------|
| vicuna-7b-v1.5  | 4-bit    | 8 GB      |
| vicuna-7b-v1.5  | 8-bit    | 12 GB     |
| vicuna-13b-v1.5 | 4-bit    | 12 GB     |
| vicuna-13b-v1.5 | 8-bit    | 20 GB     |
| llama-2-7b      | 4-bit    | 8 GB      |
| llama-2-7b      | 8-bit    | 12 GB     |
| llama-2-13b     | 4-bit    | 12 GB     |
| llama-2-13b     | 8-bit    | 20 GB     |
| llama-2-70b     | 4-bit    | 48 GB     |
| llama-2-70b     | 8-bit    | 80 GB     |
| baichuan-7b     | 4-bit    | 8 GB      |
| baichuan-7b     | 8-bit    | 12 GB     |
| baichuan-13b    | 4-bit    | 12 GB     |
| baichuan-13b    | 8-bit    | 20 GB     |

### 2. Install

```bash
git clone https://github.com/eosphoros-ai/DB-GPT.git
```

We use SQLite as the default database, so there is no need to install a database. If you choose to connect to another database, you can follow our tutorial to install and configure it.

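For example, a minimal sketch of the `.env` lines for connecting to MySQL instead (the `LOCAL_DB_*` parameter names are listed in the environment document; host, port, and password here are example values):

```bash
# .env (sketch): connect to a local MySQL instead of the default SQLite
LOCAL_DB_TYPE=mysql
LOCAL_DB_USER=root
LOCAL_DB_PASSWORD=aa12345678   # example value, use your own password
LOCAL_DB_HOST=127.0.0.1
LOCAL_DB_PORT=3306
```
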
For the entire installation process of DB-GPT, we use the miniconda3 virtual environment. Create a virtual environment and install the Python dependencies.

[How to install Miniconda](https://docs.conda.io/en/latest/miniconda.html)

```bash
# DB-GPT requires Python >= 3.10
conda create -n dbgpt_env python=3.10
conda activate dbgpt_env
pip install -r requirements.txt
```

Before using the DB-GPT Knowledge feature, download the spaCy Chinese model:

```bash
python -m spacy download zh_core_web_sm
```

Once the environment is installed, create a new "models" folder in the DB-GPT project and put all the models downloaded from Hugging Face in this directory.

```{tip}
Make sure you have git-lfs installed:
centos: yum install git-lfs
ubuntu: apt-get install git-lfs
macos: brew install git-lfs
```

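After installing the package, git-lfs typically needs to be activated once per user. This is standard git-lfs usage rather than a DB-GPT-specific step:

```bash
git lfs install
```
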
```bash
cd DB-GPT
mkdir models && cd models

# llm model
git clone https://huggingface.co/lmsys/vicuna-13b-v1.5
# or
git clone https://huggingface.co/THUDM/chatglm2-6b

# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
```

The model files are large and will take a long time to download. While they download, let's configure the `.env` file, which is created by copying `.env.template`:

```{tip}
cp .env.template .env
```

If you want to use the OpenAI LLM service, see [LLM Use FAQ](https://db-gpt.readthedocs.io/en/latest/getting_started/faq/llm/llm_faq.html).

You can configure basic parameters in the `.env` file, for example setting `LLM_MODEL` to the model to be used.

([Vicuna-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5), based on llama-2, has been released; we recommend you set `LLM_MODEL=vicuna-13b-v1.5` to try this model.)

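A minimal sketch of the relevant `.env` lines (the parameter names appear in the environment document; values are examples):

```bash
# .env (sketch): select the LLM and embedding models by name;
# names must exist in /pilot/configs/model_config.LLM_MODEL_CONFIG
LLM_MODEL=vicuna-13b-v1.5
EMBEDDING_MODEL=text2vec
```
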
### 3. Run

You can refer to this document to obtain the Vicuna weights: [Vicuna](https://github.com/lm-sys/FastChat/blob/main/README.md#model-weights).

If you have difficulty with this step, you can also directly use the model from [this link](https://huggingface.co/Tribbiani/vicuna-7b) as a replacement.

In your `.env` configuration, set your vector store type, e.g. `VECTOR_STORE_TYPE=Chroma`. We currently support Chroma and Milvus (version > 2.1).

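For example, a minimal sketch of switching to Milvus, using the parameters listed in the environment document (URL and port are the defaults shown there):

```bash
# .env (sketch): use Milvus instead of the default Chroma
VECTOR_STORE_TYPE=Milvus
MILVUS_URL=127.0.0.1
MILVUS_PORT=19530
```
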
1. Run the db-gpt server

```bash
$ python pilot/server/dbgpt_server.py
```

Open http://localhost:5000 with your browser to see the product.

If you want to access an external LLM service, you need to:
1. Set the variables `LLM_MODEL=YOUR_MODEL_NAME` and `MODEL_SERVER=YOUR_MODEL_SERVER` (e.g. http://localhost:5000) in the `.env` file.
2. Execute dbgpt_server.py in light mode, as shown below.

If you want to learn about dbgpt-webui, read https://github.com/csunny/DB-GPT/tree/new-page-framework/datacenter

```bash
$ python pilot/server/dbgpt_server.py --light
```

### 4. Multiple GPUs

DB-GPT will use all available GPUs by default. You can modify the setting `CUDA_VISIBLE_DEVICES=0,1` in the `.env` file to use specific GPU IDs.

Optionally, you can also specify the GPU IDs to use before the start command, as shown below:

````shell
# Specify 1 gpu
CUDA_VISIBLE_DEVICES=0 python3 pilot/server/dbgpt_server.py

# Specify 4 gpus
CUDA_VISIBLE_DEVICES=3,4,5,6 python3 pilot/server/dbgpt_server.py
````

You can modify the setting `MAX_GPU_MEMORY=xxGib` in the `.env` file to configure the maximum memory used by each GPU.

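Putting the two GPU settings together, a sketch of the corresponding `.env` lines (values are examples matching the ones used elsewhere in this document):

```bash
# .env (sketch): restrict DB-GPT to two GPUs and cap per-GPU memory
CUDA_VISIBLE_DEVICES=0,1
MAX_GPU_MEMORY=16Gib
```
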
### 5. Not Enough Memory

DB-GPT supports 8-bit and 4-bit quantization.

You can modify the setting `QUANTIZE_8bit=True` or `QUANTIZE_4bit=True` in the `.env` file to use quantization (8-bit quantization is enabled by default).

Llama-2-70b can run with 80 GB of VRAM with 8-bit quantization, and with 48 GB of VRAM with 4-bit quantization.

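For example, a minimal `.env` sketch using the flag named above:

```bash
# .env (sketch): enable 4-bit quantization to fit larger models in less VRAM
QUANTIZE_4bit=True
```
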
Note: you need to install the latest dependencies according to [requirements.txt](https://github.com/eosphoros-ai/DB-GPT/blob/main/requirements.txt).

docs/getting_started/install/docker/docker.md (new file, 87 lines)

Docker Install
==================================

### Docker (Experimental)

#### 1. Building Docker image

```bash
$ bash docker/build_all_images.sh
```

Review the images by listing them:

```bash
$ docker images | grep db-gpt
```

Output should look something like the following:

```
db-gpt-allinone    latest    e1ffd20b85ac    45 minutes ago    14.5GB
db-gpt             latest    e36fb0cca5d9    3 hours ago       14GB
```

You can pass some parameters to docker/build_all_images.sh:

```bash
$ bash docker/build_all_images.sh \
--base-image nvidia/cuda:11.8.0-devel-ubuntu22.04 \
--pip-index-url https://pypi.tuna.tsinghua.edu.cn/simple \
--language zh
```

You can execute the command `bash docker/build_all_images.sh --help` to see more usage.

#### 2. Run all in one docker container

**Run with local model**

```bash
$ docker run --gpus "device=0" -d -p 3306:3306 \
    -p 5000:5000 \
    -e LOCAL_DB_HOST=127.0.0.1 \
    -e LOCAL_DB_PASSWORD=aa123456 \
    -e MYSQL_ROOT_PASSWORD=aa123456 \
    -e LLM_MODEL=vicuna-13b \
    -e LANGUAGE=zh \
    -v /data/models:/app/models \
    --name db-gpt-allinone \
    db-gpt-allinone
```

Open http://localhost:5000 with your browser to see the product.

- `-e LLM_MODEL=vicuna-13b` means we use vicuna-13b as the llm model; see /pilot/configs/model_config.LLM_MODEL_CONFIG
- `-v /data/models:/app/models` means we mount the local model file directory `/data/models` to the docker container directory `/app/models`; please replace it with your model file directory.

You can view the logs with:

```bash
$ docker logs db-gpt-allinone -f
```

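To confirm the model directory is mounted correctly, you can list it inside the running container (standard docker usage, not from the original doc):

```bash
$ docker exec -it db-gpt-allinone ls /app/models
```
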
**Run with openai interface**

```bash
$ PROXY_API_KEY="Your api key"
$ PROXY_SERVER_URL="https://api.openai.com/v1/chat/completions"
$ docker run --gpus "device=0" -d -p 3306:3306 \
    -p 5000:5000 \
    -e LOCAL_DB_HOST=127.0.0.1 \
    -e LOCAL_DB_PASSWORD=aa123456 \
    -e MYSQL_ROOT_PASSWORD=aa123456 \
    -e LLM_MODEL=proxyllm \
    -e PROXY_API_KEY=$PROXY_API_KEY \
    -e PROXY_SERVER_URL=$PROXY_SERVER_URL \
    -e LANGUAGE=zh \
    -v /data/models/text2vec-large-chinese:/app/models/text2vec-large-chinese \
    --name db-gpt-allinone \
    db-gpt-allinone
```

- `-e LLM_MODEL=proxyllm` means we use a proxy llm (openai interface, fastchat interface, ...)
- `-v /data/models/text2vec-large-chinese:/app/models/text2vec-large-chinese` means we mount the local text2vec model to the docker container.

Open http://localhost:5000 with your browser to see the product.

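Since the container also publishes MySQL on port 3306 (with the root password set by `MYSQL_ROOT_PASSWORD` above), you can check the database from the host with the standard mysql client, assuming it is installed locally:

```bash
$ mysql -h 127.0.0.1 -P 3306 -u root -paa123456
```
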
(new file, 26 lines)

Docker Compose
==================================

#### Run with docker compose

```bash
$ docker compose up -d
```

Output should look something like the following:

```
[+] Building 0.0s (0/0)
[+] Running 2/2
 ✔ Container db-gpt-db-1         Started    0.4s
 ✔ Container db-gpt-webserver-1  Started
```

You can view the logs with:

```bash
$ docker logs db-gpt-webserver-1 -f
```

Open http://localhost:5000 with your browser to see the product.

You can open docker-compose.yml in the project root directory to see more details.

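When you are finished, the whole stack can be stopped and removed with the standard compose command (not shown in the original doc):

```bash
$ docker compose down
```
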
docs/getting_started/install/environment/environment.md (new file, 122 lines)

Env Parameter
==================================

```{admonition} LLM MODEL Config
LLM Model Name, see /pilot/configs/model_config.LLM_MODEL_CONFIG
* LLM_MODEL=vicuna-13b

MODEL_SERVER_ADDRESS
* MODEL_SERVER=http://127.0.0.1:8000

LIMIT_MODEL_CONCURRENCY
* LIMIT_MODEL_CONCURRENCY=5

MAX_POSITION_EMBEDDINGS
* MAX_POSITION_EMBEDDINGS=4096

QUANTIZE_QLORA
* QUANTIZE_QLORA=True

QUANTIZE_8bit
* QUANTIZE_8bit=True
```

```{admonition} LLM PROXY Settings
OPENAI Key
* PROXY_API_KEY={your-openai-sk}
* PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions

Bard token: from https://bard.google.com/, press F12 -> Application -> __Secure-1PSID
* BARD_PROXY_API_KEY={your-bard-token}
```

```{admonition} DATABASE SETTINGS
### SQLite database (current default database)
* LOCAL_DB_PATH=data/default_sqlite.db
* LOCAL_DB_TYPE=sqlite  # database type, default: sqlite

### MYSQL database
* LOCAL_DB_TYPE=mysql
* LOCAL_DB_USER=root
* LOCAL_DB_PASSWORD=aa12345678
* LOCAL_DB_HOST=127.0.0.1
* LOCAL_DB_PORT=3306
```

```{admonition} EMBEDDING SETTINGS
EMBEDDING MODEL Name, see /pilot/configs/model_config.LLM_MODEL_CONFIG
* EMBEDDING_MODEL=text2vec

Embedding chunk size, default 500
* KNOWLEDGE_CHUNK_SIZE=500

Embedding chunk overlap, default 100
* KNOWLEDGE_CHUNK_OVERLAP=100

Embedding recall top k, default 5
* KNOWLEDGE_SEARCH_TOP_SIZE=5

Embedding recall max token, default 2000
* KNOWLEDGE_SEARCH_MAX_TOKEN=2000
```

```{admonition} Vector Store SETTINGS
#### Chroma
* VECTOR_STORE_TYPE=Chroma

#### MILVUS
* VECTOR_STORE_TYPE=Milvus
* MILVUS_URL=127.0.0.1
* MILVUS_PORT=19530
* MILVUS_USERNAME
* MILVUS_PASSWORD
* MILVUS_SECURE=

#### WEAVIATE
* VECTOR_STORE_TYPE=Weaviate
* WEAVIATE_URL=https://kt-region-m8hcy0wc.weaviate.network
```

```{admonition} Multi-GPU Setting
See https://developer.nvidia.com/blog/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
If CUDA_VISIBLE_DEVICES is not configured, all available GPUs will be used.

* CUDA_VISIBLE_DEVICES=0

Optionally, you can also specify the GPU IDs to use before the start command:

* CUDA_VISIBLE_DEVICES=3,4,5,6

You can configure the maximum memory used by each GPU:

* MAX_GPU_MEMORY=16Gib
```

```{admonition} Other Setting
#### Language Settings (influence prompt language)
* LANGUAGE=en
* LANGUAGE=zh
```