mirror of https://github.com/csunny/DB-GPT.git

doc: refactor install document and application document

docs/getting_started/install/deploy/deploy.md (new file, 144 lines)

# Installation From Source

This tutorial gives you a quick walkthrough of using DB-GPT with your own environment and data.

## Installation

To get started, install DB-GPT with the following steps.

### 1. Hardware Requirements

As our project achieves over 85% of ChatGPT's performance, there are certain hardware requirements. Overall, however, the project can be deployed and used on consumer-grade graphics cards. The specific hardware requirements for deployment are as follows:

| GPU      | VRAM Size | Performance                                         |
|----------|-----------|-----------------------------------------------------|
| RTX 4090 | 24 GB     | Smooth conversation inference                       |
| RTX 3090 | 24 GB     | Smooth conversation inference, better than V100     |
| V100     | 16 GB     | Conversation inference possible, noticeable stutter |
| T4       | 16 GB     | Conversation inference possible, noticeable stutter |

If your VRAM size is not enough, DB-GPT supports 8-bit and 4-bit quantization.

Here is the VRAM usage of the models we tested in some common scenarios.

| Model           | Quantize | VRAM Size |
|-----------------|----------|-----------|
| vicuna-7b-v1.5  | 4-bit    | 8 GB      |
| vicuna-7b-v1.5  | 8-bit    | 12 GB     |
| vicuna-13b-v1.5 | 4-bit    | 12 GB     |
| vicuna-13b-v1.5 | 8-bit    | 20 GB     |
| llama-2-7b      | 4-bit    | 8 GB      |
| llama-2-7b      | 8-bit    | 12 GB     |
| llama-2-13b     | 4-bit    | 12 GB     |
| llama-2-13b     | 8-bit    | 20 GB     |
| llama-2-70b     | 4-bit    | 48 GB     |
| llama-2-70b     | 8-bit    | 80 GB     |
| baichuan-7b     | 4-bit    | 8 GB      |
| baichuan-7b     | 8-bit    | 12 GB     |
| baichuan-13b    | 4-bit    | 12 GB     |
| baichuan-13b    | 8-bit    | 20 GB     |

### 2. Install

```bash
git clone https://github.com/eosphoros-ai/DB-GPT.git
```

We use SQLite as the default database, so there is no need to install a database. If you choose to connect to another database, you can follow our tutorial to install and configure it.

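For example, a minimal sketch of the `.env` lines for connecting to MySQL instead (the `LOCAL_DB_*` parameter names are listed in the environment document; host, port, and password here are example values):

```bash
# .env (sketch): connect to a local MySQL instead of the default SQLite
LOCAL_DB_TYPE=mysql
LOCAL_DB_USER=root
LOCAL_DB_PASSWORD=aa12345678   # example value, use your own password
LOCAL_DB_HOST=127.0.0.1
LOCAL_DB_PORT=3306
```
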
For the entire installation process of DB-GPT, we use the miniconda3 virtual environment. Create a virtual environment and install the Python dependencies.

[How to install Miniconda](https://docs.conda.io/en/latest/miniconda.html)

```bash
# DB-GPT requires Python >= 3.10
conda create -n dbgpt_env python=3.10
conda activate dbgpt_env
pip install -r requirements.txt
```

Before using the DB-GPT Knowledge feature, download the spaCy Chinese model:

```bash
python -m spacy download zh_core_web_sm
```

Once the environment is installed, create a new "models" folder in the DB-GPT project and put all the models downloaded from Hugging Face in this directory.

```{tip}
Make sure you have git-lfs installed:
centos: yum install git-lfs
ubuntu: apt-get install git-lfs
macos: brew install git-lfs
```

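After installing the package, git-lfs typically needs to be activated once per user. This is standard git-lfs usage rather than a DB-GPT-specific step:

```bash
git lfs install
```
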
```bash
cd DB-GPT
mkdir models && cd models

# llm model
git clone https://huggingface.co/lmsys/vicuna-13b-v1.5
# or
git clone https://huggingface.co/THUDM/chatglm2-6b

# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
```

The model files are large and will take a long time to download. While they download, let's configure the `.env` file, which is created by copying `.env.template`:

```{tip}
cp .env.template .env
```

If you want to use the OpenAI LLM service, see [LLM Use FAQ](https://db-gpt.readthedocs.io/en/latest/getting_started/faq/llm/llm_faq.html).

You can configure basic parameters in the `.env` file, for example setting `LLM_MODEL` to the model to be used.

([Vicuna-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5), based on llama-2, has been released; we recommend you set `LLM_MODEL=vicuna-13b-v1.5` to try this model.)

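A minimal sketch of the relevant `.env` lines (the parameter names appear in the environment document; values are examples):

```bash
# .env (sketch): select the LLM and embedding models by name;
# names must exist in /pilot/configs/model_config.LLM_MODEL_CONFIG
LLM_MODEL=vicuna-13b-v1.5
EMBEDDING_MODEL=text2vec
```
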
### 3. Run

You can refer to this document to obtain the Vicuna weights: [Vicuna](https://github.com/lm-sys/FastChat/blob/main/README.md#model-weights).

If you have difficulty with this step, you can also directly use the model from [this link](https://huggingface.co/Tribbiani/vicuna-7b) as a replacement.

In your `.env` configuration, set your vector store type, e.g. `VECTOR_STORE_TYPE=Chroma`. We currently support Chroma and Milvus (version > 2.1).

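For example, a minimal sketch of switching to Milvus, using the parameters listed in the environment document (URL and port are the defaults shown there):

```bash
# .env (sketch): use Milvus instead of the default Chroma
VECTOR_STORE_TYPE=Milvus
MILVUS_URL=127.0.0.1
MILVUS_PORT=19530
```
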
1. Run the db-gpt server

```bash
$ python pilot/server/dbgpt_server.py
```

Open http://localhost:5000 with your browser to see the product.

If you want to access an external LLM service, you need to:
1. Set the variables `LLM_MODEL=YOUR_MODEL_NAME` and `MODEL_SERVER=YOUR_MODEL_SERVER` (e.g. http://localhost:5000) in the `.env` file.
2. Execute dbgpt_server.py in light mode, as shown below.

If you want to learn about dbgpt-webui, read https://github.com/csunny/DB-GPT/tree/new-page-framework/datacenter

```bash
$ python pilot/server/dbgpt_server.py --light
```

### 4. Multiple GPUs

DB-GPT will use all available GPUs by default. You can modify the setting `CUDA_VISIBLE_DEVICES=0,1` in the `.env` file to use specific GPU IDs.

Optionally, you can also specify the GPU IDs to use before the start command, as shown below:

````shell
# Specify 1 gpu
CUDA_VISIBLE_DEVICES=0 python3 pilot/server/dbgpt_server.py

# Specify 4 gpus
CUDA_VISIBLE_DEVICES=3,4,5,6 python3 pilot/server/dbgpt_server.py
````

You can modify the setting `MAX_GPU_MEMORY=xxGib` in the `.env` file to configure the maximum memory used by each GPU.

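Putting the two GPU settings together, a sketch of the corresponding `.env` lines (values are examples matching the ones used elsewhere in this document):

```bash
# .env (sketch): restrict DB-GPT to two GPUs and cap per-GPU memory
CUDA_VISIBLE_DEVICES=0,1
MAX_GPU_MEMORY=16Gib
```
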
### 5. Not Enough Memory

DB-GPT supports 8-bit and 4-bit quantization.

You can modify the setting `QUANTIZE_8bit=True` or `QUANTIZE_4bit=True` in the `.env` file to use quantization (8-bit quantization is enabled by default).

Llama-2-70b can run with 80 GB of VRAM with 8-bit quantization, and with 48 GB of VRAM with 4-bit quantization.

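For example, a minimal `.env` sketch using the flag named above:

```bash
# .env (sketch): enable 4-bit quantization to fit larger models in less VRAM
QUANTIZE_4bit=True
```
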
Note: you need to install the latest dependencies according to [requirements.txt](https://github.com/eosphoros-ai/DB-GPT/blob/main/requirements.txt).

docs/getting_started/install/docker/docker.md (new file, 87 lines)

Docker Install
==================================

### Docker (Experimental)

#### 1. Building Docker image

```bash
$ bash docker/build_all_images.sh
```

Review the images by listing them:

```bash
$ docker images | grep db-gpt
```

Output should look something like the following:

```
db-gpt-allinone    latest    e1ffd20b85ac    45 minutes ago    14.5GB
db-gpt             latest    e36fb0cca5d9    3 hours ago       14GB
```

You can pass some parameters to docker/build_all_images.sh:

```bash
$ bash docker/build_all_images.sh \
--base-image nvidia/cuda:11.8.0-devel-ubuntu22.04 \
--pip-index-url https://pypi.tuna.tsinghua.edu.cn/simple \
--language zh
```

You can execute the command `bash docker/build_all_images.sh --help` to see more usage.

#### 2. Run all in one docker container

**Run with local model**

```bash
$ docker run --gpus "device=0" -d -p 3306:3306 \
    -p 5000:5000 \
    -e LOCAL_DB_HOST=127.0.0.1 \
    -e LOCAL_DB_PASSWORD=aa123456 \
    -e MYSQL_ROOT_PASSWORD=aa123456 \
    -e LLM_MODEL=vicuna-13b \
    -e LANGUAGE=zh \
    -v /data/models:/app/models \
    --name db-gpt-allinone \
    db-gpt-allinone
```

Open http://localhost:5000 with your browser to see the product.

- `-e LLM_MODEL=vicuna-13b` means we use vicuna-13b as the llm model; see /pilot/configs/model_config.LLM_MODEL_CONFIG
- `-v /data/models:/app/models` means we mount the local model file directory `/data/models` to the docker container directory `/app/models`; please replace it with your model file directory.

You can view the logs with:

```bash
$ docker logs db-gpt-allinone -f
```

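To confirm the model directory is mounted correctly, you can list it inside the running container (standard docker usage, not from the original doc):

```bash
$ docker exec -it db-gpt-allinone ls /app/models
```
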
**Run with openai interface**

```bash
$ PROXY_API_KEY="Your api key"
$ PROXY_SERVER_URL="https://api.openai.com/v1/chat/completions"
$ docker run --gpus "device=0" -d -p 3306:3306 \
    -p 5000:5000 \
    -e LOCAL_DB_HOST=127.0.0.1 \
    -e LOCAL_DB_PASSWORD=aa123456 \
    -e MYSQL_ROOT_PASSWORD=aa123456 \
    -e LLM_MODEL=proxyllm \
    -e PROXY_API_KEY=$PROXY_API_KEY \
    -e PROXY_SERVER_URL=$PROXY_SERVER_URL \
    -e LANGUAGE=zh \
    -v /data/models/text2vec-large-chinese:/app/models/text2vec-large-chinese \
    --name db-gpt-allinone \
    db-gpt-allinone
```

- `-e LLM_MODEL=proxyllm` means we use a proxy llm (openai interface, fastchat interface, ...)
- `-v /data/models/text2vec-large-chinese:/app/models/text2vec-large-chinese` means we mount the local text2vec model to the docker container.

Open http://localhost:5000 with your browser to see the product.

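Since the container also publishes MySQL on port 3306 (with the root password set by `MYSQL_ROOT_PASSWORD` above), you can check the database from the host with the standard mysql client, assuming it is installed locally:

```bash
$ mysql -h 127.0.0.1 -P 3306 -u root -paa123456
```
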
(new file, 26 lines)

Docker Compose
==================================

#### Run with docker compose

```bash
$ docker compose up -d
```

Output should look something like the following:

```
[+] Building 0.0s (0/0)
[+] Running 2/2
 ✔ Container db-gpt-db-1         Started    0.4s
 ✔ Container db-gpt-webserver-1  Started
```

You can view the logs with:

```bash
$ docker logs db-gpt-webserver-1 -f
```

Open http://localhost:5000 with your browser to see the product.

You can open docker-compose.yml in the project root directory to see more details.

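When you are finished, the whole stack can be stopped and removed with the standard compose command (not shown in the original doc):

```bash
$ docker compose down
```
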
docs/getting_started/install/environment/environment.md (new file, 122 lines)

Env Parameter
==================================

```{admonition} LLM MODEL Config
LLM Model Name, see /pilot/configs/model_config.LLM_MODEL_CONFIG
* LLM_MODEL=vicuna-13b

MODEL_SERVER_ADDRESS
* MODEL_SERVER=http://127.0.0.1:8000

LIMIT_MODEL_CONCURRENCY
* LIMIT_MODEL_CONCURRENCY=5

MAX_POSITION_EMBEDDINGS
* MAX_POSITION_EMBEDDINGS=4096

QUANTIZE_QLORA
* QUANTIZE_QLORA=True

QUANTIZE_8bit
* QUANTIZE_8bit=True
```

```{admonition} LLM PROXY Settings
OPENAI Key
* PROXY_API_KEY={your-openai-sk}
* PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions

Bard token: from https://bard.google.com/, press F12 -> Application -> __Secure-1PSID
* BARD_PROXY_API_KEY={your-bard-token}
```

```{admonition} DATABASE SETTINGS
### SQLite database (current default database)
* LOCAL_DB_PATH=data/default_sqlite.db
* LOCAL_DB_TYPE=sqlite  # database type, default: sqlite

### MYSQL database
* LOCAL_DB_TYPE=mysql
* LOCAL_DB_USER=root
* LOCAL_DB_PASSWORD=aa12345678
* LOCAL_DB_HOST=127.0.0.1
* LOCAL_DB_PORT=3306
```

```{admonition} EMBEDDING SETTINGS
EMBEDDING MODEL Name, see /pilot/configs/model_config.LLM_MODEL_CONFIG
* EMBEDDING_MODEL=text2vec

Embedding chunk size, default 500
* KNOWLEDGE_CHUNK_SIZE=500

Embedding chunk overlap, default 100
* KNOWLEDGE_CHUNK_OVERLAP=100

Embedding recall top k, default 5
* KNOWLEDGE_SEARCH_TOP_SIZE=5

Embedding recall max token, default 2000
* KNOWLEDGE_SEARCH_MAX_TOKEN=2000
```

```{admonition} Vector Store SETTINGS
#### Chroma
* VECTOR_STORE_TYPE=Chroma

#### MILVUS
* VECTOR_STORE_TYPE=Milvus
* MILVUS_URL=127.0.0.1
* MILVUS_PORT=19530
* MILVUS_USERNAME
* MILVUS_PASSWORD
* MILVUS_SECURE=

#### WEAVIATE
* VECTOR_STORE_TYPE=Weaviate
* WEAVIATE_URL=https://kt-region-m8hcy0wc.weaviate.network
```

```{admonition} Multi-GPU Setting
See https://developer.nvidia.com/blog/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
If CUDA_VISIBLE_DEVICES is not configured, all available GPUs will be used.

* CUDA_VISIBLE_DEVICES=0

Optionally, you can also specify the GPU IDs to use before the start command:

* CUDA_VISIBLE_DEVICES=3,4,5,6

You can configure the maximum memory used by each GPU:

* MAX_GPU_MEMORY=16Gib
```

```{admonition} Other Setting
#### Language Settings (influence prompt language)
* LANGUAGE=en
* LANGUAGE=zh
```