doc: update deploy doc
@@ -18,7 +18,7 @@ DB-GPT product is a Web application that you can chat database, chat knowledge,
    :name: deploy
    :hidden:

-   ./install/deploy/deploy.md
+   ./install/deploy.rst
    ./install/docker/docker.md
    ./install/docker_compose/docker_compose.md
    ./install/cluster/cluster.rst
docs/getting_started/install/deploy.rst (new file, 438 lines)
@@ -0,0 +1,438 @@
.. _installation:

Installation From Source
========================
To get started, install DB-GPT with the following steps.

DB-GPT can be deployed on servers with either low or high hardware requirements.

If your hardware is limited, you can install DB-GPT using a third-party LLM REST API service such as OpenAI or Azure.

You can also deploy a local LLM service by downloading an LLM model.
1. Preparation
-----------------

**Download DB-GPT**

.. code-block:: shell

   git clone https://github.com/eosphoros-ai/DB-GPT.git

**Install Miniconda**

We use SQLite as the default database, so there is no need to install a database server. If you choose to connect to another database, you can follow our tutorial for installation and configuration.
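For reference, switching from the default SQLite to an external database such as MySQL is done in the `.env` file. A minimal sketch, assuming the `LOCAL_DB_*` keys found in `.env.template` (verify the exact key names against your copy of the template):

.. code-block:: shell

   # assumed keys from .env.template; adjust to match your template
   LOCAL_DB_TYPE=mysql
   LOCAL_DB_HOST=127.0.0.1
   LOCAL_DB_PORT=3306
   LOCAL_DB_USER=root
   LOCAL_DB_PASSWORD={your-db-password}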
For the entire installation process of DB-GPT, we use a miniconda3 virtual environment. Create the virtual environment and install the Python dependencies.

`How to install Miniconda <https://docs.conda.io/en/latest/miniconda.html>`_
.. code-block:: shell

   # Python >= 3.10 is required
   conda create -n dbgpt_env python=3.10
   conda activate dbgpt_env
   # it will take some minutes
   pip install -e ".[default]"

Copy the environment template:

.. code-block:: shell

   cp .env.template .env
2. Deploy LLM Service
---------------------

DB-GPT can be deployed on servers with either low or high hardware requirements.

If your hardware is limited, you can install DB-GPT using a third-party LLM REST API service such as OpenAI, Azure, or Tongyi.
.. tabs::

   .. tab:: OpenAI

      Download the embedding model:

      .. code-block:: shell

         cd DB-GPT
         mkdir models && cd models

         # embedding model
         git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
         # or
         git clone https://huggingface.co/moka-ai/m3e-large

      Configure LLM_MODEL, PROXY_SERVER_URL, and PROXY_API_KEY in the `.env` file:

      .. code-block:: shell

         LLM_MODEL=chatgpt_proxyllm
         PROXY_API_KEY={your-openai-sk}
         PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions

      .. tip::

         Make sure your `.env` configuration is not overwritten.
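      To sanity-check your key before starting DB-GPT, you can call the OpenAI API directly (a quick check that is independent of DB-GPT):

      .. code-block:: shell

         curl https://api.openai.com/v1/chat/completions \
           -H "Content-Type: application/json" \
           -H "Authorization: Bearer {your-openai-sk}" \
           -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello"}]}'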
   .. tab:: Vicuna

      `Vicuna-v1.5 <https://huggingface.co/lmsys/vicuna-13b-v1.5>`_, based on llama-2, has been released; we recommend you set ``LLM_MODEL=vicuna-13b-v1.5`` to try this model.

      .. list-table:: vicuna-v1.5 hardware requirements
         :widths: 50 50 50
         :header-rows: 1

         * - Model
           - Quantize
           - VRAM Size
         * - vicuna-7b-v1.5
           - 4-bit
           - 8 GB
         * - vicuna-7b-v1.5
           - 8-bit
           - 12 GB
         * - vicuna-13b-v1.5
           - 4-bit
           - 12 GB
         * - vicuna-13b-v1.5
           - 8-bit
           - 20 GB
      .. note::

         Make sure you have git-lfs installed.

         CentOS: yum install git-lfs

         Ubuntu: apt-get install git-lfs

         macOS: brew install git-lfs
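      After installing the package, you may also need to run ``git lfs install`` once so the Git hooks are set up (a standard git-lfs step):

      .. code-block:: shell

         git lfs install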
      .. code-block:: shell

         cd DB-GPT
         mkdir models && cd models

         # embedding model
         git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
         # or
         git clone https://huggingface.co/moka-ai/m3e-large

         # llm model (not needed if you use the OpenAI, Azure, or Tongyi LLM API service)
         git clone https://huggingface.co/lmsys/vicuna-13b-v1.5

      The model files are large and will take a long time to download.

      **Configure LLM_MODEL in `.env` file**

      .. code-block:: shell

         LLM_MODEL=vicuna-13b-v1.5
   .. tab:: Baichuan

      .. list-table:: Baichuan hardware requirements
         :widths: 50 50 50
         :header-rows: 1

         * - Model
           - Quantize
           - VRAM Size
         * - baichuan-7b
           - 4-bit
           - 8 GB
         * - baichuan-7b
           - 8-bit
           - 12 GB
         * - baichuan-13b
           - 4-bit
           - 12 GB
         * - baichuan-13b
           - 8-bit
           - 20 GB
      .. note::

         Make sure you have git-lfs installed.

         CentOS: yum install git-lfs

         Ubuntu: apt-get install git-lfs

         macOS: brew install git-lfs

      .. code-block:: shell

         cd DB-GPT
         mkdir models && cd models

         # embedding model
         git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
         # or
         git clone https://huggingface.co/moka-ai/m3e-large

         # llm model
         git clone https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat
         # or
         git clone https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat

      The model files are large and will take a long time to download.

      **Configure LLM_MODEL in `.env` file**

      Please rename the Baichuan model path to "baichuan2-13b" or "baichuan2-7b".

      .. code-block:: shell

         LLM_MODEL=baichuan2-13b
   .. tab:: ChatGLM

      .. note::

         Make sure you have git-lfs installed.

         CentOS: yum install git-lfs

         Ubuntu: apt-get install git-lfs

         macOS: brew install git-lfs

      .. code-block:: shell

         cd DB-GPT
         mkdir models && cd models

         # embedding model
         git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
         # or
         git clone https://huggingface.co/moka-ai/m3e-large

         # llm model
         git clone https://huggingface.co/THUDM/chatglm2-6b

      The model files are large and will take a long time to download.

      **Configure LLM_MODEL in `.env` file**

      Please rename the ChatGLM model path to "chatglm2-6b".

      .. code-block:: shell

         LLM_MODEL=chatglm2-6b
   .. tab:: Other LLM API

      Download the embedding model:

      .. code-block:: shell

         cd DB-GPT
         mkdir models && cd models

         # embedding model
         git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
         # or
         git clone https://huggingface.co/moka-ai/m3e-large

      .. note::

         Supported LLM API providers:

         * OpenAI
         * Azure
         * Aliyun tongyi
         * Baidu wenxin
         * Zhipu
         * Baichuan
         * Bard

      Configure LLM_MODEL, the proxy server URL, and the API key(s) in the `.env` file:

      .. code-block:: shell

         # OpenAI
         LLM_MODEL=chatgpt_proxyllm
         PROXY_API_KEY={your-openai-sk}
         PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions

         # Azure
         LLM_MODEL=chatgpt_proxyllm
         PROXY_API_KEY={your-azure-sk}
         PROXY_API_BASE=https://{your domain}.openai.azure.com/
         PROXY_API_TYPE=azure
         PROXY_SERVER_URL=xxxx
         PROXY_API_VERSION=2023-05-15
         PROXYLLM_BACKEND=gpt-35-turbo

         # Aliyun tongyi
         LLM_MODEL=tongyi_proxyllm
         TONGYI_PROXY_API_KEY={your-tongyi-sk}
         PROXY_SERVER_URL={your_service_url}

         # Baidu wenxin
         LLM_MODEL=wenxin_proxyllm
         PROXY_SERVER_URL={your_service_url}
         WEN_XIN_MODEL_VERSION={version}
         WEN_XIN_API_KEY={your-wenxin-sk}
         WEN_XIN_SECRET_KEY={your-wenxin-sct}

         # Zhipu
         LLM_MODEL=zhipu_proxyllm
         PROXY_SERVER_URL={your_service_url}
         ZHIPU_MODEL_VERSION={version}
         ZHIPU_PROXY_API_KEY={your-zhipu-sk}

         # Baichuan
         LLM_MODEL=bc_proxyllm
         PROXY_SERVER_URL={your_service_url}
         BAICHUN_MODEL_NAME={version}
         BAICHUAN_PROXY_API_KEY={your-baichuan-sk}
         BAICHUAN_PROXY_API_SECRET={your-baichuan-sct}

         # Bard
         LLM_MODEL=bard_proxyllm
         PROXY_SERVER_URL={your_service_url}
         # from https://bard.google.com/: F12 -> Application -> __Secure-1PSID
         BARD_PROXY_API_KEY={your-bard-token}

      .. tip::

         Make sure your `.env` configuration is not overwritten.
   .. tab:: llama.cpp

      DB-GPT already supports `llama.cpp <https://github.com/ggerganov/llama.cpp>`_ via `llama-cpp-python <https://github.com/abetlen/llama-cpp-python>`_.

      **Preparing Model Files**

      To use llama.cpp, you need to prepare a model file in gguf format. There are two common ways to obtain one; choose either:

      **1. Download a pre-converted model file.**

      Suppose you want to use `Vicuna 13B v1.5 <https://huggingface.co/lmsys/vicuna-13b-v1.5>`_: you can download the already-converted file from `TheBloke/vicuna-13B-v1.5-GGUF <https://huggingface.co/TheBloke/vicuna-13B-v1.5-GGUF>`_; only one file is needed. Download it to the `models` directory and rename it to `ggml-model-q4_0.gguf`.

      .. code-block:: shell

         wget https://huggingface.co/TheBloke/vicuna-13B-v1.5-GGUF/resolve/main/vicuna-13b-v1.5.Q4_K_M.gguf -O models/ggml-model-q4_0.gguf

      **2. Convert It Yourself**

      You can convert the model file yourself according to the instructions in `llama.cpp#prepare-data--run <https://github.com/ggerganov/llama.cpp#prepare-data--run>`_, then put the converted file in the models directory and rename it to `ggml-model-q4_0.gguf`. A sketch of this workflow is shown below.
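      For reference, a minimal sketch of that conversion workflow, assuming a llama.cpp checkout with its `convert.py` script and `quantize` tool (script names and paths vary between llama.cpp versions, so follow the linked instructions for your checkout):

      .. code-block:: shell

         # run inside a llama.cpp checkout; assumes the HF model was downloaded to models/vicuna-13b-v1.5
         python convert.py models/vicuna-13b-v1.5/
         # quantize the resulting f16 gguf file to 4-bit
         ./quantize models/vicuna-13b-v1.5/ggml-model-f16.gguf models/ggml-model-q4_0.gguf q4_0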
      **Installing Dependencies**

      llama.cpp is an optional dependency in DB-GPT; you can manually install it using the following command:

      .. code-block:: shell

         pip install -e ".[llama_cpp]"

      **Modifying the Configuration File**

      Next, modify your `.env` file to enable llama.cpp.

      .. code-block:: shell

         LLM_MODEL=llama-cpp
         llama_cpp_prompt_template=vicuna_v1.1

      Then you can run it according to `Run <https://db-gpt.readthedocs.io/en/latest/getting_started/install/deploy/deploy.html#run>`_.

      **More Configurations**

      In DB-GPT, model configuration is done through keys of the form `{model name}_{config key}`.
      .. list-table:: More Configurations
         :widths: 50 50 50
         :header-rows: 1

         * - Environment Variable Key
           - Default
           - Description
         * - llama_cpp_prompt_template
           - None
           - Prompt template name; now supports: zero_shot, vicuna_v1.1, alpaca, llama-2, baichuan-chat, internlm-chat. If None, the prompt template is automatically determined from the model path.
         * - llama_cpp_model_path
           - None
           - Model path.
         * - llama_cpp_n_gpu_layers
           - 1000000000
           - Number of layers to offload to the GPU. Set this to 1000000000 to offload all layers. If your GPU VRAM is not enough, set a lower number, e.g. 10.
         * - llama_cpp_n_threads
           - None
           - Number of threads to use. If None, the number of threads is determined automatically.
         * - llama_cpp_n_batch
           - 512
           - Maximum number of prompt tokens to batch together when calling llama_eval.
         * - llama_cpp_n_gqa
           - None
           - Grouped-query attention. Must be 8 for llama-2 70b.
         * - llama_cpp_rms_norm_eps
           - 5e-06
           - 5e-6 is a good value for llama-2 models.
         * - llama_cpp_cache_capacity
           - None
           - Maximum cache capacity. Examples: 2000MiB, 2GiB.
         * - llama_cpp_prefer_cpu
           - False
           - If a GPU is available it is used by default; set llama_cpp_prefer_cpu=True to prefer the CPU.
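      For example, a sketch of overriding two of these keys in `.env` (the values are illustrative; pick ones that fit your hardware):

      .. code-block:: shell

         # offload only 10 layers to the GPU when VRAM is limited
         llama_cpp_n_gpu_layers=10
         # use 8 CPU threads
         llama_cpp_n_threads=8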
   .. tab:: vllm

      vLLM is a fast and easy-to-use library for LLM inference and serving.

      **Running vLLM**

      **1. Installing Dependencies**

      vLLM is an optional dependency in DB-GPT; you can manually install it using the following command:

      .. code-block:: shell

         pip install -e ".[vllm]"

      **2. Modifying the Configuration File**

      Next, modify your `.env` file to enable vLLM.

      .. code-block:: shell

         LLM_MODEL=vicuna-13b-v1.5
         MODEL_TYPE=vllm
3. Prepare SQL Example (Optional)
---------------------------------

**(Optional) Load examples into SQLite:**

.. code-block:: shell

   bash ./scripts/examples/load_examples.sh

On the Windows platform:

.. code-block:: shell

   .\scripts\examples\load_examples.bat
4. Run DB-GPT Server
--------------------

.. code-block:: shell

   python pilot/server/dbgpt_server.py

**Open http://localhost:5000 with your browser to see the product.**
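Once the server is up, you can quickly confirm that it is listening (assuming the default port 5000):

.. code-block:: shell

   curl -I http://localhost:5000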