Cluster Deployment
==================================

(local-cluster-index)=

## Model cluster deployment

**Installing the Command-Line Tool**

All operations below use the `dbgpt` command. To make it available, install the DB-GPT project with `pip install -e ".[default]"`. Alternatively, you can use `python pilot/scripts/cli_scripts.py` as a substitute for the `dbgpt` command.

### Launch Model Controller

```bash
dbgpt start controller
```

By default, the Model Controller starts on port 8000.
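To confirm the controller is actually listening before you register workers, a plain TCP probe is enough. This is a sketch using bash's built-in `/dev/tcp` pseudo-device; the host and port are the defaults from above, so adjust them to your deployment:

```shell
# Return 0 if HOST:PORT accepts TCP connections, non-zero otherwise.
# The subshell opens (and on exit closes) fd 3 against /dev/tcp/HOST/PORT.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open 127.0.0.1 8000; then
  echo "Model Controller reachable on port 8000"
else
  echo "Model Controller not reachable on port 8000"
fi
```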

### Launch LLM Model Worker

If you are starting `chatglm2-6b`:

```bash
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--port 8001 \
--controller_addr http://127.0.0.1:8000
```

If you are starting `vicuna-13b-v1.5`:

```bash
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
--controller_addr http://127.0.0.1:8000
```

Note: Be sure to use your own model name and model path.
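If you are bringing up several workers on one host, the `--daemon` flag (listed in `dbgpt start worker --help`) lets you background each one. A minimal sketch assuming the model names and paths from the examples above; the dry-run guard exists only so the loop can be run and read without a deployed cluster:

```shell
# Dry-run guard: if dbgpt is not on PATH, just print the commands.
command -v dbgpt >/dev/null 2>&1 || dbgpt() { echo "(dry run) dbgpt $*"; }

PORT=8001
for MODEL in chatglm2-6b vicuna-13b-v1.5; do
  # --daemon backgrounds the worker; each worker needs its own port.
  dbgpt start worker --model_name "$MODEL" \
    --model_path "/app/models/$MODEL" \
    --port "$PORT" \
    --controller_addr http://127.0.0.1:8000 \
    --daemon
  PORT=$((PORT + 1))
done
```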

### Launch Embedding Model Worker

```bash
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
```

Note: Be sure to use your own model name and model path.

Check your model:

```bash
dbgpt model list
```

You will see the following output:

```
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name      | Model Type | Host       | Port | Healthy | Enabled | Prompt Template | Last Heartbeat             |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| chatglm2-6b     | llm        | 172.17.0.2 | 8001 | True    | True    |                 | 2023-09-12T23:04:31.287654 |
| WorkerManager   | service    | 172.17.0.2 | 8001 | True    | True    |                 | 2023-09-12T23:04:31.286668 |
| WorkerManager   | service    | 172.17.0.2 | 8003 | True    | True    |                 | 2023-09-12T23:04:29.845617 |
| WorkerManager   | service    | 172.17.0.2 | 8002 | True    | True    |                 | 2023-09-12T23:04:24.598439 |
| text2vec        | text2vec   | 172.17.0.2 | 8003 | True    | True    |                 | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 | llm        | 172.17.0.2 | 8002 | True    | True    |                 | 2023-09-12T23:04:24.597775 |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
```
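If you script against this table, a small awk filter can pull out the healthy LLM instances. This is a sketch written against the column layout shown above; adjust the field numbers if the table format changes in your version:

```shell
# Count healthy workers whose Model Type column is "llm".
# Splitting on '|' makes $3 the Model Type and $6 the Healthy column.
healthy_llms() {
  awk -F'|' '$3 ~ /llm/ && $6 ~ /True/ { n++ } END { print n+0 }'
}

# Example: dbgpt model list | healthy_llms
```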

### Connect to the model service in the webserver (dbgpt_server)

**First, modify the `.env` file to change the model name and the Model Controller connection address.**

```bash
LLM_MODEL=vicuna-13b-v1.5
# The current default MODEL_SERVER address is the address of the Model Controller
MODEL_SERVER=http://127.0.0.1:8000
```

#### Start the webserver

```bash
dbgpt start webserver --light
```

`--light` indicates not to start the embedded model service.

Alternatively, you can prepend the command with `LLM_MODEL=chatglm2-6b` to start:

```bash
LLM_MODEL=chatglm2-6b dbgpt start webserver --light
```

### More Command-Line Usages

You can view more command-line usages through the help command.

**View the `dbgpt` help**

```bash
dbgpt --help
```

You will see the basic command parameters and usage:

```
Usage: dbgpt [OPTIONS] COMMAND [ARGS]...

Options:
  --log-level TEXT  Log level
  --version         Show the version and exit.
  --help            Show this message and exit.

Commands:
  install    Install dependencies, plugins, etc.
  knowledge  Knowledge command line tool
  model      Clients that manage model serving
  start      Start specific server.
  stop       Stop specific server.
```

**View the `dbgpt start` help**

```bash
dbgpt start --help
```

Here you can see the related commands and usage for start:

```
Usage: dbgpt start [OPTIONS] COMMAND [ARGS]...

  Start specific server.

Options:
  --help  Show this message and exit.

Commands:
  apiserver   Start apiserver(TODO)
  controller  Start model controller
  webserver   Start webserver(dbgpt_server.py)
  worker      Start model worker
```

**View the `dbgpt start worker` help**

```bash
dbgpt start worker --help
```

Here you can see the parameters to start Model Worker:

```
Usage: dbgpt start worker [OPTIONS]

  Start model worker

Options:
  --model_name TEXT               Model name  [required]
  --model_path TEXT               Model path  [required]
  --worker_type TEXT              Worker type
  --worker_class TEXT             Model worker class,
                                  pilot.model.cluster.DefaultModelWorker
  --host TEXT                     Model worker deploy host  [default: 0.0.0.0]
  --port INTEGER                  Model worker deploy port  [default: 8001]
  --daemon                        Run Model Worker in background
  --limit_model_concurrency INTEGER
                                  Model concurrency limit  [default: 5]
  --standalone                    Standalone mode. If True, embedded Run
                                  ModelController
  --register                      Register current worker to model controller
                                  [default: True]
  --worker_register_host TEXT     The ip address of current worker to register
                                  to ModelController. If None, the address is
                                  automatically determined
  --controller_addr TEXT          The Model controller address to register
  --send_heartbeat                Send heartbeat to model controller
                                  [default: True]
  --heartbeat_interval INTEGER    The interval for sending heartbeats
                                  (seconds)  [default: 20]
  --device TEXT                   Device to run model. If None, the device is
                                  automatically determined
  --model_type TEXT               Model type, huggingface, llama.cpp and proxy
                                  [default: huggingface]
  --prompt_template TEXT          Prompt template. If None, the prompt
                                  template is automatically determined from
                                  model path, supported template: zero_shot,vi
                                  cuna_v1.1,llama-2,alpaca,baichuan-chat
  --max_context_size INTEGER      Maximum context size  [default: 4096]
  --num_gpus INTEGER              The number of gpus you expect to use, if it
                                  is empty, use all of them as much as
                                  possible
  --max_gpu_memory TEXT           The maximum memory limit of each GPU, only
                                  valid in multi-GPU configuration
  --cpu_offloading                CPU offloading
  --load_8bit                     8-bit quantization
  --load_4bit                     4-bit quantization
  --quant_type TEXT               Quantization datatypes, `fp4` (four bit
                                  float) and `nf4` (normal four bit float),
                                  only valid when load_4bit=True  [default:
                                  nf4]
  --use_double_quant              Nested quantization, only valid when
                                  load_4bit=True  [default: True]
  --compute_dtype TEXT            Model compute type
  --trust_remote_code             Trust remote code  [default: True]
  --verbose                       Show verbose output.
  --help                          Show this message and exit.
```
|
**View the `dbgpt model`help**
|
|
|
|
```bash
|
|
dbgpt model --help
|
|
```
|
|
|
|
The `dbgpt model ` command can connect to the Model Controller via the Model Controller address and then manage a remote model:
|
|
|
|
```
|
|
Usage: dbgpt model [OPTIONS] COMMAND [ARGS]...
|
|
|
|
Clients that manage model serving
|
|
|
|
Options:
|
|
--address TEXT Address of the Model Controller to connect to. Just support
|
|
light deploy model, If the environment variable
|
|
CONTROLLER_ADDRESS is configured, read from the environment
|
|
variable
|
|
--help Show this message and exit.
|
|
|
|
Commands:
|
|
chat Interact with your bot from the command line
|
|
list List model instances
|
|
restart Restart model instances
|
|
start Start model instances
|
|
stop Stop model instances
|
|
``` |
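Putting the pieces together, the `--address` option makes it possible to manage a remote cluster from another machine. In this sketch, the `--address` option and the subcommand names come from the help output above, but the per-subcommand flags (such as `--model_name` on `restart`) are assumptions, so verify them with `dbgpt model restart --help` on your version; the dry-run guard exists only so the sketch can be run without a deployed cluster:

```shell
# Dry-run guard: print commands instead of failing when dbgpt is absent.
command -v dbgpt >/dev/null 2>&1 || dbgpt() { echo "(dry run) dbgpt $*"; }

CONTROLLER_ADDR=http://127.0.0.1:8000

# List instances registered with a remote controller.
dbgpt model --address "$CONTROLLER_ADDR" list

# Restart one of them (--model_name is a hypothetical flag; check --help).
dbgpt model --address "$CONTROLLER_ADDR" restart --model_name chatglm2-6b
```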