Cluster Deployment
==================================
(local-cluster-index)=
## Model cluster deployment
**Installing Command-Line Tool**
All operations below use the `dbgpt` command, which becomes available after you install the DB-GPT project with `pip install -e ".[default]"`. Alternatively, you can run `python pilot/scripts/cli_scripts.py` in place of the `dbgpt` command.
### Launch Model Controller
```bash
dbgpt start controller
```
By default, the Model Controller starts on port 8000.
### Launch LLM Model Worker
If you are starting `chatglm2-6b`:
```bash
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--port 8001 \
--controller_addr http://127.0.0.1:8000
```
If you are starting `vicuna-13b-v1.5`:
```bash
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
--controller_addr http://127.0.0.1:8000
```
Note: Be sure to use your own model name and model path.
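When several workers share one controller, the launch commands differ only in model name, model path, and port. The sketch below is a hypothetical dry-run helper (`worker_cmd` is not part of DB-GPT) that prints the command for each model, using only the flags shown above. It assumes the controller address from this guide as a fallback and that names and paths contain no spaces.

```bash
#!/bin/sh
# Hypothetical helper (not part of DB-GPT): print, but do not run, the worker
# launch command for one model. Only flags shown in this guide are used.
worker_cmd() {
    model_name=$1
    model_path=$2
    port=$3
    # Assumption: fall back to the controller address used in this guide.
    controller_addr=${4:-http://127.0.0.1:8000}
    printf 'dbgpt start worker --model_name %s --model_path %s --port %s --controller_addr %s\n' \
        "$model_name" "$model_path" "$port" "$controller_addr"
}

# Dry run: review the printed commands, then run them yourself.
worker_cmd chatglm2-6b /app/models/chatglm2-6b 8001
worker_cmd vicuna-13b-v1.5 /app/models/vicuna-13b-v1.5 8002
```

Printing the command instead of executing it keeps the helper safe to try on a machine where the models are not yet downloaded.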
### Launch Embedding Model Worker
```bash
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
```
Note: Be sure to use your own model name and model path.
Check that your models are registered:
```bash
dbgpt model list
```
You will see the following output:
```
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
|    Model Name   | Model Type |    Host    | Port | Healthy | Enabled | Prompt Template |       Last Heartbeat       |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
|   chatglm2-6b   |     llm    | 172.17.0.2 | 8001 |   True  |   True  |                 | 2023-09-12T23:04:31.287654 |
|  WorkerManager  |   service  | 172.17.0.2 | 8001 |   True  |   True  |                 | 2023-09-12T23:04:31.286668 |
|  WorkerManager  |   service  | 172.17.0.2 | 8003 |   True  |   True  |                 | 2023-09-12T23:04:29.845617 |
|  WorkerManager  |   service  | 172.17.0.2 | 8002 |   True  |   True  |                 | 2023-09-12T23:04:24.598439 |
|     text2vec    |  text2vec  | 172.17.0.2 | 8003 |   True  |   True  |                 | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 |     llm    | 172.17.0.2 | 8002 |   True  |   True  |                 | 2023-09-12T23:04:24.597775 |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
```
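If you want a script to confirm that every registered instance is healthy, one option is to filter the table text. The following is a minimal sketch, assuming the column order of the sample output above (Healthy is the fifth column); `unhealthy_models` is a hypothetical helper name, and the table layout may differ in other DB-GPT versions.

```bash
#!/bin/sh
# Hypothetical helper: read the `dbgpt model list` table on stdin and print
# "name:port" for every row whose Healthy column is not "True".
# Assumes the column order shown in the sample output above.
unhealthy_models() {
    awk -F'|' '
        NF >= 9 {
            name = $2; port = $5; healthy = $6
            # Trim the padding spaces around each cell.
            gsub(/^ +| +$/, "", name)
            gsub(/^ +| +$/, "", port)
            gsub(/^ +| +$/, "", healthy)
            # Skip the header row; border rows have no "|" and fail NF >= 9.
            if (name != "Model Name" && healthy != "True") print name ":" port
        }'
}

# Usage (requires a running controller):
#   dbgpt model list | unhealthy_models
```

An empty result means every listed instance reported itself as healthy at its last heartbeat.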
### Connect to the model service in the webserver (dbgpt_server)
**First, modify the `.env` file to change the model name and the Model Controller connection address.**
```bash
LLM_MODEL=vicuna-13b-v1.5
# The current default MODEL_SERVER address is the address of the Model Controller
MODEL_SERVER=http://127.0.0.1:8000
```
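If you switch models often, the `.env` edit can be scripted. The sketch below assumes a plain `KEY=value` file; `set_env_var` is a hypothetical helper, not a DB-GPT command. It removes any existing assignment of the key and appends the new one, so the key moves to the end of the file while comments and other settings are kept.

```bash
#!/bin/sh
# Hypothetical helper: set KEY=value in a .env-style file.
# Any existing assignment of KEY is dropped and the new one is appended.
set_env_var() {
    file=$1
    key=$2
    value=$3
    touch "$file"
    tmp=$(mktemp)
    # grep -v exits non-zero when no lines remain; that is not an error here.
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file so its permissions are preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, `set_env_var .env LLM_MODEL vicuna-13b-v1.5` followed by `set_env_var .env MODEL_SERVER http://127.0.0.1:8000` would produce the two settings shown above.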
#### Start the webserver
```bash
dbgpt start webserver --light
```
The `--light` flag tells the webserver not to start the embedded model service.
Alternatively, you can set the model for a single run by prefixing the command with `LLM_MODEL=chatglm2-6b`:
```bash
LLM_MODEL=chatglm2-6b dbgpt start webserver --light
```
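The `VAR=value command` form is standard shell behavior: the assignment is placed in the environment of that one command only and does not change the variable in your current shell. The short demonstration below uses `sh -c` as a stand-in for the webserver process; it shows the shell semantics only, not how DB-GPT reads the variable.

```bash
#!/bin/sh
# Value that the current shell session exports.
LLM_MODEL=vicuna-13b-v1.5
export LLM_MODEL

# The prefixed assignment is visible only to this one child process.
child_value=$(LLM_MODEL=chatglm2-6b sh -c 'printf %s "$LLM_MODEL"')

echo "child process saw: $child_value"
echo "shell still has:   $LLM_MODEL"
```

This is why the prefixed form is convenient for trying a different model without editing `.env` or affecting later commands.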
### More Command-Line Usage
Use the `--help` option to view more command-line usage.
**View the `dbgpt` help**
```bash
dbgpt --help
```
You will see the basic command parameters and usage:
```
Usage: dbgpt [OPTIONS] COMMAND [ARGS]...

Options:
  --log-level TEXT  Log level
  --version         Show the version and exit.
  --help            Show this message and exit.

Commands:
  install    Install dependencies, plugins, etc.
  knowledge  Knowledge command line tool
  model      Clients that manage model serving
  start      Start specific server.
  stop       Start specific server.
```
**View the `dbgpt start` help**
```bash
dbgpt start --help
```
Here you can see the related commands and usage for start:
```
Usage: dbgpt start [OPTIONS] COMMAND [ARGS]...

  Start specific server.

Options:
  --help  Show this message and exit.

Commands:
  apiserver   Start apiserver(TODO)
  controller  Start model controller
  webserver   Start webserver(dbgpt_server.py)
  worker      Start model worker
```
**View the `dbgpt start worker` help**
```bash
dbgpt start worker --help
```
Here you can see the parameters to start Model Worker:
```
Usage: dbgpt start worker [OPTIONS]

  Start model worker

Options:
  --model_name TEXT               Model name  [required]
  --model_path TEXT               Model path  [required]
  --worker_type TEXT              Worker type
  --worker_class TEXT             Model worker class,
                                  pilot.model.cluster.DefaultModelWorker
  --host TEXT                     Model worker deploy host  [default: 0.0.0.0]
  --port INTEGER                  Model worker deploy port  [default: 8001]
  --daemon                        Run Model Worker in background
  --limit_model_concurrency INTEGER
                                  Model concurrency limit  [default: 5]
  --standalone                    Standalone mode. If True, embedded Run
                                  ModelController
  --register                      Register current worker to model controller
                                  [default: True]
  --worker_register_host TEXT     The ip address of current worker to register
                                  to ModelController. If None, the address is
                                  automatically determined
  --controller_addr TEXT          The Model controller address to register
  --send_heartbeat                Send heartbeat to model controller
                                  [default: True]
  --heartbeat_interval INTEGER    The interval for sending heartbeats
                                  (seconds)  [default: 20]
  --device TEXT                   Device to run model. If None, the device is
                                  automatically determined
  --model_type TEXT               Model type, huggingface, llama.cpp and proxy
                                  [default: huggingface]
  --prompt_template TEXT          Prompt template. If None, the prompt
                                  template is automatically determined from
                                  model path, supported template: zero_shot,vi
                                  cuna_v1.1,llama-2,alpaca,baichuan-chat
  --max_context_size INTEGER      Maximum context size  [default: 4096]
  --num_gpus INTEGER              The number of gpus you expect to use, if it
                                  is empty, use all of them as much as
                                  possible
  --max_gpu_memory TEXT           The maximum memory limit of each GPU, only
                                  valid in multi-GPU configuration
  --cpu_offloading                CPU offloading
  --load_8bit                     8-bit quantization
  --load_4bit                     4-bit quantization
  --quant_type TEXT               Quantization datatypes, `fp4` (four bit
                                  float) and `nf4` (normal four bit float),
                                  only valid when load_4bit=True  [default:
                                  nf4]
  --use_double_quant              Nested quantization, only valid when
                                  load_4bit=True  [default: True]
  --compute_dtype TEXT            Model compute type
  --trust_remote_code             Trust remote code  [default: True]
  --verbose                       Show verbose output.
  --help                          Show this message and exit.
```
**View the `dbgpt model` help**
```bash
dbgpt model --help
```
The `dbgpt model` command connects to the Model Controller at the given address and manages the remote models registered there:
```
Usage: dbgpt model [OPTIONS] COMMAND [ARGS]...

  Clients that manage model serving

Options:
  --address TEXT  Address of the Model Controller to connect to. Just support
                  light deploy model, If the environment variable
                  CONTROLLER_ADDRESS is configured, read from the environment
                  variable
  --help          Show this message and exit.

Commands:
  chat     Interact with your bot from the command line
  list     List model instances
  restart  Restart model instances
  start    Start model instances
  stop     Stop model instances
```
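According to the help text above, the controller address can come from `--address` or from the `CONTROLLER_ADDRESS` environment variable. The sketch below makes that choice explicit in a wrapper script. `controller_address` is a hypothetical helper, and the fallback `http://127.0.0.1:8000` is the controller address used in this guide, assumed here rather than taken from the CLI's own default.

```bash
#!/bin/sh
# Hypothetical helper: pick the controller address for `dbgpt model`.
# Uses CONTROLLER_ADDRESS when set; otherwise falls back to the address used
# in this guide (an assumption, not the CLI's documented default).
controller_address() {
    printf '%s\n' "${CONTROLLER_ADDRESS:-http://127.0.0.1:8000}"
}

if command -v dbgpt >/dev/null 2>&1; then
    # --address is an option of the `model` group, so it precedes the subcommand.
    dbgpt model --address "$(controller_address)" list
else
    echo "dbgpt not found; would run: dbgpt model --address $(controller_address) list"
fi
```

Passing the address explicitly makes it clear which controller a script targets when several clusters are reachable from the same machine.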