feat(model): multi-model supports embedding model and simple component design implementation

2025-09-05 11:01:09 +00:00 · 2023-09-13 12:14:03 +08:00
parent 68d30dd4bb
commit 581cf361bf
47 changed files with 1050 additions and 211 deletions
--- a/docs/getting_started/install.rst
+++ b/docs/getting_started/install.rst
@@ -9,6 +9,7 @@ DB-GPT product is a Web application that you can chat database, chat knowledge,
 - docker
 - docker_compose
 - environment
+- cluster deployment
 - deploy_faq

 .. toctree::
@@ -20,6 +21,7 @@ DB-GPT product is a Web application that you can chat database, chat knowledge,
   ./install/deploy/deploy.md
   ./install/docker/docker.md
   ./install/docker_compose/docker_compose.md
+   ./install/cluster/cluster.rst
   ./install/llm/llm.rst
   ./install/environment/environment.md
   ./install/faq/deploy_faq.md
--- a/docs/getting_started/install/cluster/cluster.rst
+++ b/docs/getting_started/install/cluster/cluster.rst
@@ -0,0 +1,19 @@
+Cluster deployment
+==================================
+
+To run DB-GPT across multiple nodes, deploy it as a cluster. The cluster architecture is shown below:
+
+.. raw:: html
+
+    <img src="../../../_static/img/muti-model-cluster-overview.png" />
+
+
+* :ref:`Deploying on a local machine <local-cluster-index>`. Local cluster deployment.
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Cluster deployment
+   :name: cluster_deploy
+   :hidden:
+
+   ./vms/index.md
--- a/docs/getting_started/install/cluster/kubernetes/index.md
+++ b/docs/getting_started/install/cluster/kubernetes/index.md
@@ -0,0 +1,3 @@
+Kubernetes cluster deployment
+==================================
+(kubernetes-cluster-index)=
--- a/docs/getting_started/install/llm/cluster/model_cluster.md
+++ b/docs/getting_started/install/llm/cluster/model_cluster.md
@@ -1,6 +1,6 @@
-Cluster deployment
+Local cluster deployment
 ==================================
-
+(local-cluster-index)=
 ## Model cluster deployment


@@ -17,7 +17,7 @@ dbgpt start controller
 By default, the Model Controller starts on port 8000.


-### Launch Model Worker
+### Launch LLM Model Worker

 If you are starting `chatglm2-6b`:

@@ -39,6 +39,18 @@ dbgpt start worker --model_name vicuna-13b-v1.5 \

 Note: Be sure to use your own model name and model path.

+### Launch Embedding Model Worker
+
+```bash
+
+dbgpt start worker --model_name text2vec \
+--model_path /app/models/text2vec-large-chinese \
+--worker_type text2vec \
+--port 8003 \
+--controller_addr http://127.0.0.1:8000
+```
+
+Note: Be sure to use your own model name and model path.

 Check your model:

@@ -51,8 +63,12 @@ You will see the following output:
 +-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
 |    Model Name   | Model Type |    Host    | Port | Healthy | Enabled | Prompt Template |       Last Heartbeat       |
 +-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
-|   chatglm2-6b   |    llm     | 172.17.0.6 | 8001 |   True  |   True  |       None      | 2023-08-31T04:48:45.252939 |
-| vicuna-13b-v1.5 |    llm     | 172.17.0.6 | 8002 |   True  |   True  |       None      | 2023-08-31T04:48:55.136676 |
+|   chatglm2-6b   |    llm     | 172.17.0.2 | 8001 |   True  |   True  |                 | 2023-09-12T23:04:31.287654 |
+|  WorkerManager  |  service   | 172.17.0.2 | 8001 |   True  |   True  |                 | 2023-09-12T23:04:31.286668 |
+|  WorkerManager  |  service   | 172.17.0.2 | 8003 |   True  |   True  |                 | 2023-09-12T23:04:29.845617 |
+|  WorkerManager  |  service   | 172.17.0.2 | 8002 |   True  |   True  |                 | 2023-09-12T23:04:24.598439 |
+|     text2vec    |  text2vec  | 172.17.0.2 | 8003 |   True  |   True  |                 | 2023-09-12T23:04:29.844796 |
+| vicuna-13b-v1.5 |    llm     | 172.17.0.2 | 8002 |   True  |   True  |                 | 2023-09-12T23:04:24.597775 |
 +-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
 ```

@@ -69,7 +85,7 @@ MODEL_SERVER=http://127.0.0.1:8000
 #### Start the webserver

 ```bash
-python pilot/server/dbgpt_server.py --light
+dbgpt start webserver --light
 ```

 `--light`  indicates not to start the embedded model service.
@@ -77,7 +93,7 @@ python pilot/server/dbgpt_server.py --light
 Alternatively, you can prepend the command with `LLM_MODEL=chatglm2-6b` to start:

 ```bash
-LLM_MODEL=chatglm2-6b python pilot/server/dbgpt_server.py --light
+LLM_MODEL=chatglm2-6b dbgpt start webserver --light
 ```


@@ -101,9 +117,11 @@ Options:
  --help            Show this message and exit.

 Commands:
-  model  Clients that manage model serving
-  start  Start specific server.
-  stop   Start specific server.
+  install    Install dependencies, plugins, etc.
+  knowledge  Knowledge command line tool
+  model      Clients that manage model serving
+  start      Start specific server.
+  stop       Start specific server.
 ```

 **View the `dbgpt start` help**
@@ -146,10 +164,11 @@ Options:
  --model_name TEXT               Model name  [required]
  --model_path TEXT               Model path  [required]
  --worker_type TEXT              Worker type
-  --worker_class TEXT             Model worker class, pilot.model.worker.defau
-                                  lt_worker.DefaultModelWorker
+  --worker_class TEXT             Model worker class,
+                                  pilot.model.cluster.DefaultModelWorker
  --host TEXT                     Model worker deploy host  [default: 0.0.0.0]
-  --port INTEGER                  Model worker deploy port  [default: 8000]
+  --port INTEGER                  Model worker deploy port  [default: 8001]
+  --daemon                        Run Model Worker in background
  --limit_model_concurrency INTEGER
                                  Model concurrency limit  [default: 5]
  --standalone                    Standalone mode. If True, embedded Run
@@ -166,7 +185,7 @@ Options:
                                  (seconds)  [default: 20]
  --device TEXT                   Device to run model. If None, the device is
                                  automatically determined
-  --model_type TEXT               Model type, huggingface or llama.cpp
+  --model_type TEXT               Model type, huggingface, llama.cpp and proxy
                                  [default: huggingface]
  --prompt_template TEXT          Prompt template. If None, the prompt
                                  template is automatically determined from
@@ -190,7 +209,7 @@ Options:
  --compute_dtype TEXT            Model compute type
  --trust_remote_code             Trust remote code  [default: True]
  --verbose                       Show verbose output.
-  --help                          Show this message and exit.
+  --help                          Show this message and exit. 
 ```

 **View the `dbgpt model` help**
@@ -208,10 +227,13 @@ Usage: dbgpt model [OPTIONS] COMMAND [ARGS]...

 Options:
  --address TEXT  Address of the Model Controller to connect to. Just support
-                  light deploy model  [default: http://127.0.0.1:8000]
+                  light deploy model, If the environment variable
+                  CONTROLLER_ADDRESS is configured, read from the environment
+                  variable
  --help          Show this message and exit.

 Commands:
+  chat     Interact with your bot from the command line
  list     List model instances
  restart  Restart model instances
  start    Start model instances
--- a/docs/getting_started/install/llm/llm.rst
+++ b/docs/getting_started/install/llm/llm.rst
@@ -6,6 +6,7 @@ DB-GPT provides a management and deployment solution for multiple models. This c


 Multi LLMs Support, Supports multiple large language models, currently supporting
+  - 🔥 Baichuan2(7b,13b)
  - 🔥 Vicuna-v1.5(7b,13b)
  - 🔥 llama-2(7b,13b,70b)
  - WizardLM-v1.2(13b)
@@ -19,7 +20,6 @@ Multi LLMs Support, Supports multiple large language models, currently supportin

 - llama_cpp
 - quantization
- cluster deployment

 .. toctree::
   :maxdepth: 2
@@ -29,4 +29,3 @@ Multi LLMs Support, Supports multiple large language models, currently supportin

   ./llama/llama_cpp.md
   ./quantization/quantization.md
-   ./cluster/model_cluster.md
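
Review note: the new "Launch Embedding Model Worker" section differs from the LLM worker launch mainly in the extra `--worker_type text2vec` flag. A minimal sketch of that difference follows, using only the flags and values shown in this diff. The helper name `build_worker_command` is hypothetical and not part of DB-GPT; it only assembles an argument list and does not start anything.

```python
import shlex
from typing import List, Optional


def build_worker_command(
    model_name: str,
    model_path: str,
    port: int,
    controller_addr: str,
    worker_type: Optional[str] = None,
) -> List[str]:
    """Assemble an argv for `dbgpt start worker` (hypothetical helper)."""
    argv = [
        "dbgpt", "start", "worker",
        "--model_name", model_name,
        "--model_path", model_path,
    ]
    # In the documented commands, --worker_type appears only for the
    # embedding worker; the LLM worker examples omit it.
    if worker_type is not None:
        argv += ["--worker_type", worker_type]
    argv += ["--port", str(port), "--controller_addr", controller_addr]
    return argv


# Values copied from the embedding worker example in model_cluster.md.
embedding_cmd = build_worker_command(
    model_name="text2vec",
    model_path="/app/models/text2vec-large-chinese",
    port=8003,
    controller_addr="http://127.0.0.1:8000",
    worker_type="text2vec",
)

# Print a shell-ready command line for inspection.
print(shlex.join(embedding_cmd))
```

Calling the helper without `worker_type` produces an LLM-style command with no `--worker_type` flag. As the documentation notes, substitute your own model name and model path.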