feat(model): Support deploy rerank model (#1522)

Fangyin Cheng
2024-05-16 14:50:16 +08:00
committed by GitHub
parent 559affe87d
commit 593e974405
29 changed files with 814 additions and 75 deletions


@@ -55,7 +55,6 @@ curl http://127.0.0.1:8100/api/v1/embeddings \
```
## Verify via OpenAI SDK
```bash
@@ -72,3 +71,49 @@ completion = openai.ChatCompletion.create(
print(completion.choices[0].message.content)
```
## (Experimental) Rerank Open API
The rerank API is an experimental feature that can be used to rerank a list of candidate documents against a query.
1. Use cURL to verify the rerank API.
```bash
curl http://127.0.0.1:8100/api/v1/beta/relevance \
-H "Authorization: Bearer EMPTY" \
-H "Content-Type: application/json" \
-d '{
"model": "bge-reranker-base",
"query": "what is awel talk about?",
"documents": [
"Agentic Workflow Expression Language(AWEL) is a set of intelligent agent workflow expression language specially designed for large model application development.",
"Autonomous agents have long been a research focus in academic and industry communities",
"AWEL is divided into three levels in deign, namely the operator layer, AgentFream layer and DSL layer.",
"Elon musk is a famous entrepreneur and inventor, he is the founder of SpaceX and Tesla."
]
}'
```
2. Use Python to verify the rerank API.
```python
from dbgpt.rag.embedding import OpenAPIRerankEmbeddings
rerank = OpenAPIRerankEmbeddings(api_key="EMPTY", model_name="bge-reranker-base")
rerank.predict(
query="what is awel talk about?",
candidates=[
"Agentic Workflow Expression Language(AWEL) is a set of intelligent agent workflow expression language specially designed for large model application development.",
"Autonomous agents have long been a research focus in academic and industry communities",
"AWEL is divided into three levels in deign, namely the operator layer, AgentFream layer and DSL layer.",
"Elon musk is a famous entrepreneur and inventor, he is the founder of SpaceX and Tesla."
]
)
```
The output is as follows:
```python
[
0.9685816764831543,
3.7338297261158004e-05,
0.03692878410220146,
3.73825132555794e-05
]
```
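The scores are returned in the same order as the input candidates, so they can be paired with the documents to produce a ranked list. Below is a minimal post-processing sketch, assuming the `rerank` instance from the example above and a `candidates` list holding the same four document strings (the variable names are illustrative, not part of the API):
```python
# Rank the candidate documents by relevance score, highest first.
# `rerank` is the OpenAPIRerankEmbeddings instance created above and
# `candidates` is assumed to hold the same four document strings.
scores = rerank.predict(query="what is awel talk about?", candidates=candidates)
ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
for doc, score in ranked:
    print(f"{score:.6f}  {doc[:60]}")
```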


@@ -3,21 +3,21 @@
## Install command line tools
All of the following operations are performed with the `dbgpt` command. To use the `dbgpt` command, you first need to install the `DB-GPT` project. You can install it with the following command:
```python
```shell
$ pip install -e ".[default]"
```
It can also be run in script mode:
```python
```shell
$ python pilot/scripts/cli_scripts.py
```
## Start Model Controller
```python
```shell
$ dbgpt start controller
```
## View log
```python
```shell
$ docker logs db-gpt-webserver-1 -f
```
By default, the `Model Server` will start on port `8000`.
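Before starting any model workers, you may want to confirm that the controller is reachable on that port. Here is a minimal sketch using a plain TCP check; it makes no assumptions about the controller's HTTP endpoints:
```python
# Verify that something is accepting connections on the controller's default port.
# This only checks TCP reachability; it does not call any controller API.
import socket

with socket.create_connection(("127.0.0.1", 8000), timeout=3):
    print("Model Controller is reachable on 127.0.0.1:8000")
```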
@@ -28,7 +28,7 @@ By default, `Model Server` will start on port `8000`
Start `chatglm2-6b` model Worker
:::
```python
```shell
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--port 8001 \
@@ -40,7 +40,7 @@ dbgpt start worker --model_name chatglm2-6b \
Start `vicuna-13b-v1.5` model Worker
:::
```python
```shell
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
@@ -52,9 +52,9 @@ dbgpt start worker --model_name vicuna-13b-v1.5 \
:::
## Start the embedding model service
## Start Embedding Model Worker
```python
```shell
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
@@ -66,24 +66,41 @@ dbgpt start worker --model_name text2vec \
:::
## Start Reranking Model Worker
```shell
dbgpt start worker --worker_type text2vec \
--rerank \
--model_path /app/models/bge-reranker-base \
--model_name bge-reranker-base \
--port 8004 \
--controller_addr http://127.0.0.1:8000
```
:::info note
⚠️ Make sure to use your own model name and model path.
:::
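Once the reranker worker has registered with the controller, it can be exercised through the OpenAPI-compatible API server described earlier, using the same `OpenAPIRerankEmbeddings` helper. A minimal sketch, assuming such an API server is running on `127.0.0.1:8100` in front of this cluster; the `api_url` argument and its value are assumptions, so adjust them to your deployment:
```python
# Query the deployed bge-reranker-base model through the OpenAPI-compatible
# rerank endpoint. The api_url value is an assumption about where the API
# server is exposed; change it to match your own deployment.
from dbgpt.rag.embedding import OpenAPIRerankEmbeddings

rerank = OpenAPIRerankEmbeddings(
    api_url="http://127.0.0.1:8100/api/v1/beta/relevance",  # assumed endpoint
    api_key="EMPTY",
    model_name="bge-reranker-base",
)
scores = rerank.predict(
    query="what is awel talk about?",
    candidates=["AWEL is an agent workflow expression language."],
)
print(scores)
```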
:::tip
View and inspect deployed models
:::
```python
```shell
$ dbgpt model list
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name | Model Type | Host | Port | Healthy | Enabled | Prompt Template | Last Heartbeat |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| chatglm2-6b | llm | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.287654 |
| WorkerManager | service | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.286668 |
| WorkerManager | service | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.845617 |
| WorkerManager | service | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.598439 |
| text2vec | text2vec | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 | llm | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.597775 |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
+-------------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name | Model Type | Host | Port | Healthy | Enabled | Prompt Template | Last Heartbeat |
+-------------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| chatglm2-6b | llm | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.287654 |
| WorkerManager | service | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.286668 |
| WorkerManager | service | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.845617 |
| WorkerManager | service | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.598439 |
| WorkerManager | service | 172.21.0.5 | 8004 | True | True | | 2023-09-12T23:04:24.598439 |
| text2vec | text2vec | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 | llm | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.597775 |
| bge-reranker-base | text2vec | 172.21.0.5 | 8004 | True | True | | 2024-05-15T11:36:12.935012 |
+-------------------+------------+------------+------+---------+---------+-----------------+----------------------------+
```
@@ -91,13 +108,13 @@ $ dbgpt model list
The model service deployed above can be used through `dbgpt_server`. First, modify the `.env` configuration file to change the model connection address.
```python
```shell
dbgpt start webserver --light
```
## Start Webserver
```python
```shell
LLM_MODEL=vicuna-13b-v1.5
# The current default MODEL_SERVER address is the address of the Model Controller
MODEL_SERVER=http://127.0.0.1:8000
@@ -106,7 +123,7 @@ MODEL_SERVER=http://127.0.0.1:8000
Alternatively, it can be started directly from the command line, specifying the model:
```python
```shell
LLM_MODEL=chatglm2-6b dbgpt start webserver --light
```
@@ -118,7 +135,7 @@ For more information about the use of the command line, you can view the command
View dbgpt help `dbgpt --help`
:::
```python
```shell
dbgpt --help
Already connect 'dbgpt'
@@ -143,7 +160,7 @@ Commands:
Check the dbgpt start command `dbgpt start --help`
:::
```python
```shell
dbgpt start --help
Already connect 'dbgpt'
@@ -166,7 +183,7 @@ Commands:
View the help for starting a model worker with `dbgpt start worker --help`
:::
```python
```shell
dbgpt start worker --help
Already connect 'dbgpt'
@@ -239,7 +256,7 @@ Options:
View dbgpt model service related commands `dbgpt model --help`
:::
```python
```shell
dbgpt model --help