doc:llm use faq (#489)

1. add difference between dbgpt_server and dbgpt_server --light
magic.chen
2023-08-28 15:15:17 +08:00
committed by GitHub
3 changed files with 95 additions and 50 deletions


@@ -7,15 +7,27 @@ LLM_MODEL=proxyllm
````
set your OpenAI API key
````shell
PROXY_API_KEY={your-openai-sk}
PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
````
make sure your OpenAI API key is valid and available
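A quick way to confirm the key works is to call the endpoint directly; a minimal sketch using `curl`, assuming your key is exported as `PROXY_API_KEY`:
````shell
# Send a one-off chat request to verify the key and endpoint are reachable
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PROXY_API_KEY" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "hello"}]}'
````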
##### Q2 What is the difference between `python dbgpt_server --light` and `python dbgpt_server`?
```{note}
* `python dbgpt_server --light`: dbgpt_server does not start the LLM service. Users can deploy the LLM service separately with `python llmserver`, and dbgpt_server accesses it through the `LLM_SERVER` environment variable set in `.env`. The purpose is to allow the separate deployment of dbgpt's backend service and LLM service.
* `python dbgpt_server`: the dbgpt_server service and the LLM service are deployed on the same instance. When dbgpt_server starts, it also starts the LLM service.
```
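For a split deployment, the rough flow looks like the sketch below; the `llmserver` path and the `LLM_SERVER` URL are illustrative, so substitute your actual script location and host/port:
````shell
# Terminal 1: start the LLM service on its own
python pilot/server/llmserver.py

# In .env, point dbgpt_server at the running LLM service
# LLM_SERVER=http://127.0.0.1:8000   # illustrative value

# Terminal 2: start the backend without an embedded LLM
python pilot/server/dbgpt_server.py --light
````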
##### Q3 How to use multiple GPUs
DB-GPT uses all available GPUs by default. You can modify the setting `CUDA_VISIBLE_DEVICES=0,1` in the `.env` file
to use specific GPU IDs.
Optionally, you can also specify the GPU IDs before the start command, as shown below:
@@ -29,7 +41,7 @@ CUDA_VISIBLE_DEVICES=3,4,5,6 python3 pilot/server/dbgpt_server.py
You can modify the setting `MAX_GPU_MEMORY=xxGib` in the `.env` file to configure the maximum memory used by each GPU.
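Putting the two settings together, a possible `.env` fragment (the values are examples only):
````shell
# .env: restrict DB-GPT to GPUs 0 and 1, capping each at 16 GiB
CUDA_VISIBLE_DEVICES=0,1
MAX_GPU_MEMORY=16Gib
````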
##### Q4 Not Enough Memory
DB-GPT supports 8-bit and 4-bit quantization.
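If your version exposes quantization through `.env`, enabling it might look like the sketch below; `QUANTIZE_8bit` is assumed from the project's `.env.template`, so verify the flag name in your release:
````shell
# .env: enable 8-bit quantization to reduce GPU memory usage
# (flag name assumed from .env.template; check your version)
QUANTIZE_8bit=True
````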