doc: update deploy doc

This commit is contained in:
aries_ckt
2023-11-03 10:24:49 +08:00
parent 606d384a55
commit 9cc6386301
6 changed files with 682 additions and 178 deletions


@@ -18,7 +18,7 @@ DB-GPT is a Web application where you can chat with your database and your knowledge base,
:name: deploy
:hidden:
./install/deploy/deploy.md
./install/deploy.rst
./install/docker/docker.md
./install/docker_compose/docker_compose.md
./install/cluster/cluster.rst


@@ -0,0 +1,438 @@
.. _installation:
Installation From Source
========================
To get started, install DB-GPT with the following steps.
DB-GPT can be deployed on servers with low hardware requirements or on servers with high hardware requirements.
If your hardware is limited, you can install DB-GPT using a third-party LLM REST API service such as OpenAI or Azure; otherwise, you can deploy a local LLM service by downloading an LLM model.
1. Preparation
-----------------
**Download DB-GPT**
.. code-block:: shell
git clone https://github.com/eosphoros-ai/DB-GPT.git
**Install Miniconda**
We use SQLite as the default database, so no database installation is needed. If you choose to connect to another database, you can follow our tutorial for installation and configuration.
For the entire DB-GPT installation process, we use a miniconda3 virtual environment. Create a virtual environment and install the Python dependencies.
`How to install Miniconda <https://docs.conda.io/en/latest/miniconda.html>`_
.. code-block:: shell
# Requires Python >= 3.10
conda create -n dbgpt_env python=3.10
conda activate dbgpt_env
# this will take a few minutes
pip install -e ".[default]"
Create your configuration file from the template:
.. code-block:: shell
cp .env.template .env
2. Deploy LLM Service
---------------------
DB-GPT can be deployed on servers with either low or high hardware requirements.
If your hardware is limited, you can install DB-GPT using a third-party LLM REST API service such as OpenAI, Azure, or Tongyi.
.. tabs::
.. tab:: OpenAI
Download embedding model
.. code-block:: shell
cd DB-GPT
mkdir models && cd models
# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
Configure LLM_MODEL, PROXY_SERVER_URL, and PROXY_API_KEY in the `.env` file
.. code-block:: shell
LLM_MODEL=chatgpt_proxyllm
PROXY_API_KEY={your-openai-sk}
PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
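Before starting DB-GPT, you may want to verify that the key works against the standard OpenAI chat completions endpoint. A minimal sketch (substitute your own key):
.. code-block:: shell
curl https://api.openai.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {your-openai-sk}" \
-d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "ping"}]}'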
.. tip::
Make sure your .env configuration is not overwritten
.. tab:: Vicuna
`Vicuna-v1.5 <https://huggingface.co/lmsys/vicuna-13b-v1.5>`_, based on llama-2, has been released; we recommend you set `LLM_MODEL=vicuna-13b-v1.5` to try this model.
.. list-table:: vicuna-v1.5 hardware requirements
:widths: 50 50 50
:header-rows: 1
* - Model
- Quantize
- VRAM Size
* - vicuna-7b-v1.5
- 4-bit
- 8 GB
* - vicuna-7b-v1.5
- 8-bit
- 12 GB
* - vicuna-13b-v1.5
- 4-bit
- 12 GB
* - vicuna-13b-v1.5
- 8-bit
- 20 GB
.. note::
Make sure you have installed git-lfs:
centos: yum install git-lfs
ubuntu: apt-get install git-lfs
macos: brew install git-lfs
.. code-block:: shell
cd DB-GPT
mkdir models && cd models
# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
# llm model (not needed if you use the OpenAI, Azure, or Tongyi API service)
git clone https://huggingface.co/lmsys/vicuna-13b-v1.5
The model files are large and will take a long time to download.
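If a clone of this size is interrupted, git-lfs lets you fetch the repository metadata first and the weights separately, which is easier to resume. A hedged sketch of that approach (the same idea applies to the other models below):
.. code-block:: shell
# clone without the large weight files, then pull them with git-lfs
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/lmsys/vicuna-13b-v1.5
cd vicuna-13b-v1.5
git lfs pull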
**Configure LLM_MODEL in the `.env` file**
.. code-block:: shell
LLM_MODEL=vicuna-13b-v1.5
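The value of LLM_MODEL is expected to match a directory name under `models`, so a quick check that the layout is right (assuming the steps above) might look like:
.. code-block:: shell
ls models
# expected output: text2vec-large-chinese  vicuna-13b-v1.5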
.. tab:: Baichuan
.. list-table:: Baichuan hardware requirements
:widths: 50 50 50
:header-rows: 1
* - Model
- Quantize
- VRAM Size
* - baichuan-7b
- 4-bit
- 8 GB
* - baichuan-7b
- 8-bit
- 12 GB
* - baichuan-13b
- 4-bit
- 12 GB
* - baichuan-13b
- 8-bit
- 20 GB
.. note::
Make sure you have installed git-lfs:
centos: yum install git-lfs
ubuntu: apt-get install git-lfs
macos: brew install git-lfs
.. code-block:: shell
cd DB-GPT
mkdir models && cd models
# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
#### llm model
git clone https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat
or
git clone https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat
The model files are large and will take a long time to download.
**Configure LLM_MODEL in the `.env` file**
Please rename the Baichuan model directory to "baichuan2-13b" or "baichuan2-7b".
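For example, assuming you cloned Baichuan2-13B-Chat into the `models` directory, the rename might look like:
.. code-block:: shell
mv models/Baichuan2-13B-Chat models/baichuan2-13b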
.. code-block:: shell
LLM_MODEL=baichuan2-13b
.. tab:: ChatGLM
.. note::
Make sure you have installed git-lfs:
centos: yum install git-lfs
ubuntu: apt-get install git-lfs
macos: brew install git-lfs
.. code-block:: shell
cd DB-GPT
mkdir models && cd models
# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
#### llm model
git clone https://huggingface.co/THUDM/chatglm2-6b
The model files are large and will take a long time to download.
**Configure LLM_MODEL in the `.env` file**
Please make sure the ChatGLM model directory is named "chatglm2-6b" (the clone above already produces this name).
.. code-block:: shell
LLM_MODEL=chatglm2-6b
.. tab:: Other LLM API
Download embedding model
.. code-block:: shell
cd DB-GPT
mkdir models && cd models
# embedding model
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
# or
git clone https://huggingface.co/moka-ai/m3e-large
.. note::
* OpenAI
* Azure
* Aliyun tongyi
* Baidu wenxin
* Zhipu
* Baichuan
* Bard
Configure LLM_MODEL, PROXY_SERVER_URL, and the provider's API key in the `.env` file
.. code-block:: shell
#OpenAI
LLM_MODEL=chatgpt_proxyllm
PROXY_API_KEY={your-openai-sk}
PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
#Azure
LLM_MODEL=chatgpt_proxyllm
PROXY_API_KEY={your-azure-sk}
PROXY_API_BASE=https://{your domain}.openai.azure.com/
PROXY_API_TYPE=azure
PROXY_SERVER_URL=xxxx
PROXY_API_VERSION=2023-05-15
PROXYLLM_BACKEND=gpt-35-turbo
#Aliyun tongyi
LLM_MODEL=tongyi_proxyllm
TONGYI_PROXY_API_KEY={your-tongyi-sk}
PROXY_SERVER_URL={your_service_url}
## Baidu wenxin
LLM_MODEL=wenxin_proxyllm
PROXY_SERVER_URL={your_service_url}
WEN_XIN_MODEL_VERSION={version}
WEN_XIN_API_KEY={your-wenxin-sk}
WEN_XIN_SECRET_KEY={your-wenxin-sct}
## Zhipu
LLM_MODEL=zhipu_proxyllm
PROXY_SERVER_URL={your_service_url}
ZHIPU_MODEL_VERSION={version}
ZHIPU_PROXY_API_KEY={your-zhipu-sk}
## Baichuan
LLM_MODEL=bc_proxyllm
PROXY_SERVER_URL={your_service_url}
BAICHUN_MODEL_NAME={version}
BAICHUAN_PROXY_API_KEY={your-baichuan-sk}
BAICHUAN_PROXY_API_SECRET={your-baichuan-sct}
## bard
LLM_MODEL=bard_proxyllm
PROXY_SERVER_URL={your_service_url}
# From https://bard.google.com/ : press F12 -> Application -> Cookies -> __Secure-1PSID
BARD_PROXY_API_KEY={your-bard-token}
.. tip::
Make sure your .env configuration is not overwritten
.. tab:: llama.cpp
DB-GPT already supports `llama.cpp <https://github.com/ggerganov/llama.cpp>`_ via `llama-cpp-python <https://github.com/abetlen/llama-cpp-python>`_.
**Preparing Model Files**
To use llama.cpp, you need to prepare a model file in gguf format. There are two common ways to obtain one; choose either:
**1. Download a pre-converted model file.**
Suppose you want to use `Vicuna 13B v1.5 <https://huggingface.co/lmsys/vicuna-13b-v1.5>`_. You can download the already-converted file from `TheBloke/vicuna-13B-v1.5-GGUF <https://huggingface.co/TheBloke/vicuna-13B-v1.5-GGUF>`_; only one file is needed. Download it to the `models` directory and rename it to `ggml-model-q4_0.gguf`.
.. code-block::
wget https://huggingface.co/TheBloke/vicuna-13B-v1.5-GGUF/resolve/main/vicuna-13b-v1.5.Q4_K_M.gguf -O models/ggml-model-q4_0.gguf
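After the download, you can check that the file arrived intact; the expected size and checksum are listed on the TheBloke model page. A minimal sketch:
.. code-block:: shell
ls -lh models/ggml-model-q4_0.gguf
sha256sum models/ggml-model-q4_0.gguf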
**2. Convert It Yourself**
You can convert the model file yourself following the instructions in `llama.cpp#prepare-data--run <https://github.com/ggerganov/llama.cpp#prepare-data--run>`_, then put the converted file in the models directory and rename it to `ggml-model-q4_0.gguf`.
**Installing Dependencies**
llama.cpp is an optional dependency in DB-GPT, and you can manually install it using the following command:
.. code-block::
pip install -e ".[llama_cpp]"
**Modifying the Configuration File**
Next, you can directly modify your `.env` file to enable llama.cpp.
.. code-block::
LLM_MODEL=llama-cpp
llama_cpp_prompt_template=vicuna_v1.1
Then you can run it according to `Run <https://db-gpt.readthedocs.io/en/latest/getting_started/install/deploy/deploy.html#run>`_.
**More Configurations**
In DB-GPT, the model configuration can be done through `{model name}_{config key}`.
.. list-table:: More Configurations
:widths: 50 50 50
:header-rows: 1
* - Environment Variable Key
- Default
- Description
* - llama_cpp_prompt_template
- None
- Prompt template name. Now supports: zero_shot, vicuna_v1.1, alpaca, llama-2, baichuan-chat, internlm-chat. If None, the prompt template is automatically determined from the model path.
* - llama_cpp_model_path
- None
- Model path
* - llama_cpp_n_gpu_layers
- 1000000000
- Number of layers to offload to the GPU. Set this to 1000000000 to offload all layers. If your GPU VRAM is not enough, set a lower number, e.g. 10.
* - llama_cpp_n_threads
- None
- Number of threads to use. If None, the number of threads is automatically determined
* - llama_cpp_n_batch
- 512
- Maximum number of prompt tokens to batch together when calling llama_eval
* - llama_cpp_n_gqa
- None
- Grouped-query attention. Must be 8 for llama-2 70b.
* - llama_cpp_rms_norm_eps
- 5e-06
- 5e-6 is a good value for llama-2 models.
* - llama_cpp_cache_capacity
- None
- Maximum cache capacity. Examples: 2000MiB, 2GiB
* - llama_cpp_prefer_cpu
- False
- If a GPU is available it is preferred by default; set this to True to force CPU inference.
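Putting a few of these keys together, a hypothetical `.env` fragment for a machine with limited VRAM might look like:
.. code-block:: shell
LLM_MODEL=llama-cpp
llama_cpp_prompt_template=vicuna_v1.1
llama_cpp_n_gpu_layers=10
llama_cpp_cache_capacity=2GiB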
.. tab:: vllm
vLLM is a fast and easy-to-use library for LLM inference and serving.
**Running vLLM**
**1. Installing Dependencies**
vLLM is an optional dependency in DB-GPT, and you can manually install it using the following command:
.. code-block::
pip install -e ".[vllm]"
**2. Modifying the Configuration File**
Next, you can directly modify your `.env` file to enable vLLM.
.. code-block::
LLM_MODEL=vicuna-13b-v1.5
MODEL_TYPE=vllm
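You can confirm the optional dependency was installed correctly before starting the server; a minimal sketch:
.. code-block:: shell
python -c "import vllm; print(vllm.__version__)"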
3. Prepare SQL example (Optional)
---------------------------------
**(Optional) Load examples into SQLite**
.. code-block:: shell
bash ./scripts/examples/load_examples.sh
On the Windows platform:
.. code-block:: shell
.\scripts\examples\load_examples.bat
4. Run the DB-GPT server
------------------------
.. code-block:: shell
python pilot/server/dbgpt_server.py
**Open http://localhost:5000 in your browser to see the product.**
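Once the server is up, a quick reachability check from another terminal (assuming the default port 5000) might look like:
.. code-block:: shell
curl -I http://localhost:5000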