Merge remote-tracking branch 'origin/main' into dbgpt_api

@@ -123,3 +123,8 @@ PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions

# ** SUMMARY_CONFIG
#*******************************************************************#
SUMMARY_CONFIG=FAST

#*******************************************************************#
# ** Multi-GPU
#*******************************************************************#
NUM_GPUS=1
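The new settings follow the file's existing `KEY=VALUE` dotenv convention. As a minimal sketch of how such lines are consumed (the project itself may rely on a dotenv library; this standalone parser is only an illustration):

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines from a .env-style file, skipping blanks and comments."""
    config = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # banner lines like #****# and comments are ignored
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config

env = parse_env_lines([
    "#*******************************************************************#",
    "SUMMARY_CONFIG=FAST",
    "NUM_GPUS=1",
])
print(env)  # {'SUMMARY_CONFIG': 'FAST', 'NUM_GPUS': '1'}
```

Note that values come back as strings; numeric settings such as `NUM_GPUS` still need an explicit `int(...)` at the point of use, which is exactly what the `Config` change further down in this diff does.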
README.md
@@ -16,11 +16,53 @@

## What is DB-GPT?

As large models are released and iterated upon, they are becoming increasingly intelligent. However, when using large models we face significant challenges to data security and privacy: we need to ensure that our sensitive data and environments remain fully under our control, avoiding any data leaks or security risks. To that end, we launched the DB-GPT project to build a complete private large-model solution for all database-based scenarios. The solution supports local deployment, so it can be used not only in standalone private environments but also deployed and isolated per business module, keeping the capability of large models absolutely private, secure, and controllable.

DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.

## News
[](https://star-history.com/#csunny/DB-GPT)

## Demo

Run on an RTX 4090 GPU.

https://github.com/csunny/DB-GPT/assets/13723926/55f31781-1d49-4757-b96e-7ef6d3dbcf80

<!-- <video id="video" controls="" preload="auto" poster="assets/exector_sql.png">
<source id="mp4" src="https://github.com/csunny/DB-GPT/assets/17919400/654b5a49-5ea4-4c02-b5b2-72d089dcc1f0" type="video/mp4">
</video> -->

#### Chat with data and generate charts

<p align="left">
<img src="./assets/dashboard.png" width="800px" />
</p>

#### Text2SQL: generate SQL from chat
<p align="left">
<img src="./assets/chatSQL.png" width="800px" />
</p>

#### Chat with database meta information
<p align="left">
<img src="./assets/chatdb.png" width="800px" />
</p>

#### Chat with data and view execution results
<p align="left">
<img src="./assets/chatdata.png" width="800px" />
</p>

#### Knowledge space to manage docs
<p align="left">
<img src="./assets/ks.png" width="800px" />
</p>

#### Chat with knowledge, such as txt, pdf, csv, word documents, etc.
<p align="left">
<img src="./assets/chat_knowledge.png" width="800px" />
</p>

## Releases
- [2023/07/06]🔥🔥🔥 Brand-new DB-GPT product with a brand-new web UI. [documents](https://db-gpt.readthedocs.io/en/latest/getting_started/getting_started.html)
- [2023/06/25]🔥 Support the chatglm2-6b model. [documents](https://db-gpt.readthedocs.io/en/latest/modules/llms.html)
- [2023/06/14] Support the gpt4all model, which can run on M1/M2 Macs or CPU-only machines. [documents](https://db-gpt.readthedocs.io/en/latest/modules/llms.html)
@@ -31,23 +73,6 @@ DB-GPT is an experimental open-source project that uses localized GPT large mode

- [2023/05/15] Chat with documents. [demo](./assets/new_knownledge_en.gif)
- [2023/05/06] SQL generation and diagnosis. [demo](./assets/demo_en.gif)

## Demo

Run on an RTX 4090 GPU.

<p align="left">
<img src="./assets/dbgpt_demo.gif" width="800px" />
</p>

<!-- <video id="video" controls="" preload="auto" poster="assets/exector_sql.png">
<source id="mp4" src="https://github.com/csunny/DB-GPT/assets/17919400/654b5a49-5ea4-4c02-b5b2-72d089dcc1f0" type="video/mp4">
</video> -->

## Features

Currently, we have released multiple key features, which are listed below to demonstrate our current capabilities:
@@ -75,7 +100,6 @@ Currently, we have released multiple key features, which are listed below to dem

DB-GPT creates a vast model operating system using [FastChat](https://github.com/lm-sys/FastChat) and offers a large language model powered by [Vicuna](https://huggingface.co/Tribbiani/vicuna-7b). In addition, we provide private domain knowledge base question-answering capability. Furthermore, we also provide support for additional plugins, and our design natively supports the Auto-GPT plugin. Our vision is to make it easier and more convenient to build applications around databases and LLMs.

The architecture of the entire DB-GPT is shown in the following figure:

<p align="center">
@@ -126,6 +150,12 @@ This project is standing on the shoulders of giants and is not going to work wit

- Please run `black .` before submitting the code.

## RoadMap

<p align="left">
<img src="./assets/roadmap.jpg" width="800px" />
</p>

## Licence

The MIT License (MIT)
@@ -133,5 +163,5 @@ The MIT License (MIT)

## Contact Information
We are working on building a community; if you have any ideas about building the community, feel free to contact us. [Discord](https://discord.gg/4BNdxm5d)

[](https://star-history.com/#csunny/DB-GPT)
README.zh.md
@@ -19,9 +19,52 @@

DB-GPT is an open-source, database-oriented GPT experiment project. It uses localized GPT large models to interact with your data and environment, with no risk of data leakage and 100% privacy.

[](https://star-history.com/#csunny/DB-GPT)

[DB-GPT video introduction](https://www.bilibili.com/video/BV1SM4y1a7Nj/?buvid=551b023900b290f9497610b2155a2668&is_story_h5=false&mid=%2BVyE%2Fwau5woPcUKieCWS0A%3D%3D&p=1&plat_id=116&share_from=ugc&share_medium=iphone&share_plat=ios&share_session_id=5D08B533-82A4-4D40-9615-7826065B4574&share_source=GENERIC&share_tag=s_i&timestamp=1686307943&unique_k=bhO3lgQ&up_id=31375446)

## Demo

The demo runs on an RTX 4090 GPU.

https://github.com/csunny/DB-GPT/assets/13723926/55f31781-1d49-4757-b96e-7ef6d3dbcf80

#### Generate analysis charts from natural-language conversation

<p align="left">
<img src="./assets/dashboard.png" width="800px" />
</p>

#### Generate SQL from natural-language conversation
<p align="left">
<img src="./assets/chatSQL.png" width="800px" />
</p>

#### Chat with database metadata and generate accurate SQL statements
<p align="left">
<img src="./assets/chatdb.png" width="800px" />
</p>

#### Chat with data and view execution results directly
<p align="left">
<img src="./assets/chatdata.png" width="800px" />
</p>

#### Knowledge base management
<p align="left">
<img src="./assets/ks.png" width="800px" />
</p>

#### Chat over the knowledge base, e.g. pdf, csv, txt, word documents, etc.
<p align="left">
<img src="./assets/chat_knowledge.png" width="800px" />
</p>

## Releases
- [2023/07/06]🔥🔥🔥 Brand-new DB-GPT product. [documents](https://db-gpt.readthedocs.io/projects/db-gpt-docs-zh-cn/zh_CN/latest/getting_started/getting_started.html)
- [2023/06/25]🔥 Support the ChatGLM2-6B model. [documents](https://db-gpt.readthedocs.io/projects/db-gpt-docs-zh-cn/zh_CN/latest/modules/llms.html)
@@ -54,13 +97,6 @@ DB-GPT is an open-source, database-oriented GPT experiment project that uses loc

- Supports multiple large language models; currently Vicuna (7b, 13b), ChatGLM-6b (int4, int8), guanaco (7b, 13b, 33b), and Gorilla (7b, 13b)
- TODO: codet5p, codegen2

## Demo

The demo runs on an RTX 4090 GPU.

<p align="left">
<img src="./assets/dbgpt_demo.gif" width="800px" />
</p>

## Architecture

DB-GPT builds its large-model runtime on [FastChat](https://github.com/lm-sys/FastChat) and provides Vicuna as the base large language model. In addition, it offers private-domain knowledge-base question answering through LangChain. It also supports a plugin mode, natively supporting Auto-GPT plugins by design. Our vision is to make building applications around databases and LLMs simpler and more convenient.
@@ -125,6 +161,14 @@ Run the Python interpreter and type the commands:

This is a sophisticated and innovative tool for databases, and the project is under active development, with new features released continuously. If you run into any specific problems while using it, please open an issue in the project first; if needed, add the WeChat account below and we will do our best to help. Everyone is very welcome to take part in building the project.

# RoadMap

<p align="left">
<img src="./assets/roadmap.jpg" width="800px" />
</p>

## Contact Us
The WeChat group has exceeded the QR-code scan limit for joining; please add the WeChat account below to be invited into the group.
@@ -135,5 +179,3 @@ Run the Python interpreter and type the commands:

The MIT License (MIT)

[](https://star-history.com/#csunny/DB-GPT)
(modified image, before: 745 KiB)
BIN assets/chatSQL.png (new file, 585 KiB)
BIN assets/chat_knowledge.png (new file, 655 KiB)
BIN assets/chatdata.png (new file, 349 KiB)
BIN assets/chatdb.png (new file, 514 KiB)
BIN assets/dashboard.png (new file, 1.2 MiB)
(modified image, before: 35 MiB)
BIN assets/ks.png (new file, 213 KiB)
BIN assets/roadmap.jpg (new file, 68 KiB)
@@ -21,4 +21,4 @@ DB-GPT is divided into several functions, including chat with knowledge base, ex

### Plugins
@@ -20,7 +20,7 @@ python /DB-GPT/pilot/webserver.py
```
- Test Case: Use a histogram to analyze the total order amount of users in different cities.
<p align="center">
<img src="../../assets/chart_db_city_users.png" width="680px" />
<img src="../../assets/dashboard.png" width="680px" />
</p>

- For more detail, see [DB-DASHBOARD](https://github.com/csunny/DB-GPT-Plugins/blob/main/src/dbgpt_plugins/Readme.md)
@@ -28,6 +28,8 @@ class Config(metaclass=Singleton):
        self.skip_reprompt = False
        self.temperature = float(os.getenv("TEMPERATURE", 0.7))

        self.NUM_GPUS = int(os.getenv("NUM_GPUS", 1))

        self.execute_local_commands = (
            os.getenv("EXECUTE_LOCAL_COMMANDS", "False") == "True"
        )
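The `Config` change above reads `NUM_GPUS` from the environment with a default of 1. The same pattern in isolation (the `get_num_gpus` helper name is ours, for illustration only):

```python
import os

def get_num_gpus(default=1):
    """Read NUM_GPUS from the environment, falling back to a default."""
    return int(os.getenv("NUM_GPUS", default))

os.environ.pop("NUM_GPUS", None)   # unset: default applies
print(get_num_gpus())              # 1
os.environ["NUM_GPUS"] = "4"       # set: the string is coerced to int
print(get_num_gpus())              # 4
```

One caveat of this idiom: a non-numeric value such as `NUM_GPUS=auto` raises `ValueError` at startup rather than being silently ignored.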
@@ -73,6 +73,40 @@ class VicunaLLMAdapater(BaseLLMAdaper):
        )
        return model, tokenizer


def auto_configure_device_map(num_gpus):
    """Handle multi-GPU calls by building a device map.

    transformer.word_embeddings occupies 1 layer slot,
    transformer.final_layernorm and lm_head occupy 1 more,
    and transformer.layers occupies 28, for a total of 30
    slots to allocate across num_gpus cards.
    """
    num_trans_layers = 28
    per_gpu_layers = 30 / num_gpus
    # Bugfix: on Linux, torch.embedding can receive a weight and an input on
    # different devices, raising a RuntimeError. On Windows, model.device is set
    # to transformer.word_embeddings.device; on Linux it is set to
    # lm_head.device. When chat or stream_chat is called, input_ids are placed
    # on model.device, so if transformer.word_embeddings.device and model.device
    # differ, a RuntimeError follows. We therefore pin transformer.word_embeddings,
    # transformer.final_layernorm, and lm_head all to the first card.
    device_map = {
        'transformer.embedding.word_embeddings': 0,
        'transformer.encoder.final_layernorm': 0,
        'transformer.output_layer': 0,
        'transformer.rotary_pos_emb': 0,
        'lm_head': 0,
    }

    used = 2  # GPU 0 already carries the pinned modules, counted as 2 slots
    gpu_target = 0
    for i in range(num_trans_layers):
        if used >= per_gpu_layers:
            gpu_target += 1
            used = 0
        assert gpu_target < num_gpus
        device_map[f'transformer.encoder.layers.{i}'] = gpu_target
        used += 1

    return device_map
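The allocation above can be checked without any GPUs, since the function only builds a module-name-to-device dictionary. Copying the hunk's logic verbatim and running it for two cards shows the split: GPU 0 gets the five pinned modules plus the first 13 transformer layers, GPU 1 the remaining 15.

```python
def auto_configure_device_map(num_gpus):
    """Build a ChatGLM2-style device map: pinned modules on GPU 0, 28 layers spread out."""
    num_trans_layers = 28
    per_gpu_layers = 30 / num_gpus
    device_map = {
        'transformer.embedding.word_embeddings': 0,
        'transformer.encoder.final_layernorm': 0,
        'transformer.output_layer': 0,
        'transformer.rotary_pos_emb': 0,
        'lm_head': 0,
    }
    used = 2  # slots already consumed on GPU 0 by the pinned modules
    gpu_target = 0
    for i in range(num_trans_layers):
        if used >= per_gpu_layers:
            gpu_target += 1
            used = 0
        assert gpu_target < num_gpus
        device_map[f'transformer.encoder.layers.{i}'] = gpu_target
        used += 1
    return device_map

dm = auto_configure_device_map(2)
print(dm['transformer.encoder.layers.12'])  # 0  (last layer on the first card)
print(dm['transformer.encoder.layers.13'])  # 1  (first layer on the second card)
print(len(dm))                              # 33 (5 pinned modules + 28 layers)
```

The `assert gpu_target < num_gpus` guard means a bad layer/slot budget fails loudly at map-construction time rather than at dispatch time.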


class ChatGLMAdapater(BaseLLMAdaper):
    """LLM Adapter for THUDM/chatglm-6b"""
@@ -80,7 +114,7 @@ class ChatGLMAdapater(BaseLLMAdaper):
     def match(self, model_path: str):
         return "chatglm" in model_path

-    def loader(self, model_path: str, from_pretrained_kwargs: dict):
+    def loader(self, model_path: str, from_pretrained_kwargs: dict, device_map=None, num_gpus=CFG.NUM_GPUS):
         tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

         if DEVICE != "cuda":
@@ -91,11 +125,22 @@ class ChatGLMAdapater(BaseLLMAdaper):
         else:
             model = (
                 AutoModel.from_pretrained(
-                    model_path, trust_remote_code=True, **from_pretrained_kwargs
+                    model_path, trust_remote_code=True,
+                    **from_pretrained_kwargs
                 )
                 .half()
-                .cuda()
+                # .cuda()
             )
+            from accelerate import dispatch_model
+
+            if device_map is None:
+                device_map = auto_configure_device_map(num_gpus)
+
+            model = dispatch_model(model, device_map=device_map)
+
+            return model, tokenizer