diff --git a/.env.template b/.env.template
index f3370e229..3ff2d9077 100644
--- a/.env.template
+++ b/.env.template
@@ -123,3 +123,8 @@ PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
# ** SUMMARY_CONFIG
#*******************************************************************#
SUMMARY_CONFIG=FAST
+
+#*******************************************************************#
+# ** Multi-GPU
+#*******************************************************************#
+NUM_GPUS=1
diff --git a/README.md b/README.md
index 1817b4c75..9538418ca 100644
--- a/README.md
+++ b/README.md
@@ -16,11 +16,53 @@
## What is DB-GPT?
-As large models are released and iterated upon, they are becoming increasingly intelligent. However, in the process of using large models, we face significant challenges in data security and privacy. We need to ensure that our sensitive data and environments remain completely controlled and avoid any data privacy leaks or security risks. Based on this, we have launched the DB-GPT project to build a complete private large model solution for all database-based scenarios. This solution supports local deployment, allowing it to be applied not only in independent private environments but also to be independently deployed and isolated according to business modules, ensuring that the ability of large models is absolutely private, secure, and controllable.
-
DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
-## News
+[](https://star-history.com/#csunny/DB-GPT)
+
+## Demo
+
+Run on an RTX 4090 GPU.
+
+https://github.com/csunny/DB-GPT/assets/13723926/55f31781-1d49-4757-b96e-7ef6d3dbcf80
+
+
+
+
+#### Chat with data, and generate charts.
+
+
+
+
+
+#### Text2SQL: generate SQL from chat.
+
+
+
+
+#### Chat with database metadata.
+
+
+
+
+#### Chat with data, and view execution results.
+
+
+
+
+#### Knowledge space to manage docs.
+
+
+
+
+#### Chat with knowledge files, such as TXT, PDF, CSV, Word, etc.
+
+
+
+
+## Releases
- [2023/07/06]🔥🔥🔥Brand-new DB-GPT product with a brand-new web UI. [documents](https://db-gpt.readthedocs.io/en/latest/getting_started/getting_started.html)
- [2023/06/25]🔥support chatglm2-6b model. [documents](https://db-gpt.readthedocs.io/en/latest/modules/llms.html)
- [2023/06/14] support gpt4all model, which can run at M1/M2, or cpu machine. [documents](https://db-gpt.readthedocs.io/en/latest/modules/llms.html)
@@ -31,23 +73,6 @@ DB-GPT is an experimental open-source project that uses localized GPT large mode
- [2023/05/15] Chat with documents. [demo](./assets/new_knownledge_en.gif)
- [2023/05/06] SQL generation and diagnosis. [demo](./assets/demo_en.gif)
-## Demo
-
-Run on an RTX 4090 GPU.
-
-
-
-
-
-
-
-
-
-
-
-
## Features
Currently, we have released multiple key features, which are listed below to demonstrate our current capabilities:
@@ -75,7 +100,6 @@ Currently, we have released multiple key features, which are listed below to dem
DB-GPT creates a vast model operating system using [FastChat](https://github.com/lm-sys/FastChat) and offers a large language model powered by [Vicuna](https://huggingface.co/Tribbiani/vicuna-7b). In addition, we provide private domain knowledge base question-answering capability. Furthermore, we also provide support for additional plugins, and our design natively supports the Auto-GPT plugin.Our vision is to make it easier and more convenient to build applications around databases and llm.
-
Is the architecture of the entire DB-GPT shown in the following figure:
@@ -126,6 +150,12 @@ This project is standing on the shoulders of giants and is not going to work wit
- Please run `black .` before submitting the code.
+## Roadmap
+
+
+
+
+
## Licence
The MIT License (MIT)
@@ -133,5 +163,5 @@ The MIT License (MIT)
## Contact Information
We are working on building a community, if you have any ideas about building the community, feel free to contact us. [Discord](https://discord.gg/4BNdxm5d)
-[](https://star-history.com/#csunny/DB-GPT)
+
diff --git a/README.zh.md b/README.zh.md
index f6d124d50..0bc8e75a4 100644
--- a/README.zh.md
+++ b/README.zh.md
@@ -19,9 +19,52 @@
DB-GPT 是一个开源的以数据库为基础的GPT实验项目,使用本地化的GPT大模型与您的数据和环境进行交互,无数据泄露风险,100% 私密
+[](https://star-history.com/#csunny/DB-GPT)
+
+
[DB-GPT视频介绍](https://www.bilibili.com/video/BV1SM4y1a7Nj/?buvid=551b023900b290f9497610b2155a2668&is_story_h5=false&mid=%2BVyE%2Fwau5woPcUKieCWS0A%3D%3D&p=1&plat_id=116&share_from=ugc&share_medium=iphone&share_plat=ios&share_session_id=5D08B533-82A4-4D40-9615-7826065B4574&share_source=GENERIC&share_tag=s_i×tamp=1686307943&unique_k=bhO3lgQ&up_id=31375446)
+## Demo
+
+The demo runs on an RTX 4090 GPU.
+
+
+https://github.com/csunny/DB-GPT/assets/13723926/55f31781-1d49-4757-b96e-7ef6d3dbcf80
+
+#### Generate analysis charts from natural-language chat
+
+
+
+
+
+
+#### Generate SQL from natural-language chat
+
+
+
+
+#### Chat with database metadata to generate accurate SQL
+
+
+
+
+
+#### Chat with data and view execution results directly
+
+
+
+
+#### Knowledge base management
+
+
+
+
+#### Chat over the knowledge base, e.g. PDF, CSV, TXT, Word, etc.
+
+
+
+
## 最新发布
- [2023/07/06]🔥🔥🔥 全新的DB-GPT产品。 [使用文档](https://db-gpt.readthedocs.io/projects/db-gpt-docs-zh-cn/zh_CN/latest/getting_started/getting_started.html)
- [2023/06/25]🔥 支持ChatGLM2-6B模型。 [使用文档](https://db-gpt.readthedocs.io/projects/db-gpt-docs-zh-cn/zh_CN/latest/modules/llms.html)
@@ -54,13 +97,6 @@ DB-GPT 是一个开源的以数据库为基础的GPT实验项目,使用本地
- 支持多种大语言模型, 当前已支持Vicuna(7b,13b), ChatGLM-6b(int4, int8), guanaco(7b,13b,33b), Gorilla(7b,13b)
- TODO: codet5p, codegen2
-## 效果演示
-
-示例通过 RTX 4090 GPU 演示
-
-
-
-
## 架构方案
DB-GPT基于 [FastChat](https://github.com/lm-sys/FastChat) 构建大模型运行环境,并提供 vicuna 作为基础的大语言模型。此外,我们通过LangChain提供私域知识库问答能力。同时我们支持插件模式, 在设计上原生支持Auto-GPT插件。我们的愿景是让围绕数据库和LLM构建应用程序更加简便和便捷。
@@ -125,6 +161,14 @@ Run the Python interpreter and type the commands:
这是一个用于数据库的复杂且创新的工具, 我们的项目也在紧急的开发当中, 会陆续发布一些新的feature。如在使用当中有任何具体问题, 优先在项目下提issue, 如有需要, 请联系如下微信,我会尽力提供帮助,同时也非常欢迎大家参与到项目建设中。
+
+
+## Roadmap
+
+
+
+
+
## 联系我们
微信群已超扫码加群上限, 进群请添加如下微信帮拉进群。
@@ -135,5 +179,3 @@ Run the Python interpreter and type the commands:
The MIT License (MIT)
-[](https://star-history.com/#csunny/DB-GPT)
-
diff --git a/assets/chart_db_city_users.png b/assets/chart_db_city_users.png
deleted file mode 100644
index 13ccf1753..000000000
Binary files a/assets/chart_db_city_users.png and /dev/null differ
diff --git a/assets/chatSQL.png b/assets/chatSQL.png
new file mode 100644
index 000000000..2d35a1185
Binary files /dev/null and b/assets/chatSQL.png differ
diff --git a/assets/chat_knowledge.png b/assets/chat_knowledge.png
new file mode 100644
index 000000000..c83217708
Binary files /dev/null and b/assets/chat_knowledge.png differ
diff --git a/assets/chatdata.png b/assets/chatdata.png
new file mode 100644
index 000000000..6cb483073
Binary files /dev/null and b/assets/chatdata.png differ
diff --git a/assets/chatdb.png b/assets/chatdb.png
new file mode 100644
index 000000000..2f547db25
Binary files /dev/null and b/assets/chatdb.png differ
diff --git a/assets/dashboard.png b/assets/dashboard.png
new file mode 100644
index 000000000..1b883b4ea
Binary files /dev/null and b/assets/dashboard.png differ
diff --git a/assets/dbgpt_demo.gif b/assets/dbgpt_demo.gif
deleted file mode 100644
index 6defb2e26..000000000
Binary files a/assets/dbgpt_demo.gif and /dev/null differ
diff --git a/assets/ks.png b/assets/ks.png
new file mode 100644
index 000000000..b17ff5609
Binary files /dev/null and b/assets/ks.png differ
diff --git a/assets/roadmap.jpg b/assets/roadmap.jpg
new file mode 100644
index 000000000..4b845dd75
Binary files /dev/null and b/assets/roadmap.jpg differ
diff --git a/docs/getting_started/tutorials.md b/docs/getting_started/tutorials.md
index fae15c47a..2c5ef87c3 100644
--- a/docs/getting_started/tutorials.md
+++ b/docs/getting_started/tutorials.md
@@ -21,4 +21,4 @@ DB-GPT is divided into several functions, including chat with knowledge base, ex

### Plugins
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/docs/use_cases/tool_use_with_plugin.md b/docs/use_cases/tool_use_with_plugin.md
index aeb1c637d..705062c00 100644
--- a/docs/use_cases/tool_use_with_plugin.md
+++ b/docs/use_cases/tool_use_with_plugin.md
@@ -20,7 +20,7 @@ python /DB-GPT/pilot/webserver.py
```
- Test Case: Use a histogram to analyze the total order amount of users in different cities.
-
+
- More detail see: [DB-DASHBOARD](https://github.com/csunny/DB-GPT-Plugins/blob/main/src/dbgpt_plugins/Readme.md)
diff --git a/pilot/configs/config.py b/pilot/configs/config.py
index 804176357..1c8026df5 100644
--- a/pilot/configs/config.py
+++ b/pilot/configs/config.py
@@ -28,6 +28,8 @@ class Config(metaclass=Singleton):
self.skip_reprompt = False
self.temperature = float(os.getenv("TEMPERATURE", 0.7))
+        self.NUM_GPUS = int(os.getenv("NUM_GPUS", 1))
+
self.execute_local_commands = (
os.getenv("EXECUTE_LOCAL_COMMANDS", "False") == "True"
)
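The `Config` change above can be exercised on its own. A minimal sketch (the helper name `read_num_gpus` is illustrative; only the `NUM_GPUS` variable name comes from this patch): `os.getenv` hands back a string when the variable is set, so the `int(...)` wrapper is what makes the setting numeric.

```python
import os


def read_num_gpus() -> int:
    # os.getenv returns a string when the variable is set; the fallback 1
    # mirrors the default added to .env.template.
    return int(os.getenv("NUM_GPUS", 1))


os.environ.pop("NUM_GPUS", None)
default_gpus = read_num_gpus()     # unset -> falls back to the default, 1

os.environ["NUM_GPUS"] = "4"
configured_gpus = read_num_gpus()  # set -> parsed from the environment, 4
```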
diff --git a/pilot/model/adapter.py b/pilot/model/adapter.py
index 2c1089cec..16179af6b 100644
--- a/pilot/model/adapter.py
+++ b/pilot/model/adapter.py
@@ -73,6 +73,40 @@ class VicunaLLMAdapater(BaseLLMAdaper):
)
return model, tokenizer
+def auto_configure_device_map(num_gpus):
+    """Spread the ChatGLM model layers across num_gpus GPUs."""
+    # transformer.embedding.word_embeddings occupies one slot,
+    # transformer.encoder.final_layernorm plus lm_head occupy one slot, and
+    # transformer.encoder.layers contributes 28 layers, so 30 slots in total
+    # are distributed across the available cards.
+    num_trans_layers = 28
+    per_gpu_layers = 30 / num_gpus
+    # Bugfix: on Linux, torch.embedding raises a RuntimeError when the weight
+    # and the input sit on different devices. On Windows, model.device is set
+    # to transformer.word_embeddings.device; on Linux it is lm_head.device,
+    # and chat/stream_chat place input_ids on model.device. Pinning the word
+    # embeddings, the final layernorm, and lm_head to the first card keeps
+    # everything on one device and avoids the error.
+ device_map = {
+ 'transformer.embedding.word_embeddings': 0,
+ 'transformer.encoder.final_layernorm': 0,
+ 'transformer.output_layer': 0,
+ 'transformer.rotary_pos_emb': 0,
+ 'lm_head': 0
+ }
+
+    used = 2  # GPU 0 already holds the two slots pinned above
+ gpu_target = 0
+ for i in range(num_trans_layers):
+ if used >= per_gpu_layers:
+ gpu_target += 1
+ used = 0
+ assert gpu_target < num_gpus
+ device_map[f'transformer.encoder.layers.{i}'] = gpu_target
+ used += 1
+
+ return device_map
+
class ChatGLMAdapater(BaseLLMAdaper):
"""LLM Adatpter for THUDM/chatglm-6b"""
@@ -80,7 +114,7 @@ class ChatGLMAdapater(BaseLLMAdaper):
def match(self, model_path: str):
return "chatglm" in model_path
- def loader(self, model_path: str, from_pretrained_kwargs: dict):
+ def loader(self, model_path: str, from_pretrained_kwargs: dict, device_map=None, num_gpus=CFG.NUM_GPUS):
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
if DEVICE != "cuda":
@@ -91,11 +125,22 @@ class ChatGLMAdapater(BaseLLMAdaper):
else:
model = (
AutoModel.from_pretrained(
- model_path, trust_remote_code=True, **from_pretrained_kwargs
+ model_path, trust_remote_code=True,
+ **from_pretrained_kwargs
)
.half()
- .cuda()
)
+            # Spread the model across the available GPUs instead of placing
+            # the whole model on a single device with .cuda().
+            from accelerate import dispatch_model
+
+ if device_map is None:
+ device_map = auto_configure_device_map(num_gpus)
+
+ model = dispatch_model(model, device_map=device_map)
+
return model, tokenizer
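The allocation loop added in this patch can be checked without any GPU. The sketch below reproduces the `auto_configure_device_map` logic as a standalone function (the module names are the ChatGLM2-6B ones used in the diff) so the resulting split can be inspected:

```python
def auto_configure_device_map(num_gpus: int) -> dict:
    """Standalone reproduction of the patch's device-map allocation logic."""
    num_trans_layers = 28
    per_gpu_layers = 30 / num_gpus  # 28 encoder layers + 2 pinned slots

    # Embeddings, final layernorm, and the output head stay on GPU 0 so that
    # input_ids (placed on model.device) share a device with the embedding
    # weights.
    device_map = {
        "transformer.embedding.word_embeddings": 0,
        "transformer.encoder.final_layernorm": 0,
        "transformer.output_layer": 0,
        "transformer.rotary_pos_emb": 0,
        "lm_head": 0,
    }

    used = 2  # GPU 0 already holds the two pinned slots
    gpu_target = 0
    for i in range(num_trans_layers):
        if used >= per_gpu_layers:
            gpu_target += 1
            used = 0
        assert gpu_target < num_gpus
        device_map[f"transformer.encoder.layers.{i}"] = gpu_target
        used += 1

    return device_map


single = auto_configure_device_map(1)  # every module lands on GPU 0
dual = auto_configure_device_map(2)    # layers 0-12 on GPU 0, 13-27 on GPU 1
```

With two cards, GPU 0 ends up with the five pinned modules plus encoder layers 0-12 (15 of the 30 notional slots) and GPU 1 gets encoder layers 13-27.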