docs: readme update & contact (#1097)

2025-09-16 06:30:02 +00:00 · 2024-01-22 09:54:26 +08:00
parent 4f833634df
commit 1484981b72
6 changed files with 96 additions and 199 deletions
--- a/README.md
+++ b/README.md
@@ -33,42 +33,71 @@
  </p>


-[**简体中文**](README.zh.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Documents**](https://docs.dbgpt.site) | [**Wechat**](https://github.com/eosphoros-ai/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Community**](https://github.com/eosphoros-ai/community) | [**Paper**](https://arxiv.org/pdf/2312.17449.pdf)
+[**简体中文**](README.zh.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Documents**](https://docs.dbgpt.site) | [**WeChat**](https://github.com/eosphoros-ai/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Community**](https://github.com/eosphoros-ai/community) | [**Paper**](https://arxiv.org/pdf/2312.17449.pdf)

 </div>

 ## What is DB-GPT?

-DB-GPT is an open-source framework designed for the realm of large language models (LLMs) within the database field. Its primary purpose is to provide infrastructure that simplifies and streamlines the development of database-related applications. This is accomplished through the development of various technical capabilities, including:
+DB-GPT is an open-source framework for large models in the data domain. It aims to provide infrastructure for large model applications by developing a range of technical capabilities, including multi-model management, Text2SQL performance optimization, a RAG framework and its optimization, and multi-agent framework collaboration. These capabilities are intended to make it simpler to build large model applications around databases.

-1. **SMMF(Service-oriented Multi-model Management Framework)**
-2. **Text2SQL Fine-tuning**
-3. **RAG(Retrieval Augmented Generation) framework and optimization**
-4. **Data-Driven Agents framework collaboration**
-5. **GBI(Generative Business intelligence)**
-
-DB-GPT simplifies the creation of these applications based on large language models (LLMs) and databases. 
-
-In the era of Data 3.0, enterprises and developers can take the ability to create customized applications with minimal coding, which harnesses the power of large language models (LLMs) and databases.
+In the Data 3.0 era, enterprises and developers can build their own bespoke applications on top of models and databases with less code.

+### Data Agents
+![data agents](https://github.com/eosphoros-ai/DB-GPT/assets/17919400/ced393b4-9180-437a-90c5-b43633cda8cb)

 ## Contents
- [Install](#install)
- [Demo](#demo)
 - [Introduction](#introduction)
+- [Install](#install)
 - [Features](#features)
 - [Contribution](#contribution)
- [Roadmap](#roadmap)
 - [Contact](#contact-information)

-[DB-GPT Youtube Video](https://www.youtube.com/watch?v=f5_g0OObZBQ)
+## Introduction 
+The architecture of DB-GPT is shown in the following figure:

-## Demo
-##### Chat Data
-![chatdata](https://github.com/eosphoros-ai/DB-GPT/assets/13723926/1f77079e-d018-4eee-982b-9b6a66bf1063)
+<p align="center">
+  <img src="./assets/dbgpt.png" width="800" />
+</p>

-##### Chat Excel
-![excel](https://github.com/eosphoros-ai/DB-GPT/assets/13723926/3044e83b-a71e-41fe-a1e2-98e479e0ab59)
+The core capabilities include the following parts:
+
+- **RAG (Retrieval Augmented Generation)**: RAG is currently the most widely deployed and most urgently needed area. DB-GPT already implements a RAG-based framework, which lets users build knowledge-based applications on DB-GPT's RAG capabilities.
+
+- **GBI (Generative Business Intelligence)**: Generative BI is one of the core capabilities of the DB-GPT project, providing the foundational data intelligence technology to build enterprise report analysis and business insights.
+
+- **Fine-tuning Framework**: Model fine-tuning is an indispensable capability for any enterprise working in vertical and niche domains. DB-GPT provides a complete fine-tuning framework that integrates seamlessly with the DB-GPT project. In recent fine-tuning efforts, an accuracy of 82.5% has been achieved on the Spider dataset.
+
+- **Data-Driven Multi-Agents Framework**: DB-GPT offers a data-driven, self-evolving multi-agents framework that aims to continuously make decisions and execute them based on data.
+
+- **Data Factory**: The Data Factory mainly handles the cleaning and processing of trustworthy knowledge and data in the era of large models.
+
+- **Data Sources**: Integrates various data sources to connect production business data seamlessly to the core capabilities of DB-GPT.
+
+### SubModule
+- [DB-GPT-Hub](https://github.com/eosphoros-ai/DB-GPT-Hub) A high-performance Text-to-SQL workflow that applies Supervised Fine-Tuning (SFT) to Large Language Models (LLMs).
+
+#### Text2SQL Finetune
+- Supported LLMs
+  - [x] LLaMA
+  - [x] LLaMA-2
+  - [x] BLOOM
+  - [x] BLOOMZ
+  - [x] Falcon
+  - [x] Baichuan
+  - [x] Baichuan2
+  - [x] InternLM
+  - [x] Qwen
+  - [x] XVERSE
+  - [x] ChatGLM2
+
+-  SFT Accuracy
+As of October 10, 2023, through the fine-tuning of an open-source model with 13 billion parameters using this project, we have achieved execution accuracy on the Spider dataset that surpasses even GPT-4!
+
+[More Information about Text2SQL finetune](https://github.com/eosphoros-ai/DB-GPT-Hub)
+
+- [DB-GPT-Plugins](https://github.com/eosphoros-ai/DB-GPT-Plugins) DB-GPT plugins, which can run Auto-GPT plugins directly
+- [GPT-Vis](https://github.com/eosphoros-ai/GPT-Vis) Visualization protocol

 ## Install 
 ![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white)
@@ -120,26 +149,7 @@ At present, we have introduced several key features to showcase our current capa
 - Support Datasources
  - [Datasources](http://docs.dbgpt.site/docs/modules/connections)

-## Introduction 
-The architecture of DB-GPT is shown in the following figure:

-<p align="center">
-  <img src="./assets/DB-GPT.png" width="800" />
-</p>
-
-The core capabilities primarily consist of the following components:
-1. Multi-Models: We support multiple Large Language Models (LLMs) such as LLaMA/LLaMA2, CodeLLaMA, ChatGLM, QWen, Vicuna, and proxy models like ChatGPT, Baichuan, Tongyi, Wenxin, and more.
-2. Knowledge-Based QA: Our system enables high-quality intelligent Q&A based on local documents such as PDFs, Word documents, Excel files, and other data sources.
-3. Embedding: We offer unified data vector storage and indexing. Data is embedded as vectors and stored in vector databases, allowing for content similarity search.
-4. Multi-Datasources: This feature connects different modules and data sources, facilitating data flow and interaction.
-5. Multi-Agents: Our platform provides Agent and plugin mechanisms, empowering users to customize and enhance the system's behaviour.
-6. Privacy & Security: Rest assured that there is no risk of data leakage, and your data is 100% private and secure.
-7. Text2SQL: We enhance Text-to-SQL performance through Supervised Fine-Tuning (SFT) applied to Large Language Models (LLMs).
-
-### SubModule
- [DB-GPT-Hub](https://github.com/eosphoros-ai/DB-GPT-Hub) Text-to-SQL workflow with high performance by applying Supervised Fine-Tuning (SFT) on Large Language Models (LLMs).
- [DB-GPT-Plugins](https://github.com/eosphoros-ai/DB-GPT-Plugins) DB-GPT Plugins that can run Auto-GPT plugin directly
- [DB-GPT-Web](https://github.com/eosphoros-ai/DB-GPT-Web)  ChatUI for DB-GPT  

 ## Image
 🌐 [AutoDL Image](https://www.codewithgpu.com/i/eosphoros-ai/DB-GPT/dbgpt)
@@ -151,106 +161,8 @@ The core capabilities primarily consist of the following components:
 ## Contribution

 - Please run `black .` before submitting the code.
- To check detailed guidelines for new contributions, please refer [how to contribute](https://github.com/csunny/DB-GPT/blob/main/CONTRIBUTING.md)
+- For detailed guidelines on new contributions, please refer to [how to contribute](https://github.com/eosphoros-ai/DB-GPT/blob/main/CONTRIBUTING.md)

-## RoadMap
-
-<p align="left">
-  <img src="./assets/roadmap.jpg" width="800px" />
-</p>
-
-### KBQA RAG optimization
- [x] Multi Documents
-  - [x] PDF
-  - [x] Excel, CSV
-  - [x] Word
-  - [x] Text
-  - [x] MarkDown
-  - [ ] Code
-  - [ ] Images 
-
- [x] RAG
- [ ] Graph Database
-  - [ ] Neo4j Graph
-  - [ ] Nebula Graph
- [x] Multi-Vector Database
-  - [x] Chroma
-  - [x] Milvus
-  - [x] Weaviate
-  - [x] PGVector
-  - [ ] Elasticsearch
-  - [ ] ClickHouse
-  - [ ] Faiss 
-  
- [ ] Testing and Evaluation Capability Building
-  - [ ] Knowledge QA datasets
-  - [ ] Question collection [easy, medium, hard]:
-  - [ ] Scoring mechanism
-  - [ ] Testing and evaluation using Excel + DB datasets
-  
-### Multi Datasource Support
-
- Multi Datasource Support 
-  - [x] MySQL
-  - [x] PostgreSQL
-  - [x] Spark
-  - [x] DuckDB
-  - [x] Sqlite
-  - [x] MSSQL
-  - [x] ClickHouse
-  - [ ] Oracle
-  - [ ] Redis
-  - [ ] MongoDB
-  - [ ] HBase
-  - [x] Doris
-  - [ ] DB2
-  - [ ] Couchbase
-  - [ ] Elasticsearch
-  - [ ] OceanBase
-  - [ ] TiDB
-  - [ ] StarRocks
-
-### Multi-Models And vLLM
- [x] [Cluster Deployment](https://docs.dbgpt.site/docs/installation/model_service/cluster)
- [x] [Fastchat Support](https://github.com/lm-sys/FastChat)
- [x] [vLLM Support](https://docs.dbgpt.site/docs/installation/advanced_usage/vLLM_inference)
- [ ] Cloud-native environment and support for Ray environment
- [ ] Service Registry(eg:nacos)
- [ ] Compatibility with OpenAI's interfaces
- [ ] Expansion and optimization of embedding models
-
-### Agents market and Plugins
- [x] multi-agents framework
- [x] custom plugin development 
- [x] plugin market
- [ ] Integration with CoT
- [ ] Enrich plugin sample library
- [ ] Support for AutoGPT protocol
- [ ] Integration of multi-agents and visualization capabilities, defining LLM+Vis new standards
-
-### Cost and Observability
- [x] [debugging](https://docs.dbgpt.site/docs/application_manual/advanced_tutorial/debugging)
- [ ] Observability
- [ ] cost & budgets
-
-### Text2SQL Finetune
- support llms
-  - [x] LLaMA
-  - [x] LLaMA-2
-  - [x] BLOOM
-  - [x] BLOOMZ
-  - [x] Falcon
-  - [x] Baichuan
-  - [x] Baichuan2
-  - [x] InternLM
-  - [x] Qwen
-  - [x] XVERSE
-  - [x] ChatGLM2
-
-  SFT Accuracy
-As of October 10, 2023, through the fine-tuning of an open-source model with 13 billion parameters using this project, we have achieved execution accuracy on the Spider dataset that surpasses even GPT-4!
-
-[More Information about Text2SQL finetune](https://github.com/eosphoros-ai/DB-GPT-Hub)

 ## Licence
 The MIT License (MIT)
@@ -272,8 +184,4 @@ If you find `DB-GPT` useful for your research or development, please cite the fo
 We are working on building a community, if you have any ideas for building the community, feel free to contact us.
 [![](https://dcbadge.vercel.app/api/server/7uQnPuveTY?compact=true&style=flat)](https://discord.gg/7uQnPuveTY)

-<p align="center">
-  <img src="./assets/wechat.jpg" width="300px" />
-</p>
-
 [![Star History Chart](https://api.star-history.com/svg?repos=csunny/DB-GPT&type=Date)](https://star-history.com/#csunny/DB-GPT)