mirror of https://github.com/csunny/DB-GPT.git synced 2025-07-24 20:47:46 +00:00

Interact with your data and environment using a local GPT: no data leaks, 100% private, 100% secure


DB-GPT: Revolutionizing Database Interactions with Private LLM Technology

简体中文 | Discord | Documents | WeChat

What is DB-GPT?

As large models are released and iterated upon, they are becoming increasingly capable. Using them, however, raises significant data security and privacy challenges: sensitive data and environments must remain fully under our control, with no privacy leaks or security risks. For this reason, we launched the DB-GPT project to build a complete private large model solution for all database-based scenarios. The solution supports local deployment, so it can run in an isolated private environment or be deployed and isolated separately for each business module, keeping the capabilities of large models private, secure, and controllable.

DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. Because the models run locally, your data stays within your own environment and is not sent to external services.

News

[2023/07/06]🔥🔥🔥 Brand-new DB-GPT product with a brand-new web UI. documents
[2023/06/25]🔥 Support for the chatglm2-6b model. documents
[2023/06/14] Support for the gpt4all model, which can run on M1/M2 Macs or CPU-only machines. documents
[2023/06/01]🔥 Task chain calls implemented through plugins on top of the Vicuna-13B base model, for example creating a database with a single sentence. demo
[2023/06/01]🔥 QLoRA guanaco (7b, 13b, 33b) support.
[2023/05/28] Learning from data crawled from the Internet. demo
[2023/05/21] Generate SQL and execute it automatically. demo
[2023/05/15] Chat with documents. demo
[2023/05/06] SQL generation and diagnosis. demo

Demo

Runs on an RTX 4090 GPU.

Features

Currently, we have released multiple key features, which are listed below to demonstrate our current capabilities:

- SQL language capabilities
  - SQL generation
  - SQL diagnosis
- Private domain Q&A and data processing
  - Knowledge management (currently supported document formats: txt, pdf, md, html, doc, ppt, and url)
  - Database knowledge Q&A
  - Knowledge embedding
- Plugins
  - Support for custom plugins that execute tasks, with native support for Auto-GPT plugins, such as:
    - Automatic execution of SQL and retrieval of query results
    - Automatic crawling and learning of knowledge
- Unified vector storage/indexing of knowledge base
  - Support for unstructured data such as PDF, TXT, Markdown, CSV, DOC, PPT, and WebURL
- Multi LLMs Support
  - Supports multiple large language models, currently Vicuna (7b, 13b), ChatGLM-6b (int4, int8), guanaco (7b, 13b, 33b), and Gorilla (7b, 13b)
  - TODO: codegen2, codet5p
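
To make the SQL generation and automatic execution features more concrete, here is a minimal sketch of the general idea. It is not DB-GPT's actual code: the helper names (`describe_schema`, `build_sql_prompt`, `run_generated_sql`) are hypothetical, SQLite is used only because it ships with Python, and the model reply is hard-coded where a locally hosted model would normally be called.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements of all tables in the database."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_sql_prompt(schema: str, question: str) -> str:
    """Combine the schema and a natural-language question into one prompt."""
    return (
        "You are a SQL assistant. Given this SQLite schema:\n"
        f"{schema}\n"
        f"Write one SELECT statement that answers: {question}\n"
    )


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Execute model-generated SQL, refusing anything that is not a SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are executed")
    return conn.execute(sql).fetchall()


# Small in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

prompt = build_sql_prompt(describe_schema(conn), "How many users are there?")

# Assumption: a local model would receive `prompt`; its reply is hard-coded here.
generated_sql = "SELECT COUNT(*) FROM users"
result = run_generated_sql(conn, generated_sql)
print(result)
```

Restricting execution to `SELECT` statements is one simple guard for a sketch like this; a real deployment would likely need stricter validation and database-level permissions.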

Introduction

DB-GPT builds a large model operating environment on top of FastChat and provides a large language model powered by Vicuna. It also provides private domain knowledge base question-answering, and its plugin design natively supports Auto-GPT plugins. Our vision is to make it easier and more convenient to build applications around databases and LLMs.

The architecture of DB-GPT is shown in the following figure:

The core capabilities mainly consist of the following parts:

- Knowledge base capability: Supports private domain knowledge base question-answering capability.
- Large-scale model management capability: Provides a large model operating environment based on FastChat.
- Unified data vector storage and indexing: Provides a uniform way to store and index various data types.
- Connection module: Used to connect different modules and data sources to achieve data flow and interaction.
- Agent and plugins: Provides Agent and plugin mechanisms, allowing users to customize and enhance the system's behavior.
- Prompt generation and optimization: Automatically generates high-quality prompts and optimizes them to improve system response efficiency.
- Multi-platform product interface: Supports various client products, such as web, mobile applications, and desktop applications.
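
As a rough illustration of how vector storage, retrieval, and prompt generation fit together in knowledge base question-answering, the following sketch uses a toy in-memory store. Everything in it is an assumption made for illustration: `TinyVectorStore` and `embed` are hypothetical names, and the word-count "embedding" stands in for a real embedding model and for a vector database such as Chroma or Milvus.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs in a list."""

    def __init__(self) -> None:
        self._items = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(
            self._items, key=lambda item: cosine(q, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


store = TinyVectorStore()
store.add("The orders table stores one row per customer order.")
store.add("Vicuna is a chat model fine-tuned from LLaMA.")

question = "Which table holds customer orders?"
context = store.search(question, k=1)

# The retrieved text is placed in the prompt that a local model would receive.
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```

The point of the sketch is the flow, not the scoring: documents are embedded once when stored, the question is embedded at query time, and only the closest matches are passed to the model as context.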

Install

Quickstart

Usage Instructions

If nltk-related errors occur while using the knowledge base, you need to install the nltk toolkit. For more details, please refer to the nltk documents. Run the Python interpreter and type the following commands:

>>> import nltk
>>> nltk.download()
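
Calling `nltk.download()` without arguments opens an interactive downloader, which can be awkward on a headless server. A non-interactive variant might look like the sketch below. The helper name `ensure_nltk_resource` is hypothetical, and the `punkt` resource in the usage comment is only an assumed example: the error message you see should name the resource that is actually missing.

```python
def ensure_nltk_resource(resource_path: str, package: str) -> bool:
    """Download an nltk data package only if it is not already installed.

    Returns True if the resource is available after the call.
    """
    import nltk  # imported lazily so the helper can be defined without nltk

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # quiet=True suppresses the progress output of the downloader.
        return bool(nltk.download(package, quiet=True))


# Example call (assumption: the missing resource is the "punkt" tokenizer):
# ensure_nltk_resource("tokenizers/punkt", "punkt")
```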

Acknowledgement

This project stands on the shoulders of giants and would not work without the open-source communities. Special thanks to the following projects for their excellent contributions to the AI industry:

FastChat for providing chat services
vicuna-13b as the base model
langchain tool chain
Auto-GPT universal plugin template
Hugging Face for big model management
Chroma for vector storage
Milvus for distributed vector storage
ChatGLM as the base model
llama_index for enhancing database-related knowledge using in-context learning based on existing knowledge bases.

Contribution

Please run `black .` before submitting code.

Licence

The MIT License (MIT)

Contact Information

We are working on building a community. If you have any ideas about how to build it, feel free to contact us on Discord.