mirror of
https://github.com/csunny/DB-GPT.git
synced 2025-08-15 23:13:15 +00:00
Merge branch 'dbgpt_doc' of https://github.com/csunny/DB-GPT into llm_fxp
commit
e057f4d3af
19
README.md
@@ -10,7 +10,7 @@
</a>
</p>

[**简体中文**](README.zh.md)|[**Discord**](https://discord.gg/ea6BnZkY)
[**简体中文**](README.zh.md)|[**Discord**](https://discord.gg/xfNDzZ9t)
</div>

## What is DB-GPT?
@@ -35,7 +35,7 @@ Currently, we have released multiple key features, which are listed below to dem
- SQL language capabilities
  - SQL generation
  - SQL diagnosis
- Private domain Q&A and data processing
- Private domain Q&A and data processing
  - Database knowledge Q&A
  - Data processing
- Plugins
@@ -46,7 +46,7 @@ Currently, we have released multiple key features, which are listed below to dem
  - Support for unstructured data such as PDF, Markdown, CSV, and WebURL

- Multi LLMs Support
  - Supports multiple large language models, currently supporting Vicuna (7b, 13b), ChatGLM-6b (int4, int8), guanaco(7b,13b,33b)
  - Supports multiple large language models, currently supporting Vicuna (7b, 13b), ChatGLM-6b (int4, int8), guanaco(7b,13b,33b), Gorilla(7b,13b)
  - TODO: codegen2, codet5p


@@ -62,7 +62,7 @@ Run on an RTX 4090 GPU.
</p>

<p align="center">
<img src="./assets/new_knownledge_en.gif" width="680px" />
<img src="./assets/knownledge_qa_en.jpg" width="680px" />
</p>

## Introduction
@@ -179,6 +179,15 @@ In the .env configuration file, modify the LANGUAGE parameter to switch between

1. Place personal knowledge files or folders in the pilot/datasets directory.

We currently support many document formats: txt, pdf, md, html, doc, ppt, and url.

Before execution, run:

```
python -m spacy download zh_core_web_sm

```

2. In the .env configuration file, set your vector store type, e.g. VECTOR_STORE_TYPE=Chroma. Chroma and Milvus (version > 2.1) are currently supported.

3. Run the knowledge repository script in the tools directory.
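The vector store setting in the hunk above can be checked before the knowledge script is run. The following is a minimal sketch, assuming a plain KEY=VALUE `.env` file. The helpers `parse_env` and `check_vector_store` are hypothetical and not part of DB-GPT; `MILVUS_URL`, `MILVUS_PORT` and the Chroma default are taken from the README.zh.md hunk of this commit, and the sample address and port are placeholders.

```python
# Hypothetical helpers, not part of DB-GPT: parse .env-style text and check
# the vector store setting described in step 2 of the README hunk.
SUPPORTED_VECTOR_STORES = {"Chroma", "Milvus"}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_vector_store(env: dict) -> str:
    """Return the configured store type, defaulting to Chroma."""
    store = env.get("VECTOR_STORE_TYPE", "Chroma")
    if store not in SUPPORTED_VECTOR_STORES:
        raise ValueError(f"unsupported VECTOR_STORE_TYPE: {store!r}")
    if store == "Milvus":
        # README.zh.md says Milvus also needs a URL and a port.
        missing = [k for k in ("MILVUS_URL", "MILVUS_PORT") if not env.get(k)]
        if missing:
            raise ValueError(f"Milvus needs: {', '.join(missing)}")
    return store


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = "VECTOR_STORE_TYPE=Milvus\nMILVUS_URL=127.0.0.1\nMILVUS_PORT=19530\n"
    print(check_vector_store(parse_env(sample)))
```

Failing early on an unknown store type is a design choice of this sketch; the project itself may handle a bad value differently.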
@@ -225,6 +234,6 @@ This project is standing on the shoulders of giants and is not going to work wit
The MIT License (MIT)

## Contact Information
We are working on building a community, if you have any ideas about building the community, feel free to contact us. [Discord](https://discord.gg/kMFf77FH)
We are working on building a community, if you have any ideas about building the community, feel free to contact us. [Discord](https://discord.gg/xfNDzZ9t)

[](https://star-history.com/#csunny/DB-GPT)
13
README.zh.md
@@ -18,6 +18,8 @@

DB-GPT 是一个开源的以数据库为基础的GPT实验项目,使用本地化的GPT大模型与您的数据和环境进行交互,无数据泄露风险,100% 私密,100% 安全。

[DB-GPT视频介绍](https://www.bilibili.com/video/BV1SM4y1a7Nj/?buvid=551b023900b290f9497610b2155a2668&is_story_h5=false&mid=%2BVyE%2Fwau5woPcUKieCWS0A%3D%3D&p=1&plat_id=116&share_from=ugc&share_medium=iphone&share_plat=ios&share_session_id=5D08B533-82A4-4D40-9615-7826065B4574&share_source=GENERIC&share_tag=s_i&timestamp=1686307943&unique_k=bhO3lgQ&up_id=31375446)

## 最新发布

- [2023/06/01]🔥 在Vicuna-13B基础模型的基础上,通过插件实现任务链调用。例如单句创建数据库的实现.[演示](./assets/dbgpt_bytebase_plugin.gif)
@@ -44,7 +46,7 @@ DB-GPT 是一个开源的以数据库为基础的GPT实验项目,使用本地
  - 非结构化数据支持包括PDF、MarkDown、CSV、WebURL

- 多模型支持
  - 支持多种大语言模型, 当前已支持Vicuna(7b,13b), ChatGLM-6b(int4, int8)
  - 支持多种大语言模型, 当前已支持Vicuna(7b,13b), ChatGLM-6b(int4, int8), guanaco(7b,13b,33b), Gorilla(7b,13b)
  - TODO: codet5p, codegen2

## 效果演示
@@ -174,6 +176,15 @@ $ python webserver.py

1.将个人知识文件或者文件夹放入pilot/datasets目录中

当前支持的文档格式: txt, pdf, md, html, doc, ppt, and url.

在操作之前先执行

```
python -m spacy download zh_core_web_sm

```

2.在.env文件指定你的向量数据库类型,VECTOR_STORE_TYPE(默认Chroma),目前支持Chroma,Milvus(需要设置MILVUS_URL和MILVUS_PORT)

注意Milvus版本需要>2.1
Binary file not shown.
Size before: 2.2 MiB. Size after: 5.0 MiB.
BIN
assets/knownledge_qa_en.jpg
Normal file
Binary file not shown.
Size after: 371 KiB.
Binary file not shown.
Size before: 2.5 MiB.
Binary file not shown.
Size before: 211 KiB. Size after: 255 KiB.
@@ -47,6 +47,12 @@ templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]


# multi language config
language = "en"  # one of: "en", "zh_CN"
locale_dirs = ["./locales/"]
gettext_compact = False
gettext_uuid = True

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

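The `language` value in the hunk above is hard-coded. The following sketch shows one way it could be chosen at build time instead, assuming an environment variable named `DOCS_LANGUAGE` and a helper `pick_language`, both hypothetical and not part of this commit. The other settings mirror the hunk, using `locale_dirs`, the option name Sphinx reads for catalog directories.

```python
import os

# The two languages listed in the conf.py comment of this commit.
SUPPORTED_LANGUAGES = ("en", "zh_CN")


def pick_language(environ=os.environ, default="en"):
    """Return the docs language from DOCS_LANGUAGE (hypothetical variable)."""
    lang = environ.get("DOCS_LANGUAGE", default)
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported docs language: {lang!r}")
    return lang


# Settings as they would appear in a Sphinx conf.py.
language = pick_language()
locale_dirs = ["./locales/"]  # where the .po/.mo catalogs live
gettext_compact = False       # one catalog per source document
gettext_uuid = True           # add a UUID to each extracted message
```

Sphinx also accepts a command-line override of `language`, so this environment variable is only one possible approach.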
@@ -32,6 +32,21 @@ conda activate dbgpt_env
pip install -r requirements.txt
```

Once the environment is installed, we have to create a new folder "models" in the DB-GPT project, and then we can put all the models downloaded from huggingface in this directory

```
git clone https://huggingface.co/Tribbiani/vicuna-13b
git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
```

The model files are large and will take a long time to download. During the download, let's configure the .env file, which needs to be copied and created from the .env.template

```
cp .env.template .env
```

You can configure basic parameters in the .env file, for example setting LLM_MODEL to the model to be used

### 3. Run
You can refer to this document to obtain the Vicuna weights: [Vicuna](https://github.com/lm-sys/FastChat/blob/main/README.md#model-weights) .

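The `.env` steps in the hunk above (copy `.env.template`, then set `LLM_MODEL`) can be sketched in Python as follows. The helpers `ensure_env_file` and `set_env_value` are hypothetical, the template contents are invented for the demo, and `vicuna-13b` is assumed to be a valid `LLM_MODEL` value only because it matches the repository cloned above.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_env_file(project_dir: Path) -> Path:
    """Copy .env.template to .env unless .env already exists."""
    template = project_dir / ".env.template"
    env_file = project_dir / ".env"
    if not env_file.exists():
        shutil.copyfile(template, env_file)
    return env_file


def set_env_value(env_file: Path, key: str, value: str) -> None:
    """Replace the KEY=... line in place, or append it if it is missing."""
    lines = env_file.read_text(encoding="utf-8").splitlines()
    prefix = f"{key}="
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith(prefix):
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    env_file.write_text("\n".join(lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Demo in a throwaway directory with an invented template.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / ".env.template").write_text(
            "LLM_MODEL=placeholder\nLANGUAGE=en\n", encoding="utf-8"
        )
        env_path = ensure_env_file(project)
        set_env_value(env_path, "LLM_MODEL", "vicuna-13b")
        print(env_path.read_text(encoding="utf-8"))
```

An existing `.env` is never overwritten in this sketch, so local edits survive a second run.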
@@ -3,4 +3,14 @@

This is a collection of DB-GPT tutorials on Medium.

Comming soon...
###Introduce
[What is DB-GPT](https://www.youtube.com/watch?v=QszhVJerc0I) by csunny (https://github.com/csunny/DB-GPT):

### Knowledge

[How to Create your own knowledge repository](https://db-gpt.readthedocs.io/en/latest/modules/knownledge.html)

[Add new Knowledge demonstration](../../assets/new_knownledge_en.gif)

### DB Plugins
[db plugins demonstration](../../assets/auto_sql_en.gif)
25
docs/locales/zh_CN/LC_MESSAGES/ecosystem.po
Normal file
@@ -0,0 +1,25 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../ecosystem.md:1 2a67e31428d84197939447c3decf9768
|
||||
msgid "Ecosystem"
|
||||
msgstr "生态系统"
|
||||
|
25
docs/locales/zh_CN/LC_MESSAGES/getting_started/concepts.po
Normal file
@@ -0,0 +1,25 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../getting_started/concepts.md:1 bbfc919428fd48f886677ada33b9c495
|
||||
msgid "Concepts"
|
||||
msgstr "概念"
|
||||
|
@@ -0,0 +1,179 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../getting_started/getting_started.md:1 cf1947dea9a843dd8b6fff68642f29b1
|
||||
msgid "Quickstart Guide"
|
||||
msgstr "使用指南"
|
||||
|
||||
#: ../../getting_started/getting_started.md:3 4184879bf5b34521a95e497f4747241a
|
||||
msgid ""
|
||||
"This tutorial gives you a quick walkthrough about use DB-GPT with you "
|
||||
"environment and data."
|
||||
msgstr "本教程为您提供了关于如何使用DB-GPT的使用指南。"
|
||||
|
||||
#: ../../getting_started/getting_started.md:5 7431b72cc1504b8bbcafb7512a6b6c92
|
||||
msgid "Installation"
|
||||
msgstr "安装"
|
||||
|
||||
#: ../../getting_started/getting_started.md:7 b8faf2ec4e034855a2674ffcade8cee2
|
||||
msgid "To get started, install DB-GPT with the following steps."
|
||||
msgstr "请按照以下步骤安装DB-GPT"
|
||||
|
||||
#: ../../getting_started/getting_started.md:9 ae0f536a064647cda04ea3d253991d80
|
||||
msgid "1. Hardware Requirements"
|
||||
msgstr "1. 硬件要求"
|
||||
|
||||
#: ../../getting_started/getting_started.md:10 8fa637100e644b478e0d6858f0a5b63d
|
||||
msgid ""
|
||||
"As our project has the ability to achieve ChatGPT performance of over "
|
||||
"85%, there are certain hardware requirements. However, overall, the "
|
||||
"project can be deployed and used on consumer-grade graphics cards. The "
|
||||
"specific hardware requirements for deployment are as follows:"
|
||||
msgstr "由于我们的项目有能力达到85%以上的ChatGPT性能,所以对硬件有一定的要求。"
|
||||
"但总体来说,我们在消费级的显卡上即可完成项目的部署使用,具体部署的硬件说明如下:"
|
||||
|
||||
#: ../../getting_started/getting_started.md c68539579083407882fb0d28943d40db
|
||||
msgid "GPU"
|
||||
msgstr "GPU"
|
||||
|
||||
#: ../../getting_started/getting_started.md 613fbe77d41a4a20a30c3c9a0b6ec20c
|
||||
msgid "VRAM Size"
|
||||
msgstr "显存大小"
|
||||
|
||||
#: ../../getting_started/getting_started.md c0b7f8249d3d4c629ba5deb8188a49b4
|
||||
msgid "Performance"
|
||||
msgstr "性能"
|
||||
|
||||
#: ../../getting_started/getting_started.md 5d103f7e4d1b4b6cb7358c0c717c9f73
|
||||
msgid "RTX 4090"
|
||||
msgstr "RTX 4090"
|
||||
|
||||
#: ../../getting_started/getting_started.md 48338f6b18dc41efb3613d47b1a762a7
|
||||
#: f14d278e083440b58fc7faeed30e2879
|
||||
msgid "24 GB"
|
||||
msgstr "24 GB"
|
||||
|
||||
#: ../../getting_started/getting_started.md dc238037ff3449cdb95cbd882d8de170
|
||||
msgid "Smooth conversation inference"
|
||||
msgstr "可以流畅的进行对话推理,无卡顿"
|
||||
|
||||
#: ../../getting_started/getting_started.md d7f84ac79bf84cb6a453d3bfd26eb935
|
||||
msgid "RTX 3090"
|
||||
msgstr "RTX 3090"
|
||||
|
||||
#: ../../getting_started/getting_started.md 511ee322b777476b87a3aa5624609944
|
||||
msgid "Smooth conversation inference, better than V100"
|
||||
msgstr "可以流畅进行对话推理,有卡顿感,但好于V100"
|
||||
|
||||
#: ../../getting_started/getting_started.md 974b704e8cf84f6483774153df8a8c6c
|
||||
msgid "V100"
|
||||
msgstr "V100"
|
||||
|
||||
#: ../../getting_started/getting_started.md 72008961ce004a0fa24b74db55fcf96e
|
||||
msgid "16 GB"
|
||||
msgstr "16 GB"
|
||||
|
||||
#: ../../getting_started/getting_started.md 2a3b936fe04c4b7789680c26be7f4869
|
||||
msgid "Conversation inference possible, noticeable stutter"
|
||||
msgstr "可以进行对话推理,有明显卡顿"
|
||||
|
||||
#: ../../getting_started/getting_started.md:18 fb1dbccb8f804384ade8e171aa40f99c
|
||||
msgid "2. Install"
|
||||
msgstr "2. 安装"
|
||||
|
||||
#: ../../getting_started/getting_started.md:20 695fdb8858c6488e9a0872d68fb387e5
|
||||
msgid ""
|
||||
"This project relies on a local MySQL database service, which you need to "
|
||||
"install locally. We recommend using Docker for installation."
|
||||
msgstr "本项目依赖一个本地的 MySQL 数据库服务,你需要本地安装,推荐直接使用 Docker 安装。"
|
||||
|
||||
#: ../../getting_started/getting_started.md:25 954f3a282ec54b11a55ebfe1f680d1df
|
||||
msgid ""
|
||||
"We use [Chroma embedding database](https://github.com/chroma-core/chroma)"
|
||||
" as the default for our vector database, so there is no need for special "
|
||||
"installation. If you choose to connect to other databases, you can follow"
|
||||
" our tutorial for installation and configuration. For the entire "
|
||||
"installation process of DB-GPT, we use the miniconda3 virtual "
|
||||
"environment. Create a virtual environment and install the Python "
|
||||
"dependencies."
|
||||
msgstr "向量数据库我们默认使用的是Chroma内存数据库,所以无需特殊安装,如果有"
|
||||
"需要连接其他的同学,可以按照我们的教程进行安装配置。整个DB-GPT的"
|
||||
"安装过程,我们使用的是miniconda3的虚拟环境。创建虚拟环境,并安装python依赖包"
|
||||
|
||||
|
||||
#: ../../getting_started/getting_started.md:35 0314bad0928940fc8e382d289d356c66
|
||||
msgid ""
|
||||
"Once the environment is installed, we have to create a new folder "
|
||||
"\"models\" in the DB-GPT project, and then we can put all the models "
|
||||
"downloaded from huggingface in this directory"
|
||||
msgstr "环境安装完成后,我们必须在DB-GPT项目中创建一个新文件夹\"models\","
|
||||
"然后我们可以把从huggingface下载的所有模型放到这个目录下。"
|
||||
|
||||
#: ../../getting_started/getting_started.md:42 afdf176f72224fd6b8b6e9e23c80c1ef
|
||||
msgid ""
|
||||
"The model files are large and will take a long time to download. During "
|
||||
"the download, let's configure the .env file, which needs to be copied and"
|
||||
" created from the .env.template"
|
||||
msgstr "模型文件很大,需要很长时间才能下载。在下载过程中,让我们配置.env文件,"
"它需要从.env.template中复制和创建。"
|
||||
|
||||
#: ../../getting_started/getting_started.md:48 76c87610993f41059c3c0aade5117171
|
||||
msgid ""
|
||||
"You can configure basic parameters in the .env file, for example setting "
|
||||
"LLM_MODEL to the model to be used"
|
||||
msgstr "您可以在.env文件中配置基本参数,例如将LLM_MODEL设置为要使用的模型。"
|
||||
|
||||
#: ../../getting_started/getting_started.md:35 443f5f92e4cd4ce4887bae2556b605b0
|
||||
msgid "3. Run"
|
||||
msgstr "3. 运行"
|
||||
|
||||
#: ../../getting_started/getting_started.md:36 3dab200eceda460b81a096d44de43d21
|
||||
msgid ""
|
||||
"You can refer to this document to obtain the Vicuna weights: "
|
||||
"[Vicuna](https://github.com/lm-sys/FastChat/blob/main/README.md#model-"
|
||||
"weights) ."
|
||||
msgstr "关于基础模型, 可以根据[Vicuna](https://github.com/lm-sys/FastChat/b"
|
||||
"lob/main/README.md#model-weights) 合成教程进行合成。"
|
||||
|
||||
|
||||
#: ../../getting_started/getting_started.md:38 b036ca6294f04bceb686187d2d8b6646
|
||||
msgid ""
|
||||
"If you have difficulty with this step, you can also directly use the "
|
||||
"model from [this link](https://huggingface.co/Tribbiani/vicuna-7b) as a "
|
||||
"replacement."
|
||||
msgstr "如果此步有困难的同学,也可以直接使用[此链接](https://huggingface.co/Tribbiani/vicuna-7b)上的模型进行替代。"
|
||||
|
||||
#: ../../getting_started/getting_started.md:40 35537c13ff6f4bd69951c486274ca1f9
|
||||
msgid "Run server"
|
||||
msgstr "运行模型服务"
|
||||
|
||||
#: ../../getting_started/getting_started.md:45 f7aa3668a6c94fb3a1b8346392d921f3
|
||||
msgid "Run gradio webui"
|
||||
msgstr "运行 gradio webui"
|
||||
|
||||
#: ../../getting_started/getting_started.md:51 d80c908f01144e2c8a15b7f6e8e7f88d
|
||||
msgid ""
|
||||
"Notice: the webserver need to connect llmserver, so you need change the"
|
||||
" .env file. change the MODEL_SERVER = \"http://127.0.0.1:8000\" to your "
|
||||
"address. It's very important."
|
||||
msgstr "注意: 在启动Webserver之前, 需要修改.env 文件中的MODEL_SERVER"
" = \"http://127.0.0.1:8000\", 将地址设置为你的服务器地址。"
|
||||
|
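Catalogs like the one above are edited by hand, and a copy-pasted `msgstr` under the wrong `msgid` is easy to miss. The following sketch flags distinct `msgid` values that share one `msgstr`. It is a simplified reader written for illustration, not a full gettext parser: it ignores comments, flags, plural forms and contexts, and a malformed quoted line makes `ast.literal_eval` raise. The names `parse_po` and `duplicate_translations` and the sample text are hypothetical.

```python
import ast
from collections import defaultdict


def parse_po(text):
    """Return (msgid, msgstr) pairs, joining multi-line quoted strings."""
    entries = []
    msgid, msgstr, current = None, None, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            if msgid is not None:
                entries.append((msgid, msgstr or ""))
            msgid, msgstr, current = ast.literal_eval(line[6:]), None, "id"
        elif line.startswith("msgstr "):
            msgstr, current = ast.literal_eval(line[7:]), "str"
        elif line.startswith('"') and current:
            # Continuation line: append to whichever field came last.
            piece = ast.literal_eval(line)
            if current == "id":
                msgid += piece
            else:
                msgstr += piece
        else:
            current = None  # blank line or comment ends the entry
    if msgid is not None:
        entries.append((msgid, msgstr or ""))
    return entries


def duplicate_translations(entries):
    """Map each msgstr used by more than one distinct msgid to those msgids."""
    by_msgstr = defaultdict(list)
    for msgid, msgstr in entries:
        if msgid and msgstr:
            by_msgstr[msgstr].append(msgid)
    return {s: ids for s, ids in by_msgstr.items() if len(set(ids)) > 1}


# Sample catalog text for illustration only.
SAMPLE = '''
msgid "VRAM Size"
msgstr "显存大小"

msgid "Performance"
msgstr "显存大小"

msgid "GPU"
msgstr "GPU"
'''

if __name__ == "__main__":
    for translation, ids in duplicate_translations(parse_po(SAMPLE)).items():
        print(f"{translation!r} is used for: {ids}")
```

A shared translation is not always a mistake, so the result is a list to review rather than a list of errors.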
59
docs/locales/zh_CN/LC_MESSAGES/getting_started/tutorials.po
Normal file
@@ -0,0 +1,59 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-13 11:38+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.12.1\n"
|
||||
|
||||
#: ../../getting_started/tutorials.md:1 7011a2ab0e7f45ddb1fa85b6479cc442
|
||||
msgid "Tutorials"
|
||||
msgstr "教程"
|
||||
|
||||
#: ../../getting_started/tutorials.md:4 960f88b9c1b64940bfa0576bab5b0314
|
||||
msgid "This is a collection of DB-GPT tutorials on Medium."
|
||||
msgstr "这是知乎上DB-GPT教程的集合。"
|
||||
|
||||
#: ../../getting_started/tutorials.md:6 3915395cc45742519bf0c607eeafc489
|
||||
msgid ""
|
||||
"###Introduce [What is DB-"
|
||||
"GPT](https://www.youtube.com/watch?v=QszhVJerc0I) by csunny "
|
||||
"(https://github.com/csunny/DB-GPT)"
|
||||
msgstr "###Introduce [什么是DB-GPT](https://www.bilibili.com/video/BV1SM4y1a7Nj/?buvid=551b023900b290f9497610b2155a2668&is_story_h5=false&mid=%2BVyE%2Fwau5woPcUKieCWS0A%3D%3D&p=1&plat_id=116&share_from=ugc&share_medium=iphone&share_plat=ios&share_session_id=5D08B533-82A4-4D40-9615-7826065B4574&share_source=GENERIC&share_tag=s_i&timestamp=1686307943&unique_k=bhO3lgQ&up_id=31375446) by csunny (https://github.com/csunny/DB-GPT)"
|
||||
|
||||
#: ../../getting_started/tutorials.md:9 e213736923574b2cb039a457d789c27c
|
||||
msgid "Knowledge"
|
||||
msgstr "知识库"
|
||||
|
||||
#: ../../getting_started/tutorials.md:11 90b5472735a644168d51c054ed882748
|
||||
msgid ""
|
||||
"[How to Create your own knowledge repository](https://db-"
|
||||
"gpt.readthedocs.io/en/latest/modules/knownledge.html)"
|
||||
msgstr "[怎么创建自己的知识库](https://db-"
|
||||
"gpt.readthedocs.io/en/latest/modules/knownledge.html)"
|
||||
|
||||
#: ../../getting_started/tutorials.md:13 6a851e1e88ea4bcbaf7ee742a12224ef
|
||||
msgid "[Add new Knowledge demonstration](../../assets/new_knownledge_en.gif)"
|
||||
msgstr "[新增知识库演示](../../assets/new_knownledge_en.gif)"
|
||||
|
||||
#: ../../getting_started/tutorials.md:15 4487ef393e004e7c936f5104727212a4
|
||||
msgid "DB Plugins"
|
||||
msgstr "DB Plugins"
|
||||
|
||||
#: ../../getting_started/tutorials.md:16 ee5decd8441d40ae8a240a19c1a5a74a
|
||||
msgid "[db plugins demonstration](../../assets/auto_sql_en.gif)"
|
||||
msgstr "[db plugins 演示](../../assets/auto_sql_en.gif)"
|
||||
|
272
docs/locales/zh_CN/LC_MESSAGES/index.po
Normal file
@@ -0,0 +1,272 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../index.rst:34 ../../index.rst:45 e3275f133efd471582d952301a6e243e
|
||||
msgid "Getting Started"
|
||||
msgstr "开始"
|
||||
|
||||
#: ../../index.rst:56 ../../index.rst:75 86e2ce002e604304a4032aa1555b36cb
|
||||
msgid "Modules"
|
||||
msgstr "模块"
|
||||
|
||||
#: ../../index.rst:88 ../../index.rst:104 b15c23cfcc084df9a8f8f9990e6903ac
|
||||
msgid "Use Cases"
|
||||
msgstr "示例"
|
||||
|
||||
#: ../../index.rst:118 ../../index.rst:121 70605b76fe5348299dd5d48d8ab6a77c
|
||||
msgid "Reference"
|
||||
msgstr "参考"
|
||||
|
||||
#: ../../index.rst:145 ../../index.rst:151 f62cf565fab64977b0efbd50e83540cc
|
||||
msgid "Resources"
|
||||
msgstr "资源"
|
||||
|
||||
#: ../../index.rst:7 c8b3a0ca759f432095161f7baccde1c4
|
||||
msgid "Welcome to DB-GPT!"
|
||||
msgstr "欢迎来到DB-GPT中文文档"
|
||||
|
||||
#: ../../index.rst:8 0167fea2c4df4181bc10d6e71527d005
|
||||
msgid ""
|
||||
"As large models are released and iterated upon, they are becoming "
|
||||
"increasingly intelligent. However, in the process of using large models, "
|
||||
"we face significant challenges in data security and privacy. We need to "
|
||||
"ensure that our sensitive data and environments remain completely "
|
||||
"controlled and avoid any data privacy leaks or security risks. Based on "
|
||||
"this, we have launched the DB-GPT project to build a complete private "
|
||||
"large model solution for all database-based scenarios. This solution "
|
||||
"supports local deployment, allowing it to be applied not only in "
|
||||
"independent private environments but also to be independently deployed "
|
||||
"and isolated according to business modules, ensuring that the ability of "
|
||||
"large models is absolutely private, secure, and controllable."
|
||||
msgstr "随着大型模型的发布和迭代,它们变得越来越智能。然而,在使用大型模型的过程中,"
"我们在数据安全和隐私方面面临着重大挑战。我们需要确保我们的敏感数据和环境得到完全控制,"
"避免任何数据隐私泄露或安全风险。基于此,我们启动了DB-GPT项目,为所有基于数据库的"
"场景构建一个完整的私有大模型解决方案。该方案支持本地部署,既可应用于“独立私"
"有环境”,又可根据业务模块进行“独立部署”和“隔离”,确保“大模型”的能力绝对"
"私有、安全、可控。"
|
||||
|
||||
#: ../../index.rst:10 36b847a04d624286a4942cd77821da8c
|
||||
msgid ""
|
||||
"**DB-GPT** is an experimental open-source project that uses localized GPT"
|
||||
" large models to interact with your data and environment. With this "
|
||||
"solution, you can be assured that there is no risk of data leakage, and "
|
||||
"your data is 100% private and secure."
|
||||
msgstr "DB-GPT 是一个开源的以数据库为基础的GPT实验项目,使用本地化的"
|
||||
"GPT大模型与您的数据和环境进行交互,无数据泄露风险"
|
||||
"100% 私密,100% 安全。"
|
||||
|
||||
#: ../../index.rst:12 d20166d203934385b811740f4d5eda33
|
||||
msgid "**Features**"
|
||||
msgstr "特性"
|
||||
|
||||
#: ../../index.rst:13 03f9de47513b4bc9a26f31e1d2d8ad60
|
||||
msgid ""
|
||||
"Currently, we have released multiple key features, which are listed below"
|
||||
" to demonstrate our current capabilities:"
|
||||
msgstr "目前我们已经发布了多种关键的特性,这里一一列举展示一下当前发布的能力。"
|
||||
|
||||
#: ../../index.rst:15 abc51c99bc6e49d5b0105c7d95e391da
|
||||
msgid "SQL language capabilities - SQL generation - SQL diagnosis"
|
||||
msgstr "SQL语言能力 - SQL生成 - SQL诊断"
|
||||
|
||||
#: ../../index.rst:19 e9ba27f21fd84ecf973640fa021b06b6
|
||||
msgid ""
|
||||
"Private domain Q&A and data processing - Database knowledge Q&A - Data "
|
||||
"processing"
|
||||
msgstr "私有领域问答与数据处理 - 数据库知识问答 - 数据处理"
|
||||
|
||||
#: ../../index.rst:23 a4584012b6634553abef5a4ee6ddf509
|
||||
msgid ""
|
||||
"Plugins - Support custom plugin execution tasks and natively support the "
|
||||
"Auto-GPT plugin, such as:"
|
||||
msgstr "插件模型 - 支持自定义插件执行任务,并原生支持Auto-GPT插件,例如:"
|
||||
"* SQL自动执行,获取查询结果 * 自动爬取学习知识"
|
||||
|
||||
#: ../../index.rst:26 b08674d7a7da4405b9388e296bc2cd57
|
||||
msgid ""
|
||||
"Unified vector storage/indexing of knowledge base - Support for "
|
||||
"unstructured data such as PDF, Markdown, CSV, and WebURL"
|
||||
msgstr "知识库统一向量存储/索引 - 非结构化数据支持包括PDF、MarkDown、CSV、WebURL"
|
||||
|
||||
#: ../../index.rst:29 cf4bc81d46b4418b81a78242cbc7f984
|
||||
msgid ""
|
||||
"Milti LLMs Support - Supports multiple large language models, currently "
|
||||
"supporting Vicuna (7b, 13b), ChatGLM-6b (int4, int8) - TODO: codegen2, "
|
||||
"codet5p"
|
||||
msgstr "多模型支持 - 支持多种大语言模型, 当前已支持Vicuna(7b,13b), ChatGLM-6b(int4, int8), "
"Guanaco, Gorilla, Falcon等系列模型"
|
||||
|
||||
#: ../../index.rst:35 681ae172eea64b718e0f6fc734d041b1
|
||||
msgid ""
|
||||
"How to get started using DB-GPT to interact with your data and "
|
||||
"environment."
|
||||
msgstr "开始使用DB-GPT与您的数据环境进行交互。"
|
||||
|
||||
#: ../../index.rst:36 87f507e0c27a4a38ba2a5c19e804549f
|
||||
msgid "`Quickstart Guid <./getting_started/getting_started.html>`_"
|
||||
msgstr "`使用指南 <./getting_started/getting_started.html>`_"
|
||||
|
||||
#: ../../index.rst:38 ab35a5cd96c548ecb0c285fd822f652a
|
||||
msgid "Concepts and terminology"
|
||||
msgstr "相关概念"
|
||||
|
||||
#: ../../index.rst:40 3fbd5c96df084ef889442a0b89ad6c05
|
||||
msgid "`Concepts and terminology <./getting_started/concepts.html>`_"
|
||||
msgstr "`相关概念 <./getting_started/concepts.html>`_"
|
||||
|
||||
#: ../../index.rst:42 6d9a0d727ce14edfbdcf678c6fbba76b
|
||||
msgid "Coming soon..."
|
||||
msgstr "未完待续。。。"
|
||||
|
||||
#: ../../index.rst:44 58cdc41dce264a3e83de565501298010
|
||||
msgid "`Tutorials <.getting_started/tutorials.html>`_"
|
||||
msgstr "`教程 <./getting_started/tutorials.html>`_"
|
||||
|
||||
#: ../../index.rst:58 20d67b324c23468e8f2cac6d9100b9f5
|
||||
msgid ""
|
||||
"These modules are the core abstractions with which we can interact with "
|
||||
"data and environment smoothly."
|
||||
msgstr "这些模块是我们可以与数据和环境顺利地进行交互的核心组成。"
|
||||
|
||||
|
||||
#: ../../index.rst:59 45a14052370f4860a72d8e831269d184
|
||||
msgid ""
|
||||
"It's very important for DB-GPT, DB-GPT also provide standard, extendable "
|
||||
"interfaces."
|
||||
msgstr "DB-GPT还提供了标准的、可扩展的接口。"
|
||||
|
||||
#: ../../index.rst:61 7c78c2ddc4104a8b9688472072c3225c
|
||||
msgid ""
|
||||
"The docs for each module contain quickstart examples, how to guides, "
|
||||
"reference docs, and conceptual guides."
|
||||
msgstr "每个模块的文档都包含快速入门的例子、操作指南、参考文档和相关概念等内容。"
|
||||
|
||||
#: ../../index.rst:63 4bcc203282434ca9b77d20c4115a646a
|
||||
msgid "The modules are as follows"
|
||||
msgstr "组成模块如下:"
|
||||
|
||||
#: ../../index.rst:65 c87f13e106b5443a824df5ca85331df4
|
||||
msgid ""
|
||||
"`LLMs <./modules/llms.html>`_: Supported multi models management and "
|
||||
"integrations."
|
||||
msgstr "`LLMs <./modules/llms.html>`_:基于FastChat提供大模型的运行环境。支持多模型管理和集成。 "
|
||||
|
||||
#: ../../index.rst:67 3447e10b61804b48a786ee12beaaedfd
|
||||
msgid ""
|
||||
"`Prompts <./modules/prompts.html>`_: Prompt management, optimization, and"
|
||||
" serialization for multi database."
|
||||
msgstr "`Prompt自动生成与优化 <./modules/prompts.html>`_: 自动化生成高质量的Prompt"
|
||||
" ,并进行优化,提高系统的响应效率"
|
||||
|
||||
#: ../../index.rst:69 a3182673127141888fdc13560e7dcfb3
|
||||
msgid "`Plugins <./modules/plugins.html>`_: Plugins management, scheduler."
|
||||
msgstr "`Agent与插件: <./modules/plugins.html>`_:提供Agent和插件机制,使得用户可以自定义并增强系统的行为。"
|
||||
|
||||
#: ../../index.rst:71 66abfffcb9c0466f9a3988ecfb19fc9e
|
||||
msgid ""
|
||||
"`Knownledge <./modules/knownledge.html>`_: Knownledge management, "
|
||||
"embedding, and search."
|
||||
msgstr "`知识库能力: <./modules/knownledge.html>`_: 支持私域知识库问答能力, "
|
||||
|
||||
#: ../../index.rst:73 1027a33646614790a4d88f29285ab0fd
|
||||
msgid ""
|
||||
"`Connections <./modules/connections.html>`_: Supported multi databases "
|
||||
"connection. management connections and interact with this."
|
||||
msgstr "`连接模块 <./modules/connections.html>`_: 用于连接不同的模块和数据源,实现数据的流转和交互 "
|
||||
|
||||
|
||||
#: ../../index.rst:90 53b58e6e531841878fbc8616841d5e9e
|
||||
msgid "Best Practices and built-in implementations for common DB-GPT use cases:"
|
||||
msgstr "DB-GPT用例的最佳实践和内置方法:"
|
||||
|
||||
#: ../../index.rst:92 a5c664233fe04417ba9bb0415fd686d7
|
||||
msgid ""
|
||||
"`Sql generation and diagnosis "
|
||||
"<./use_cases/sql_generation_and_diagnosis.html>`_: SQL generation and "
|
||||
"diagnosis."
|
||||
msgstr "`Sql生成和诊断 "
|
||||
"<./use_cases/sql_generation_and_diagnosis.html>`_: Sql生成和诊断。"
|
||||
|
||||
|
||||
#: ../../index.rst:94 04c63b56e77b45e5b4e7bd1db45ea10f
|
||||
msgid ""
|
||||
"`knownledge Based QA <./use_cases/knownledge_based_qa.html>`_: A "
|
||||
"important scene for user to chat with database documents, codes, bugs and"
|
||||
" schemas."
|
||||
msgstr "`知识库问答 <./use_cases/knownledge_based_qa.html>`_: "
"用户与数据库文档、代码和bug聊天的重要场景。"
|
||||
|
||||
#: ../../index.rst:96 415e2b9f640341a084f893781e2b3ec0
|
||||
msgid ""
|
||||
"`Chatbots <./use_cases/chatbots.html>`_: Language model love to chat, use"
|
||||
" multi models to chat."
|
||||
msgstr "`聊天机器人 <./use_cases/chatbots.html>`_: 使用多模型进行对话"
|
||||
|
||||
#: ../../index.rst:98 59a7ec39d2034fb794a9272d55607122
|
||||
msgid ""
|
||||
"`Querying Database Data <./use_cases/query_database_data.html>`_: Query "
|
||||
"and Analysis data from databases and give charts."
|
||||
msgstr "`查询数据库数据 <./use_cases/query_database_data.html>`_:"
|
||||
"从数据库中查询和分析数据并给出图表。"
|
||||
|
||||
#: ../../index.rst:100 3bd098eda9044bd39e4bba28a82f4195
|
||||
msgid ""
|
||||
"`Interacting with apis <./use_cases/interacting_with_api.html>`_: "
|
||||
"Interact with apis, such as create a table, deploy a database cluster, "
|
||||
"create a database and so on."
|
||||
msgstr "`API交互 <./use_cases/interacting_with_api.html>`_: "
|
||||
"与API交互,例如创建表、部署数据库集群、创建数据库等。"
|
||||
|
||||
|
||||
#: ../../index.rst:102 66daab899d7b4e528eda70779ab79676
|
||||
msgid ""
|
||||
"`Tool use with plugins <./use_cases/tool_use_with_plugin>`_: According to"
|
||||
" Plugin use tools to manage databases autonomoly."
|
||||
msgstr "`插件工具 <./use_cases/tool_use_with_plugin>`_:"
|
||||
" 根据插件使用工具自主管理数据库。"
|
||||
|
||||
#: ../../index.rst:119 e5a84e2dc87d4a06aa77ef4d77fb7bcb
|
||||
msgid ""
|
||||
"Full documentation on all methods, classes, installation methods, and "
|
||||
"integration setups for DB-GPT."
|
||||
msgstr "关于DB-GPT的所有方法、类、安装方法和集成设置的完整文档。"
|
||||
|
||||
#: ../../index.rst:130 7c51e39ad3824c5f8575390adbcba738
|
||||
msgid "Ecosystem"
|
||||
msgstr "生态系统"
|
||||
|
||||
#: ../../index.rst:132 b59e9ddba86945c1bebe395b2863174c
|
||||
msgid "Guides for how other companies/products can be used with DB-GPT"
|
||||
msgstr "其他公司/产品如何与DB-GPT一起使用的方法指南"
|
||||
|
||||
#: ../../index.rst:147 992bf68cc48a425696c02429d39f86e3
|
||||
msgid ""
|
||||
"Additional resources we think may be useful as you develop your "
|
||||
"application!"
|
||||
msgstr "我们认为在您开发应用程序时可能有用的其他资源!"
|
||||
|
||||
#: ../../index.rst:149 d99277006b05438c8d2e8088242f239c
|
||||
msgid ""
|
||||
"`Discord <https://discord.com/invite/twmZk3vv>`_: if your have some "
|
||||
"problem or ideas, you can talk from discord."
|
||||
msgstr "`Discord <https://discord.com/invite/twmZk3vv>`_:"
|
||||
"如果您有任何问题,可以到discord中进行交流。"
|
||||
|
34
docs/locales/zh_CN/LC_MESSAGES/modules/connections.po
Normal file
@@ -0,0 +1,34 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../modules/connections.md:1 21de23e95a6c4405a242fb9a0f4e5f2b
|
||||
msgid "Connections"
|
||||
msgstr "连接模块"
|
||||
|
||||
#: ../../modules/connections.md:3 0f09b3be20cd409f92c2ba819dbf45eb
|
||||
msgid ""
|
||||
"In order to interact more conveniently with users' private environments, "
|
||||
"the project has designed a connection module, which can support "
|
||||
"connection to databases, Excel, knowledge bases, and other environments "
|
||||
"to achieve information and data exchange."
|
||||
msgstr "为了更方便地与用户的私有环境进行交互,项目设计了一个连接模块,可以支持"
|
||||
"与数据库、Excel、知识库等环境的连接,实现信息和数据的交换。"
|
||||
|
38
docs/locales/zh_CN/LC_MESSAGES/modules/index.po
Normal file
@ -0,0 +1,38 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../modules/index.md:1 a7cda547b08244fdad5efc00b164432d
|
||||
msgid "Vector storage and indexing"
|
||||
msgstr "向量存储和索引"
|
||||
|
||||
#: ../../modules/index.md:3 fcbfbe3dda3d47d8a8ca2fefb1750b9a
|
||||
msgid ""
|
||||
"In order to facilitate the management of knowledge after vectorization, "
|
||||
"we have built-in multiple vector storage engines, from memory-based "
|
||||
"Chroma to distributed Milvus. Users can choose different storage engines "
|
||||
"according to their own scenario needs. The storage of knowledge vectors "
|
||||
"is the cornerstone of AI capability enhancement. As the intermediate "
|
||||
"language for interaction between humans and large language models, "
|
||||
"vectors play a very important role in this project."
|
||||
msgstr "为了便于知识向量化后的管理,我们内置了多个向量存储引擎,从基于内存的Chroma"
|
||||
"到分布式的Milvus。用户可以根据自己的场景需求选择不同的存储引擎。知识向量的存储是增"
|
||||
"强人工智能能力的基石。作为人类和大型语言模型之间交互的中间语言,向量在这个项目中扮演"
|
||||
"着非常重要的角色。"
|
89
docs/locales/zh_CN/LC_MESSAGES/modules/knownledge.po
Normal file
@ -0,0 +1,89 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../modules/knownledge.md:1 ac3aa55568c0414a821a42aeed509ab2
|
||||
msgid "Knownledge"
|
||||
msgstr "知识"
|
||||
|
||||
#: ../../modules/knownledge.md:3 1d57e3d2d790437ea54730477c67fdfb
|
||||
msgid ""
|
||||
"As the knowledge base is currently the most significant user demand "
|
||||
"scenario, we natively support the construction and processing of "
|
||||
"knowledge bases. At the same time, we also provide multiple knowledge "
|
||||
"base management strategies in this project, such as:"
|
||||
msgstr "由于知识库是当前用户需求最显著的场景,我们原生支持知识库的构建和处理。"
|
||||
"同时,我们还在本项目中提供了多种知识库管理策略,如:"
|
||||
|
||||
#: ../../modules/knownledge.md:4 784708fc19334742b73549d92a21ed32
|
||||
msgid "Default built-in knowledge base"
|
||||
msgstr "默认内置知识库"
|
||||
|
||||
#: ../../modules/knownledge.md:5 c65ccfabe79348c09e6fc13a10774ffd
|
||||
msgid "Custom addition of knowledge bases"
|
||||
msgstr "自定义新增知识库"
|
||||
|
||||
#: ../../modules/knownledge.md:6 fc8fded3e3634edfbe6001d9ea1add90
|
||||
msgid ""
|
||||
"Various usage scenarios such as constructing knowledge bases through "
|
||||
"plugin capabilities and web crawling. Users only need to organize the "
|
||||
"knowledge documents, and they can use our existing capabilities to build "
|
||||
"the knowledge base required for the large model."
|
||||
msgstr "各种使用场景,例如通过插件功能和爬虫构建知识库。用户只需要组织知识文档,"
|
||||
"并且他们可以使用我们现有的功能来构建大型模型所需的知识库。"
|
||||
|
||||
#: ../../modules/knownledge.md:9 2fa8ae0edeef4380ab60c43754d93c93
|
||||
msgid "Create your own knowledge repository"
|
||||
msgstr "创建你自己的知识库"
|
||||
|
||||
#: ../../modules/knownledge.md:11 13dc4cea806e42c4887c45bbd84fb063
|
||||
msgid ""
|
||||
"1.Place personal knowledge files or folders in the pilot/datasets "
|
||||
"directory."
|
||||
msgstr "1.将个人知识文件或文件夹放在pilot/datasets目录中。"
|
||||
|
||||
#: ../../modules/knownledge.md:13 8dbf51249c9d47749e3fedbf9886479b
|
||||
msgid ""
|
||||
"2.Update your .env, set your vector store type, VECTOR_STORE_TYPE=Chroma "
|
||||
"(now only support Chroma and Milvus, if you set Milvus, please set "
|
||||
"MILVUS_URL and MILVUS_PORT)"
|
||||
msgstr "2.更新你的.env,设置你的向量存储类型,VECTOR_STORE_TYPE=Chroma(现在只支持"
|
||||
"Chroma和Milvus,如果你设置了Milvus,请设置MILVUS_URL和MILVUS_PORT)"
|
||||
|
||||
#: ../../modules/knownledge.md:16 e03cce8ad3b14100b8bb22dd98ea49ae
|
||||
msgid "2.Run the knowledge repository script in the tools directory."
|
||||
msgstr "2.在tools目录执行知识入库脚本"
|
||||
|
||||
#: ../../modules/knownledge.md:26 a2919580cc324820b1217e31c8b22203
|
||||
msgid ""
|
||||
"3.Add the knowledge repository in the interface by entering the name of "
|
||||
"your knowledge repository (if not specified, enter \"default\") so you "
|
||||
"can use it for Q&A based on your knowledge base."
|
||||
msgstr "如果选择新增知识库,在界面上新增知识库输入你的知识库名"
|
||||
|
||||
#: ../../modules/knownledge.md:28 236317becbb042f2acbf66c499a3b984
|
||||
msgid ""
|
||||
"Note that the default vector model used is text2vec-large-chinese (which "
|
||||
"is a large model, so if your personal computer configuration is not "
|
||||
"enough, it is recommended to use text2vec-base-chinese). Therefore, "
|
||||
"ensure that you download the model and place it in the models directory."
|
||||
msgstr "注意,这里默认向量模型是text2vec-large-chinese(模型比较大,如果个人电脑"
|
||||
"配置不够建议采用text2vec-base-chinese),因此确保需要将模型download下来放到models目录中。"
|
||||
|
97
docs/locales/zh_CN/LC_MESSAGES/modules/llms.po
Normal file
@ -0,0 +1,97 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-13 11:38+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.12.1\n"
|
||||
|
||||
#: ../../modules/llms.md:1 34386f3fecba48fbbd86718283ba593c
|
||||
msgid "LLMs"
|
||||
msgstr "大语言模型"
|
||||
|
||||
#: ../../modules/llms.md:3 241b39ad980f4cfd90a7f0fdae05a1d2
|
||||
#, python-format
|
||||
msgid ""
|
||||
"In the underlying large model integration, we have designed an open "
|
||||
"interface that supports integration with various large models. At the "
|
||||
"same time, we have a very strict control and evaluation mechanism for the"
|
||||
" effectiveness of the integrated models. In terms of accuracy, the "
|
||||
"integrated models need to align with the capability of ChatGPT at a level"
|
||||
" of 85% or higher. We use higher standards to select models, hoping to "
|
||||
"save users the cumbersome testing and evaluation process in the process "
|
||||
"of use."
|
||||
msgstr "在底层大模型接入中,我们设计了开放的接口,支持对接多种大模型。同时对于接入模型的效果,我们有非常严格的把控与评审机制。对大模型能力上与ChatGPT对比,在准确率上需要满足85%以上的能力对齐。我们用更高的标准筛选模型,是期望在用户使用过程中,可以省去前面繁琐的测试评估环节。"
|
||||
|
||||
#: ../../modules/llms.md:5 25175e87a62e41bca86798eb783cefd6
|
||||
msgid "Multi LLMs Usage"
|
||||
msgstr "多模型使用"
|
||||
|
||||
#: ../../modules/llms.md:6 8c35341e9ca94202ba779567813f9973
|
||||
msgid ""
|
||||
"To use multiple models, modify the LLM_MODEL parameter in the .env "
|
||||
"configuration file to switch between the models."
|
||||
msgstr "如果要使用不同的模型,请修改.env配置文件中的LLM_MODEL参数以在模型之间切换。"
|
||||
|
||||
#: ../../modules/llms.md:8 2edf3309a6554f39ad74e19faff09cee
|
||||
msgid ""
|
||||
"Notice: you can create .env file from .env.template, just use command "
|
||||
"like this:"
|
||||
msgstr "注意:你可以从 .env.template 创建 .env 文件。只需使用如下命令:"
|
||||
|
||||
#: ../../modules/llms.md:14 5fa7639ef294425e89e13b7c6617fb4b
|
||||
msgid ""
|
||||
"now we support models vicuna-13b, vicuna-7b, chatglm-6b, flan-t5-base, "
|
||||
"guanaco-33b-merged, falcon-40b, gorilla-7b."
|
||||
msgstr "现在我们支持的模型有vicuna-13b, vicuna-7b, chatglm-6b, flan-t5-base, "
|
||||
"guanaco-33b-merged, falcon-40b, gorilla-7b."
|
||||
|
||||
#: ../../modules/llms.md:16 96c9a5ad00264bd2a07bdbdec87e471e
|
||||
msgid ""
|
||||
"DB-GPT provides a model load adapter and chat adapter. load adapter which"
|
||||
" allows you to easily adapt load different LLM models by inheriting the "
|
||||
"BaseLLMAdapter. You just implement match() and loader() method."
|
||||
msgstr "DB-GPT提供了多模型适配器load adapter和chat adapter.load adapter通过继承BaseLLMAdapter类, 实现match和loader方法允许你适配不同的LLM."
|
||||
|
||||
#: ../../modules/llms.md:18 1033714691464f50900c04c9e1bb5643
|
||||
msgid "vicuna llm load adapter"
|
||||
msgstr "vicuna llm load adapter"
|
||||
|
||||
#: ../../modules/llms.md:35 faa6432575be45bcae5deb1cc7fee3fb
|
||||
msgid "chatglm load adapter"
|
||||
msgstr "chatglm load adapter"
|
||||
|
||||
#: ../../modules/llms.md:62 61c4189cabf04e628132c2bf5f02bb50
|
||||
msgid ""
|
||||
"chat adapter which allows you to easily adapt chat different LLM models "
|
||||
"by inheriting the BaseChatAdpter.you just implement match() and "
|
||||
"get_generate_stream_func() method"
|
||||
msgstr "chat adapter通过继承BaseChatAdpter允许你通过实现match和get_generate_stream_func方法允许你适配不同的LLM."
|
||||
|
||||
#: ../../modules/llms.md:64 407a67e4e2c6414b9cde346961d850c0
|
||||
msgid "vicuna llm chat adapter"
|
||||
msgstr "vicuna llm chat adapter"
|
||||
|
||||
#: ../../modules/llms.md:76 53a55238cd90406db58c50dc64465195
|
||||
msgid "chatglm llm chat adapter"
|
||||
msgstr "chatglm llm chat adapter"
|
||||
|
||||
#: ../../modules/llms.md:89 b0c5ff72c05e40b3b301d6b81205fe63
|
||||
msgid ""
|
||||
"if you want to integrate your own model, just need to inheriting "
|
||||
"BaseLLMAdaper and BaseChatAdpter and implement the methods"
|
||||
msgstr "如果你想集成自己的模型,只需要继承BaseLLMAdaper和BaseChatAdpter类,然后实现里面的方法即可"
|
||||
|
37
docs/locales/zh_CN/LC_MESSAGES/modules/plugins.po
Normal file
@ -0,0 +1,37 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../modules/plugins.md:1 48f1b7ff4099485ba3853c373e64273f
|
||||
msgid "Plugins"
|
||||
msgstr "插件"
|
||||
|
||||
#: ../../modules/plugins.md:3 3d94b3250511468d80aa29359f01128d
|
||||
msgid ""
|
||||
"The ability of Agent and Plugin is the core of whether large models can "
|
||||
"be automated. In this project, we natively support the plugin mode, and "
|
||||
"large models can automatically achieve their goals. At the same time, in "
|
||||
"order to give full play to the advantages of the community, the plugins "
|
||||
"used in this project natively support the Auto-GPT plugin ecology, that "
|
||||
"is, Auto-GPT plugins can directly run in our project."
|
||||
msgstr "Agent与插件能力是大模型能否自动化的核心,在本的项目中,原生支持插件模式,"
|
||||
"大模型可以自动化完成目标。 同时为了充分发挥社区的优势,本项目中所用的插件原生支持"
|
||||
"Auto-GPT插件生态,即Auto-GPT的插件可以直接在我们的项目中运行。"
|
||||
|
37
docs/locales/zh_CN/LC_MESSAGES/modules/prompts.po
Normal file
@ -0,0 +1,37 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../modules/prompts.md:1 bb9583334e6948b98b59126234ae045f
|
||||
msgid "Prompts"
|
||||
msgstr ""
|
||||
|
||||
#: ../../modules/prompts.md:3 e6f5129e260c4a739a40115fff82850f
|
||||
msgid ""
|
||||
"Prompt is a very important part of the interaction between the large "
|
||||
"model and the user, and to a certain extent, it determines the quality "
|
||||
"and accuracy of the answer generated by the large model. In this project,"
|
||||
" we will automatically optimize the corresponding prompt according to "
|
||||
"user input and usage scenarios, making it easier and more efficient for "
|
||||
"users to use large language models."
|
||||
msgstr "Prompt是与大模型交互过程中非常重要的部分,一定程度上Prompt决定了"
|
||||
"大模型生成答案的质量与准确性,在本的项目中,我们会根据用户输入与"
|
||||
"使用场景,自动优化对应的Prompt,让用户使用大语言模型变得更简单、更高效。"
|
||||
|
32
docs/locales/zh_CN/LC_MESSAGES/modules/server.po
Normal file
@ -0,0 +1,32 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../modules/server.md:1 e882c271ebc441bca79808bc00f2bc24
|
||||
msgid "Server"
|
||||
msgstr ""
|
||||
|
||||
#: ../../modules/server.md:3 325cc3afd7d04e568c912bbf7f11788d
|
||||
msgid ""
|
||||
"TODO: In terms of terminal display, we will provide a multi-platform "
|
||||
"product interface, including PC, mobile phone, command line, Slack and "
|
||||
"other platforms."
|
||||
msgstr "TODO: 在终端展示上,我们将提供多端产品界面。包括PC、手机、命令行、Slack等多种模式。"
|
||||
|
25
docs/locales/zh_CN/LC_MESSAGES/reference.po
Normal file
@ -0,0 +1,25 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../reference.md:1 83c827fb051c40d8b16f704752c9581b
|
||||
msgid "Reference"
|
||||
msgstr "参考文献"
|
||||
|
25
docs/locales/zh_CN/LC_MESSAGES/use_cases/chatbots.po
Normal file
@ -0,0 +1,25 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../use_cases/chatbots.md:1 e599819098be40759193233cc476f26a
|
||||
msgid "Chatbot"
|
||||
msgstr "聊天机器人"
|
||||
|
@ -0,0 +1,25 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../use_cases/interacting_with_api.md:1 2dc3e9c958e24aca90af1b0520d416b4
|
||||
msgid "Interacting with api"
|
||||
msgstr "API交互"
|
||||
|
@ -0,0 +1,58 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-13 11:38+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.12.1\n"
|
||||
|
||||
#: ../../use_cases/knownledge_based_qa.md:1 ddfe412b92e14324bdc11ffe58114e5f
|
||||
msgid "Knownledge based qa"
|
||||
msgstr "知识问答"
|
||||
|
||||
#: ../../use_cases/knownledge_based_qa.md:3 48635316cc704a779089ff7b5cb9a836
|
||||
msgid ""
|
||||
"Chat with your own knowledge is a very interesting thing. In the usage "
|
||||
"scenarios of this chapter, we will introduce how to build your own "
|
||||
"knowledge base through the knowledge base API. Firstly, building a "
|
||||
"knowledge store can currently be initialized by executing \"python "
|
||||
"tool/knowledge_init.py\" to initialize the content of your own knowledge "
|
||||
"base, which was introduced in the previous knowledge base module. Of "
|
||||
"course, you can also call our provided knowledge embedding API to store "
|
||||
"knowledge."
|
||||
msgstr ""
|
||||
"用自己的知识聊天是一件很有趣的事情。在本章的使用场景中,我们将介绍如何通过知识库API构建自己的知识库。首先,构建知识存储目前可以通过执行“python"
|
||||
" "
|
||||
"tool/knowledge_init.py”来初始化您自己的知识库的内容,这在前面的知识库模块中已经介绍过了。当然,你也可以调用我们提供的知识嵌入API来存储知识。"
|
||||
|
||||
#: ../../use_cases/knownledge_based_qa.md:6 0a5c68429c9343cf8b88f4f1dddb18eb
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
"We currently support many document formats: txt, pdf, md, html, doc, ppt,"
|
||||
" and url."
|
||||
msgstr "我们目前支持多种文档格式: txt, pdf, md, html, doc, ppt 和 url。"
|
||||
|
||||
#: ../../use_cases/knownledge_based_qa.md:20 83f3544c06954e5cbc0cc7788f699eb1
|
||||
msgid ""
|
||||
"Now we currently support vector databases: Chroma (default) and Milvus. "
|
||||
"You can switch between them by modifying the \"VECTOR_STORE_TYPE\" field "
|
||||
"in the .env file."
|
||||
msgstr "我们目前支持向量数据库:Chroma(默认)和Milvus。你可以通过修改.env文件中的“VECTOR_STORE_TYPE”参数在它们之间切换。"
|
||||
|
||||
#: ../../use_cases/knownledge_based_qa.md:31 ac12f26b81384fc4bf44ccce1c0d86b4
|
||||
msgid "Below is an example of using the knowledge base API to query knowledge:"
|
||||
msgstr "下面是一个使用知识库API进行查询的例子:"
|
||||
|
@ -0,0 +1,25 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../use_cases/query_database_data.md:1 4a246f7052db497d990d3e65236b7c52
|
||||
msgid "Query database data"
|
||||
msgstr "查询数据库数据"
|
||||
|
@ -0,0 +1,26 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../use_cases/sql_generation_and_diagnosis.md:1
|
||||
#: 8900f8d9f3034b20a96df1d5c611eaa1
|
||||
msgid "SQL generation and diagnosis"
|
||||
msgstr "SQL生成和诊断"
|
||||
|
@ -0,0 +1,25 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2023, csunny
|
||||
# This file is distributed under the same license as the DB-GPT package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2023.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: DB-GPT 0.1.0\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2023-06-11 14:10+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language: zh_CN\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.11.0\n"
|
||||
|
||||
#: ../../use_cases/tool_use_with_plugin.md:1 2bd7d79a16a548c4a3872a12c436aa4f
|
||||
msgid "Tool use with plugin"
|
||||
msgstr "插件工具"
|
||||
|
@ -10,6 +10,15 @@ As the knowledge base is currently the most significant user demand scenario, we
|
||||
|
||||
1.Place personal knowledge files or folders in the pilot/datasets directory.
|
||||
|
||||
We currently support many document formats: txt, pdf, md, html, doc, ppt, and url.
|
||||
|
||||
Before execution, download the spaCy Chinese model:
|
||||
|
||||
```
|
||||
python -m spacy download zh_core_web_sm
|
||||
|
||||
```
|
||||
|
||||
2.Update your .env file and set the vector store type, e.g. VECTOR_STORE_TYPE=Chroma
|
||||
(only Chroma and Milvus are currently supported; if you choose Milvus, also set MILVUS_URL and MILVUS_PORT)
|
||||
|
||||
@ -19,7 +28,6 @@ As the knowledge base is currently the most significant user demand scenario, we
|
||||
python tools/knowledge_init.py
|
||||
|
||||
--vector_name : your vector store name default_value:default
|
||||
--append: append mode, True:append, False: not append default_value:False
|
||||
|
||||
```
|
||||
|
||||
|
@ -8,4 +8,82 @@ To use multiple models, modify the LLM_MODEL parameter in the .env configuration
|
||||
Notice: you can create .env file from .env.template, just use command like this:
|
||||
```
|
||||
cp .env.template .env
|
||||
```
|
||||
LLM_MODEL=vicuna-13b
|
||||
MODEL_SERVER=http://127.0.0.1:8000
|
||||
```
|
||||
Currently supported models: vicuna-13b, vicuna-7b, chatglm-6b, flan-t5-base, guanaco-33b-merged, falcon-40b, gorilla-7b.
|
||||
|
||||
DB-GPT provides a model load adapter and a chat adapter. The load adapter lets you load different LLM models by inheriting from BaseLLMAdaper; you only need to implement the match() and loader() methods.
|
||||
|
||||
vicuna llm load adapter
|
||||
|
||||
```
|
||||
class VicunaLLMAdapater(BaseLLMAdaper):
|
||||
"""Vicuna Adapter"""
|
||||
|
||||
def match(self, model_path: str):
|
||||
return "vicuna" in model_path
|
||||
|
||||
def loader(self, model_path: str, from_pretrained_kwagrs: dict):
|
||||
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
|
||||
model = AutoModelForCausalLM.from_pretrained(
|
||||
model_path, low_cpu_mem_usage=True, **from_pretrained_kwagrs
|
||||
)
|
||||
return model, tokenizer
|
||||
```
|
||||
|
||||
chatglm load adapter
|
||||
```
|
||||
|
||||
class ChatGLMAdapater(BaseLLMAdaper):
|
||||
"""LLM Adatpter for THUDM/chatglm-6b"""
|
||||
|
||||
def match(self, model_path: str):
|
||||
return "chatglm" in model_path
|
||||
|
||||
def loader(self, model_path: str, from_pretrained_kwargs: dict):
|
||||
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
|
||||
|
||||
if DEVICE != "cuda":
|
||||
model = AutoModel.from_pretrained(
|
||||
model_path, trust_remote_code=True, **from_pretrained_kwargs
|
||||
).float()
|
||||
return model, tokenizer
|
||||
else:
|
||||
model = (
|
||||
AutoModel.from_pretrained(
|
||||
model_path, trust_remote_code=True, **from_pretrained_kwargs
|
||||
)
|
||||
.half()
|
||||
.cuda()
|
||||
)
|
||||
return model, tokenizer
|
||||
```
|
||||
The chat adapter lets you adapt chat for different LLM models by inheriting from BaseChatAdpter; you only need to implement the match() and get_generate_stream_func() methods.
|
||||
|
||||
vicuna llm chat adapter
|
||||
```
|
||||
class VicunaChatAdapter(BaseChatAdpter):
|
||||
"""Model chat Adapter for vicuna"""
|
||||
|
||||
def match(self, model_path: str):
|
||||
return "vicuna" in model_path
|
||||
|
||||
def get_generate_stream_func(self):
|
||||
return generate_stream
|
||||
```
|
||||
|
||||
chatglm llm chat adapter
|
||||
```
|
||||
class ChatGLMChatAdapter(BaseChatAdpter):
|
||||
"""Model chat Adapter for ChatGLM"""
|
||||
|
||||
def match(self, model_path: str):
|
||||
return "chatglm" in model_path
|
||||
|
||||
def get_generate_stream_func(self):
|
||||
from pilot.model.llm_out.chatglm_llm import chatglm_generate_stream
|
||||
|
||||
return chatglm_generate_stream
|
||||
```
|
||||
If you want to integrate your own model, inherit from BaseLLMAdaper and BaseChatAdpter and implement their methods.
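
The following is a minimal sketch of what integrating a custom model might look like. It makes several assumptions: the two base classes are simplified stand-ins defined here so that the example runs on its own (they are not copied from DB-GPT). `EchoLLMAdapter`, `EchoChatAdapter`, `echo_generate_stream`, and `find_adapter` are hypothetical names. The first-match dispatch rule is a guess at how adapters could be selected, not confirmed behavior.

```python
from typing import Callable, Iterator, List


class BaseLLMAdaper:
    """Simplified stand-in for the load-adapter base class (assumption)."""

    def match(self, model_path: str) -> bool:
        return True

    def loader(self, model_path: str, from_pretrained_kwargs: dict):
        raise NotImplementedError


class BaseChatAdpter:
    """Simplified stand-in for the chat-adapter base class (assumption)."""

    def match(self, model_path: str) -> bool:
        return True

    def get_generate_stream_func(self) -> Callable[..., Iterator[str]]:
        raise NotImplementedError


def echo_generate_stream(prompt: str) -> Iterator[str]:
    """Hypothetical stream function: yields the prompt back word by word."""
    for word in prompt.split():
        yield word


class EchoLLMAdapter(BaseLLMAdaper):
    """Hypothetical load adapter for a toy 'echo' model."""

    def match(self, model_path: str) -> bool:
        return "echo" in model_path

    def loader(self, model_path: str, from_pretrained_kwargs: dict):
        # A real adapter would load a model and tokenizer here,
        # as the Vicuna and ChatGLM adapters above do.
        model = {"path": model_path, **from_pretrained_kwargs}
        tokenizer = str.split
        return model, tokenizer


class EchoChatAdapter(BaseChatAdpter):
    """Hypothetical chat adapter for the toy 'echo' model."""

    def match(self, model_path: str) -> bool:
        return "echo" in model_path

    def get_generate_stream_func(self) -> Callable[..., Iterator[str]]:
        return echo_generate_stream


def find_adapter(adapters: List, model_path: str):
    """Return the first adapter whose match() accepts model_path (assumed rule)."""
    for adapter in adapters:
        if adapter.match(model_path):
            return adapter
    raise ValueError(f"No adapter matches {model_path}")


if __name__ == "__main__":
    llm_adapter = find_adapter([EchoLLMAdapter()], "/models/echo-7b")
    model, tokenizer = llm_adapter.loader("/models/echo-7b", {})
    stream = EchoChatAdapter().get_generate_stream_func()
    print(list(stream("hello from the echo model")))
```

Keeping `match()` as a cheap string check on the model path means the same pair of adapters can be selected from the LLM_MODEL setting without loading any weights first.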
|
@ -3,7 +3,7 @@
|
||||
Chatting with your own knowledge base is one of the more interesting use cases. This chapter shows how to build your own knowledge base through the knowledge base API. At present, you can initialize a knowledge store by running "python tools/knowledge_init.py", as described in the knowledge base module. You can also call the knowledge embedding API directly to store knowledge.
|
||||
|
||||
|
||||
We currently support four document formats: txt, pdf, url, and md.
|
||||
We currently support many document formats: txt, pdf, md, html, doc, ppt, and url.
|
||||
```
|
||||
vector_store_config = {
|
||||
"vector_store_name": name
|
||||
@ -11,7 +11,7 @@ vector_store_config = {
|
||||
|
||||
file_path = "your file path"
|
||||
|
||||
knowledge_embedding_client = KnowledgeEmbedding(file_path=file_path, model_name=LLM_MODEL_CONFIG["text2vec"],local_persist=False, vector_store_config=vector_store_config)
|
||||
knowledge_embedding_client = KnowledgeEmbedding(file_path=file_path, model_name=LLM_MODEL_CONFIG["text2vec"], vector_store_config=vector_store_config)
|
||||
|
||||
knowledge_embedding_client.knowledge_embedding()
|
||||
|
||||
@ -37,7 +37,7 @@ vector_store_config = {
|
||||
|
||||
query = "your query"
|
||||
|
||||
knowledge_embedding_client = KnowledgeEmbedding(file_path="", model_name=LLM_MODEL_CONFIG["text2vec"], local_persist=False, vector_store_config=vector_store_config)
|
||||
knowledge_embedding_client = KnowledgeEmbedding(file_path="", model_name=LLM_MODEL_CONFIG["text2vec"], vector_store_config=vector_store_config)
|
||||
|
||||
knowledge_embedding_client.similar_search(query, 10)
|
||||
```
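
To make the query step more concrete, here is a conceptual sketch of a top-k similarity search over embedding vectors. It is not DB-GPT's implementation: `cosine_similarity` and `top_k_similar` are hypothetical helpers, the vectors are hand-written toy data, and it assumes that the second argument of `similar_search` above is the number of results to return.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query_vec: Sequence[float],
    docs: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every (text, vector) pair against the query and keep the best k."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy two-dimensional "embeddings"; real ones come from a text2vec model.
    docs = [
        ("how to create a table", [1.0, 0.0]),
        ("how to bake bread", [0.0, 1.0]),
        ("create an index on a table", [1.0, 1.0]),
    ]
    for text, score in top_k_similar([1.0, 0.0], docs, k=2):
        print(f"{score:.3f}  {text}")
```

A vector store such as Chroma or Milvus typically does the same kind of ranking with an index rather than scoring every stored vector, which is why the store type matters once the knowledge base grows.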
|
@ -443,6 +443,14 @@ class Database:
|
||||
indexes = cursor.fetchall()
|
||||
return [(index[2], index[4]) for index in indexes]
|
||||
|
||||
def get_show_create_table(self, table_name):
|
||||
"""Get table show create table about specified table."""
|
||||
session = self._db_sessions()
|
||||
cursor = session.execute(text(f"SHOW CREATE TABLE {table_name}"))
|
||||
ans = cursor.fetchall()
|
||||
return ans[0][1]
|
||||
|
||||
|
||||
def get_fields(self, table_name):
|
||||
"""Get column fields about specified table."""
|
||||
session = self._db_sessions()
|
||||
|
@@ -7,7 +7,7 @@ lang_dicts = {
         "learn_more_markdown": "该服务是仅供非商业用途的研究预览。受 Vicuna-13B 模型 [License](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) 的约束",
         "model_control_param": "模型参数",
         "sql_generate_mode_direct": "直接执行结果",
-        "sql_generate_mode_none": "不直接执行结果",
+        "sql_generate_mode_none": "db问答",
         "max_input_token_size": "最大输出Token数",
         "please_choose_database": "请选择数据",
         "sql_generate_diagnostics": "SQL生成与诊断",
@@ -44,7 +44,7 @@ lang_dicts = {
         "learn_more_markdown": "The service is a research preview intended for non-commercial use only. subject to the model [License](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) of Vicuna-13B",
         "model_control_param": "Model Parameters",
         "sql_generate_mode_direct": "Execute directly",
-        "sql_generate_mode_none": "Execute without mode",
+        "sql_generate_mode_none": "chat to db",
         "max_input_token_size": "Maximum output token size",
         "please_choose_database": "Please choose database",
         "sql_generate_diagnostics": "SQL Generation & Diagnostics",
@@ -52,7 +52,7 @@ class ChatWithDbQA(BaseChat):
             raise ValueError("Could not import DBSummaryClient. ")
         if self.db_name:
             client = DBSummaryClient()
-            table_info = client.get_similar_tables(
+            table_info = client.get_db_summary(
                 dbname=self.db_name, query=self.current_user_input, topk=self.top_k
             )
             # table_info = self.database.table_simple_info(self.db_connect)
@@ -60,8 +60,8 @@ class ChatWithDbQA(BaseChat):

         input_values = {
             "input": self.current_user_input,
-            "top_k": str(self.top_k),
-            "dialect": dialect,
+            # "top_k": str(self.top_k),
+            # "dialect": dialect,
             "table_info": table_info,
         }
         return input_values
@@ -10,22 +10,44 @@ CFG = Config()

 PROMPT_SCENE_DEFINE = """A chat between a curious user and an artificial intelligence assistant, who very familiar with database related knowledge. """

-PROMPT_SUFFIX = """Only use the following tables generate sql if have any table info:
+# PROMPT_SUFFIX = """Only use the following tables generate sql if have any table info:
+# {table_info}
+#
+# Question: {input}
+#
+# """
+
+# _DEFAULT_TEMPLATE = """
+# You are a SQL expert. Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
+# Unless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most {top_k} results.
+# You can order the results by a relevant column to return the most interesting examples in the database.
+# Never query for all the columns from a specific table, only ask for a the few relevant columns given the question.
+# Pay attention to use only the column names that you can see in the schema description. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
+#
+# """
+
+_DEFAULT_TEMPLATE_EN = """
+You are a database expert. you will be given metadata information about a database or table, and then provide a brief summary and answer to the question. For example, question: "How many tables are there in database 'db_gpt'?" , answer: "There are 5 tables in database 'db_gpt', which are 'book', 'book_category', 'borrower', 'borrowing', and 'category'.
+Based on the database metadata information below, provide users with professional and concise answers to their questions. If the answer cannot be obtained from the provided content, please say: "The information provided in the knowledge base is not sufficient to answer this question." It is forbidden to make up information randomly.
+database metadata information:
 {table_info}
-
-Question: {input}
-
+question:
+{input}
 """

-_DEFAULT_TEMPLATE = """
-You are a SQL expert. Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
-Unless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most {top_k} results.
-You can order the results by a relevant column to return the most interesting examples in the database.
-Never query for all the columns from a specific table, only ask for a the few relevant columns given the question.
-Pay attention to use only the column names that you can see in the schema description. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
-
+_DEFAULT_TEMPLATE_ZH = """
+你是一位数据库专家。你将获得有关数据库或表的元数据信息,然后提供简要的总结和回答。例如,问题:“数据库 'db_gpt' 中有多少个表?” 答案:“数据库 'db_gpt' 中有 5 个表,分别是 'book'、'book_category'、'borrower'、'borrowing' 和 'category'。”
+根据以下数据库元数据信息,为用户提供专业简洁的答案。如果无法从提供的内容中获取答案,请说:“知识库中提供的信息不足以回答此问题。” 禁止随意捏造信息。
+数据库元数据信息:
+{table_info}
+问题:
+{input}
 """

+_DEFAULT_TEMPLATE = (
+    _DEFAULT_TEMPLATE_EN if CFG.LANGUAGE == "en" else _DEFAULT_TEMPLATE_ZH
+)
+

 PROMPT_SEP = SeparatorStyle.SINGLE.value

@@ -33,10 +55,10 @@ PROMPT_NEED_NEED_STREAM_OUT = True

 prompt = PromptTemplate(
     template_scene=ChatScene.ChatWithDbQA.value,
-    input_variables=["input", "table_info", "dialect", "top_k"],
+    input_variables=["input", "table_info"],
     response_format=None,
     template_define=PROMPT_SCENE_DEFINE,
-    template=_DEFAULT_TEMPLATE + PROMPT_SUFFIX,
+    template=_DEFAULT_TEMPLATE,
     stream_out=PROMPT_NEED_NEED_STREAM_OUT,
     output_parser=NormalChatOutputParser(
         sep=PROMPT_SEP, is_stream_out=PROMPT_NEED_NEED_STREAM_OUT
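The prompt change above keeps only two input variables, `input` and `table_info`, and picks the template by `CFG.LANGUAGE`. A minimal sketch of that selection and fill-in, assuming plain `str.format` substitution; the shortened templates and the `select_template` and `render` helpers are illustrative stand-ins, not the project's `PromptTemplate` class:

```python
# Shortened stand-ins for _DEFAULT_TEMPLATE_EN / _DEFAULT_TEMPLATE_ZH.
TEMPLATE_EN = """database metadata information:
{table_info}
question:
{input}
"""
TEMPLATE_ZH = """数据库元数据信息:
{table_info}
问题:
{input}
"""


def select_template(language: str) -> str:
    """Mirror the diff's rule: English only when the language is exactly "en"."""
    return TEMPLATE_EN if language == "en" else TEMPLATE_ZH


def render(language: str, table_info, user_input: str) -> str:
    """Fill the two remaining input variables, `table_info` and `input`."""
    # Assumption: table_info may arrive as a list of strings; join it so the
    # prompt shows plain text rather than a Python list repr.
    if isinstance(table_info, (list, tuple)):
        table_info = "\n".join(table_info)
    return select_template(language).format(table_info=table_info, input=user_input)
```

Any language value other than `"en"` falls through to the Chinese template, which matches the conditional expression in the diff.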
@@ -38,7 +38,7 @@ class ChatUrlKnowledge(BaseChat):
         )
         self.url = url
         vector_store_config = {
-            "vector_store_name": url,
+            "vector_store_name": url.replace(":", ""),
             "vector_store_path": KNOWLEDGE_UPLOAD_ROOT_PATH,
         }
         self.knowledge_embedding_client = KnowledgeEmbedding(
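The changed line above derives the vector store name from the URL by removing colons only. The sketch below shows what that produces for a sample URL, next to a stricter variant; `to_store_name` is a hypothetical helper and an assumption, not something the commit contains:

```python
import re

url = "https://example.com/docs?page=1"

# What the changed line does: only colons are removed.
name_from_diff = url.replace(":", "")


def to_store_name(raw: str) -> str:
    """Hypothetical stricter variant: collapse every run of characters
    outside [A-Za-z0-9_-] into a single underscore."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", raw).strip("_")


print(name_from_diff)
print(to_store_name(url))
```

Slashes, dots, and query characters survive the colon-only replacement, so whether the resulting name is acceptable depends on how the vector store uses it.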
@@ -1,11 +1,13 @@
 from typing import Optional

+from chromadb.errors import NotEnoughElementsException
 from langchain.embeddings import HuggingFaceEmbeddings

 from pilot.configs.config import Config
 from pilot.source_embedding.csv_embedding import CSVEmbedding
 from pilot.source_embedding.markdown_embedding import MarkdownEmbedding
 from pilot.source_embedding.pdf_embedding import PDFEmbedding
+from pilot.source_embedding.ppt_embedding import PPTEmbedding
 from pilot.source_embedding.url_embedding import URLEmbedding
 from pilot.source_embedding.word_embedding import WordEmbedding
 from pilot.vector_store.connector import VectorStoreConnector
@@ -19,6 +21,8 @@ KnowledgeEmbeddingType = {
     ".doc": (WordEmbedding, {}),
     ".docx": (WordEmbedding, {}),
     ".csv": (CSVEmbedding, {}),
+    ".ppt": (PPTEmbedding, {}),
+    ".pptx": (PPTEmbedding, {}),
 }


@@ -42,8 +46,12 @@ class KnowledgeEmbedding:
         self.knowledge_embedding_client = self.init_knowledge_embedding()
         self.knowledge_embedding_client.source_embedding()

-    def knowledge_embedding_batch(self):
-        self.knowledge_embedding_client.batch_embedding()
+    def knowledge_embedding_batch(self, docs):
+        # docs = self.knowledge_embedding_client.read_batch()
+        self.knowledge_embedding_client.index_to_store(docs)
+
+    def read(self):
+        return self.knowledge_embedding_client.read_batch()

     def init_knowledge_embedding(self):
         if self.file_type == "url":
@@ -68,7 +76,11 @@ class KnowledgeEmbedding:
         vector_client = VectorStoreConnector(
             CFG.VECTOR_STORE_TYPE, self.vector_store_config
         )
-        return vector_client.similar_search(text, topk)
+        try:
+            ans = vector_client.similar_search(text, topk)
+        except NotEnoughElementsException:
+            ans = vector_client.similar_search(text, 1)
+        return ans

     def vector_exist(self):
         vector_client = VectorStoreConnector(
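The `similar_search` change above retries with `topk=1` when the store holds fewer elements than requested. A self-contained sketch of that fallback shape; `NotEnoughElements` and `TinyStore` are stand-ins written for illustration, not the chromadb exception or the project's vector store connector:

```python
class NotEnoughElements(Exception):
    """Stand-in for chromadb.errors.NotEnoughElementsException."""


class TinyStore:
    """Hypothetical in-memory store that refuses k larger than its size."""

    def __init__(self, docs):
        self.docs = list(docs)

    def similar_search(self, text, topk):
        if topk > len(self.docs):
            raise NotEnoughElements(f"asked for {topk}, have {len(self.docs)}")
        # No real ranking here; return the first topk documents.
        return self.docs[:topk]


def search_with_fallback(store, text, topk):
    """Same shape as the diff: retry with topk=1 when the store is too small."""
    try:
        return store.similar_search(text, topk)
    except NotEnoughElements:
        return store.similar_search(text, 1)
```

With this shape, a store holding two documents and a request for ten yields a single result rather than both, and an empty store would raise again on the retry.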
@@ -5,8 +5,8 @@ from typing import List

 import markdown
 from bs4 import BeautifulSoup
-from langchain.document_loaders import TextLoader
 from langchain.schema import Document
+from langchain.text_splitter import SpacyTextSplitter

 from pilot.configs.config import Config
 from pilot.source_embedding import SourceEmbedding, register
@@ -30,32 +30,8 @@ class MarkdownEmbedding(SourceEmbedding):
     def read(self):
         """Load from markdown path."""
         loader = EncodeTextLoader(self.file_path)
-        text_splitter = CHNDocumentSplitter(
-            pdf=True, sentence_size=CFG.KNOWLEDGE_CHUNK_SIZE
-        )
-        return loader.load_and_split(text_splitter)
-
-    @register
-    def read_batch(self):
-        """Load from markdown path."""
-        docments = []
-        for root, _, files in os.walk(self.file_path, topdown=False):
-            for file in files:
-                filename = os.path.join(root, file)
-                loader = TextLoader(filename)
-                # text_splitor = CHNDocumentSplitter(chunk_size=1000, chunk_overlap=20, length_function=len)
-                # docs = loader.load_and_split()
-                docs = loader.load()
-                # update the metadata
-                new_docs = []
-                for doc in docs:
-                    doc.metadata = {
-                        "source": doc.metadata["source"].replace(self.file_path, "")
-                    }
-                    print("doc is embedding ... ", doc.metadata)
-                    new_docs.append(doc)
-                docments += new_docs
-        return docments
+        textsplitter = SpacyTextSplitter(pipeline='zh_core_web_sm', chunk_size=CFG.KNOWLEDGE_CHUNK_SIZE, chunk_overlap=200)
+        return loader.load_and_split(textsplitter)

     @register
     def data_process(self, documents: List[Document]):
@@ -29,7 +29,7 @@ class PDFEmbedding(SourceEmbedding):
         # pdf=True, sentence_size=CFG.KNOWLEDGE_CHUNK_SIZE
         # )
         textsplitter = SpacyTextSplitter(
-            pipeline="zh_core_web_sm", chunk_size=1000, chunk_overlap=200
+            pipeline="zh_core_web_sm", chunk_size=CFG.KNOWLEDGE_CHUNK_SIZE, chunk_overlap=200
         )
         return loader.load_and_split(textsplitter)

pilot/source_embedding/ppt_embedding.py (new file, 37 lines)
@@ -0,0 +1,37 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+from typing import List
+
+from langchain.document_loaders import UnstructuredPowerPointLoader
+from langchain.schema import Document
+from langchain.text_splitter import SpacyTextSplitter
+
+from pilot.configs.config import Config
+from pilot.source_embedding import SourceEmbedding, register
+
+CFG = Config()
+
+
+class PPTEmbedding(SourceEmbedding):
+    """ppt embedding for read ppt document."""
+
+    def __init__(self, file_path, vector_store_config):
+        """Initialize with pdf path."""
+        super().__init__(file_path, vector_store_config)
+        self.file_path = file_path
+        self.vector_store_config = vector_store_config
+
+    @register
+    def read(self):
+        """Load from ppt path."""
+        loader = UnstructuredPowerPointLoader(self.file_path)
+        textsplitter = SpacyTextSplitter(pipeline='zh_core_web_sm', chunk_size=CFG.KNOWLEDGE_CHUNK_SIZE, chunk_overlap=200)
+        return loader.load_and_split(textsplitter)
+
+    @register
+    def data_process(self, documents: List[Document]):
+        i = 0
+        for d in documents:
+            documents[i].page_content = d.page_content.replace("\n", "")
+            i += 1
+        return documents
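The `data_process` step in the new file above removes every newline from each chunk. A small sketch of the same transformation without the manual index counter; `Doc` is a minimal stand-in for `langchain.schema.Document`, introduced here so the example runs without langchain:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Minimal stand-in for langchain.schema.Document."""
    page_content: str


def strip_newlines(documents: List[Doc]) -> List[Doc]:
    """Remove every newline from each chunk, in place, as data_process does."""
    for doc in documents:
        doc.page_content = doc.page_content.replace("\n", "")
    return documents
```

Because the newline is replaced with an empty string, text on adjacent lines is joined without a space (for example `"Slide 1\nTitle"` becomes `"Slide 1Title"`), which matters more for space-delimited languages than for Chinese text.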
@@ -2,6 +2,8 @@
 # -*- coding: utf-8 -*-
 from abc import ABC, abstractmethod
 from typing import Dict, List, Optional
+
+from chromadb.errors import NotEnoughElementsException
 from pilot.configs.config import Config
 from pilot.vector_store.connector import VectorStoreConnector

@@ -62,7 +64,11 @@ class SourceEmbedding(ABC):
     @register
     def similar_search(self, doc, topk):
         """vector store similarity_search"""
-        return self.vector_client.similar_search(doc, topk)
+        try:
+            ans = self.vector_client.similar_search(doc, topk)
+        except NotEnoughElementsException:
+            ans = self.vector_client.similar_search(doc, 1)
+        return ans

     def vector_name_exist(self):
         return self.vector_client.vector_name_exists()
@@ -79,14 +85,11 @@ class SourceEmbedding(ABC):
         if "index_to_store" in registered_methods:
             self.index_to_store(text)

-    def batch_embedding(self):
-        if "read_batch" in registered_methods:
-            text = self.read_batch()
+    def read_batch(self):
+        if "read" in registered_methods:
+            text = self.read()
         if "data_process" in registered_methods:
             text = self.data_process(text)
         if "text_split" in registered_methods:
             self.text_split(text)
-        if "text_to_vector" in registered_methods:
-            self.text_to_vector(text)
-        if "index_to_store" in registered_methods:
-            self.index_to_store(text)
+        return text
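After the change above, `read_batch` only reads and cleans documents and returns them, leaving indexing to a separate `index_to_store` call. A rough sketch of that split, under the assumption that `register` simply records method names in `registered_methods` (the project's real decorator is not shown in this diff); `TinySource` is a hypothetical class:

```python
registered_methods = []


def register(method):
    """Hypothetical stand-in: record the method name, return the method unchanged."""
    registered_methods.append(method.__name__)
    return method


class TinySource:
    def __init__(self, raw):
        self.raw = raw
        self.store = []

    @register
    def read(self):
        return self.raw.split("|")

    @register
    def data_process(self, chunks):
        return [c.strip() for c in chunks]

    def read_batch(self):
        """Read and clean only; nothing is written to the store here."""
        text = None
        if "read" in registered_methods:
            text = self.read()
        if "data_process" in registered_methods:
            text = self.data_process(text)
        return text

    def index_to_store(self, docs):
        self.store.extend(docs)
```

Separating the two steps lets a caller gather documents from several sources first and write them to the store in one call, which is how `init_db_profile` uses `read_batch` in this commit.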
@@ -32,13 +32,14 @@ class DBSummaryClient:
             model_name=LLM_MODEL_CONFIG[CFG.EMBEDDING_MODEL]
         )
         vector_store_config = {
-            "vector_store_name": dbname + "_profile",
+            "vector_store_name": dbname + "_summary",
             "embeddings": embeddings,
         }
         embedding = StringEmbedding(
             file_path=db_summary_client.get_summery(),
             vector_store_config=vector_store_config,
         )
+        self.init_db_profile(db_summary_client, dbname, embeddings)
         if not embedding.vector_name_exist():
             if CFG.SUMMARY_CONFIG == "FAST":
                 for vector_table_info in db_summary_client.get_summery():
@@ -69,10 +70,22 @@ class DBSummaryClient:

         logger.info("db summary embedding success")

+    def get_db_summary(self, dbname, query, topk):
+        vector_store_config = {
+            "vector_store_name": dbname + "_profile",
+        }
+        knowledge_embedding_client = KnowledgeEmbedding(
+            model_name=LLM_MODEL_CONFIG[CFG.EMBEDDING_MODEL],
+            vector_store_config=vector_store_config,
+        )
+        table_docs =knowledge_embedding_client.similar_search(query, topk)
+        ans = [d.page_content for d in table_docs]
+        return ans
+
     def get_similar_tables(self, dbname, query, topk):
         """get user query related tables info"""
         vector_store_config = {
-            "vector_store_name": dbname + "_profile",
+            "vector_store_name": dbname + "_summary",
         }
         knowledge_embedding_client = KnowledgeEmbedding(
             model_name=LLM_MODEL_CONFIG[CFG.EMBEDDING_MODEL],
@@ -112,6 +125,29 @@ class DBSummaryClient:
         for dbname in dbs:
             self.db_summary_embedding(dbname)

+    def init_db_profile(self, db_summary_client, dbname, embeddings):
+        profile_store_config = {
+            "vector_store_name": dbname + "_profile",
+            "embeddings": embeddings,
+        }
+        embedding = StringEmbedding(
+            file_path=db_summary_client.get_db_summery(),
+            vector_store_config=profile_store_config,
+        )
+        if not embedding.vector_name_exist():
+            docs = []
+            docs.extend(embedding.read_batch())
+            for table_summary in db_summary_client.table_info_json():
+                embedding = StringEmbedding(
+                    table_summary,
+                    profile_store_config,
+                )
+                docs.extend(embedding.read_batch())
+            embedding.index_to_store(docs)
+        logger.info("init db profile success...")
+
+
+

 def _get_llm_response(query, db_input, dbsummary):
     chat_param = {
@@ -5,6 +5,43 @@ from pilot.summary.db_summary import DBSummary, TableSummary, FieldSummary, Inde

 CFG = Config()

+# {
+#   "database_name": "mydatabase",
+#   "tables": [
+#     {
+#       "table_name": "customers",
+#       "columns": [
+#         {"name": "id", "type": "int(11)", "is_primary_key": true},
+#         {"name": "name", "type": "varchar(255)", "is_primary_key": false},
+#         {"name": "email", "type": "varchar(255)", "is_primary_key": false}
+#       ],
+#       "indexes": [
+#         {"name": "PRIMARY", "type": "primary", "columns": ["id"]},
+#         {"name": "idx_name", "type": "index", "columns": ["name"]},
+#         {"name": "idx_email", "type": "index", "columns": ["email"]}
+#       ],
+#       "size_in_bytes": 1024,
+#       "rows": 1000
+#     },
+#     {
+#       "table_name": "orders",
+#       "columns": [
+#         {"name": "id", "type": "int(11)", "is_primary_key": true},
+#         {"name": "customer_id", "type": "int(11)", "is_primary_key": false},
+#         {"name": "order_date", "type": "date", "is_primary_key": false},
+#         {"name": "total_amount", "type": "decimal(10,2)", "is_primary_key": false}
+#       ],
+#       "indexes": [
+#         {"name": "PRIMARY", "type": "primary", "columns": ["id"]},
+#         {"name": "fk_customer_id", "type": "foreign_key", "columns": ["customer_id"], "referenced_table": "customers", "referenced_columns": ["id"]}
+#       ],
+#       "size_in_bytes": 2048,
+#       "rows": 500
+#     }
+#   ],
+#   "qps": 100,
+#   "tps": 50
+# }

 class MysqlSummary(DBSummary):
     """Get mysql summary template."""
@@ -13,7 +50,7 @@ class MysqlSummary(DBSummary):
         self.name = name
         self.type = "MYSQL"
         self.summery = (
-            """database name:{name}, database type:{type}, table infos:{table_info}"""
+            """{{"database_name": "{name}", "type": "{type}", "tables": "{tables}", "qps": "{qps}", "tps": {tps}}}"""
         )
         self.tables = {}
         self.tables_info = []
@@ -31,12 +68,14 @@ class MysqlSummary(DBSummary):
         )
         tables = self.db.get_table_names()
         self.table_comments = self.db.get_table_comments(name)
+        comment_map = {}
         for table_comment in self.table_comments:
             self.tables_info.append(
                 "table name:{table_name},table description:{table_comment}".format(
                     table_name=table_comment[0], table_comment=table_comment[1]
                 )
             )
+            comment_map[table_comment[0]] = table_comment[1]

             vector_table = json.dumps(
                 {"table_name": table_comment[0], "table_description": table_comment[1]}
@@ -45,11 +84,18 @@ class MysqlSummary(DBSummary):
                 vector_table.encode("utf-8").decode("unicode_escape")
             )
         self.table_columns_info = []
+        self.table_columns_json = []
+
         for table_name in tables:
-            table_summary = MysqlTableSummary(self.db, name, table_name)
+            table_summary = MysqlTableSummary(self.db, name, table_name, comment_map)
             # self.tables[table_name] = table_summary.get_summery()
             self.tables[table_name] = table_summary.get_columns()
             self.table_columns_info.append(table_summary.get_columns())
+            # self.table_columns_json.append(table_summary.get_summary_json())
+            table_profile = "table name:{table_name},table description:{table_comment}".format(
+                table_name=table_name, table_comment=self.db.get_show_create_table(table_name)
+            )
+            self.table_columns_json.append(table_profile)
             # self.tables_info.append(table_summary.get_summery())

     def get_summery(self):
@@ -60,23 +106,29 @@ class MysqlSummary(DBSummary):
             name=self.name, type=self.type, table_info=";".join(self.tables_info)
         )

+    def get_db_summery(self):
+        return self.summery.format(
+            name=self.name, type=self.type, tables=";".join(self.vector_tables_info), qps=1000, tps=1000
+        )
+
     def get_table_summary(self):
         return self.tables

     def get_table_comments(self):
         return self.table_comments

-    def get_columns(self):
-        return self.table_columns_info
+    def table_info_json(self):
+        return self.table_columns_json


 class MysqlTableSummary(TableSummary):
     """Get mysql table summary template."""

-    def __init__(self, instance, dbname, name):
+    def __init__(self, instance, dbname, name, comment_map):
         self.name = name
         self.dbname = dbname
         self.summery = """database name:{dbname}, table name:{name}, have columns info: {fields}, have indexes info: {indexes}"""
+        self.json_summery_template = """{{"table_name": "{name}", "comment": "{comment}", "columns": "{fields}", "indexes": "{indexes}", "size_in_bytes": {size_in_bytes}, "rows": {rows}}}"""
         self.fields = []
         self.fields_info = []
         self.indexes = []
@@ -100,6 +152,10 @@ class MysqlTableSummary(TableSummary):
             self.indexes.append(index_summary)
             self.indexes_info.append(index_summary.get_summery())

+        self.json_summery = self.json_summery_template.format(
+            name=name, comment=comment_map[name], fields=self.fields_info, indexes=self.indexes_info, size_in_bytes=1000, rows=1000
+        )
+
     def get_summery(self):
         return self.summery.format(
             name=self.name,
@@ -111,20 +167,24 @@ class MysqlTableSummary(TableSummary):
     def get_columns(self):
         return self.column_summery

+    def get_summary_json(self):
+        return self.json_summery
+

 class MysqlFieldsSummary(FieldSummary):
     """Get mysql field summary template."""

     def __init__(self, field):
         self.name = field[0]
-        self.summery = """column name:{name}, column data type:{data_type}, is nullable:{is_nullable}, default value is:{default_value}, comment is:{comment} """
+        # self.summery = """column name:{name}, column data type:{data_type}, is nullable:{is_nullable}, default value is:{default_value}, comment is:{comment} """
+        # self.summery = """{"name": {name}, "type": {data_type}, "is_primary_key": {is_nullable}, "comment":{comment}, "default":{default_value}}"""
         self.data_type = field[1]
         self.default_value = field[2]
         self.is_nullable = field[3]
         self.comment = field[4]

     def get_summery(self):
-        return self.summery.format(
+        return '{{"name": "{name}", "type": "{data_type}", "is_primary_key": "{is_nullable}", "comment": "{comment}", "default": "{default_value}"}}'.format(
             name=self.name,
             data_type=self.data_type,
             is_nullable=self.is_nullable,
@@ -138,11 +198,12 @@ class MysqlIndexSummary(IndexSummary):

     def __init__(self, index):
         self.name = index[0]
-        self.summery = """index name:{name}, index bind columns:{bind_fields}"""
+        # self.summery = """index name:{name}, index bind columns:{bind_fields}"""
+        self.summery_template = '{{"name": "{name}", "columns": {bind_fields}}}'
         self.bind_fields = index[1]

     def get_summery(self):
-        return self.summery.format(name=self.name, bind_fields=self.bind_fields)
+        return self.summery_template.format(name=self.name, bind_fields=self.bind_fields)


 if __name__ == "__main__":
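The summary classes above build JSON-looking strings with `str.format` and doubled braces. The sketch below compares the index template from this diff with `json.dumps`, assuming `bind_fields` may be a bare column name such as `id`; `summary_from_dumps` and `is_valid_json` are illustrative helpers, not part of the commit:

```python
import json

# Template copied from the MysqlIndexSummary hunk.
TEMPLATE = '{{"name": "{name}", "columns": {bind_fields}}}'


def summary_from_template(name, bind_fields):
    """Format the template exactly as get_summery does."""
    return TEMPLATE.format(name=name, bind_fields=bind_fields)


def summary_from_dumps(name, bind_fields):
    """Alternative (an assumption, not in the commit): let json.dumps do the quoting."""
    return json.dumps({"name": name, "columns": bind_fields})


def is_valid_json(text):
    """Report whether json.loads accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Under that assumption the template output leaves the column name unquoted, so a JSON parser rejects it, while `json.dumps` also escapes quotes inside names and comments. If the strings are only ever read by the language model, strict validity may matter less.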
@@ -29,7 +29,7 @@ tokenizers==0.13.2
 tqdm==4.64.1
 transformers==4.28.0
 timm==0.6.13
-spacy==3.5.1
+spacy==3.5.3
 webdataset==0.2.48
 yarl==1.8.2
 zipp==3.14.0
run.sh (4 lines changed)
@@ -15,11 +15,11 @@ function find_python_command() {

 PYTHONCMD=$(find_python_command)

-nohup PYTHONCMD pilot/server/llmserver.py >> /root/server.log 2>&1 &
+nohup $PYTHONCMD pilot/server/llmserver.py >> /root/server.log 2>&1 &
 while [ `grep -c "Uvicorn running on" /root/server.log` -eq '0' ];do
     sleep 1s;
     echo "wait server running"
 done
 echo "server running"

-PYTHONCMD pilot/server/webserver.py
+$PYTHONCMD pilot/server/webserver.py
@@ -23,7 +23,7 @@ class LocalKnowledgeInit:
         self.vector_store_config = vector_store_config
         self.model_name = LLM_MODEL_CONFIG["text2vec"]

-    def knowledge_persist(self, file_path, append_mode):
+    def knowledge_persist(self, file_path):
         """knowledge persist"""
         for root, _, files in os.walk(file_path, topdown=False):
             for file in files:
@@ -41,7 +41,6 @@ class LocalKnowledgeInit:
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
     parser.add_argument("--vector_name", type=str, default="default")
-    parser.add_argument("--append", type=bool, default=False)
     args = parser.parse_args()
     vector_name = args.vector_name
     append_mode = args.append
@@ -49,5 +48,5 @@ if __name__ == "__main__":
     vector_store_config = {"vector_store_name": vector_name}
     print(vector_store_config)
     kv = LocalKnowledgeInit(vector_store_config=vector_store_config)
-    kv.knowledge_persist(file_path=DATASETS_DIR, append_mode=append_mode)
+    kv.knowledge_persist(file_path=DATASETS_DIR)
     print("your knowledge embedding success...")