update:doc

aries-ckt 2023-06-13 10:56:58 +08:00
parent f29cbaab79
commit 7124c34017
4 changed files with 93 additions and 6 deletions

View File

@@ -3,4 +3,14 @@
This is a collection of DB-GPT tutorials on Medium.
Coming soon...
### Introduction
[What is DB-GPT](https://www.youtube.com/watch?v=QszhVJerc0I) by csunny (https://github.com/csunny/DB-GPT)
### Knowledge
[How to Create your own knowledge repository](https://db-gpt.readthedocs.io/en/latest/modules/knownledge.html)
[Add new Knowledge demonstration](../../assets/new_knownledge_en.gif)
### DB Plugins
[DB plugins demonstration](../../assets/auto_sql_en.gif)

View File

@@ -19,7 +19,6 @@ As the knowledge base is currently the most significant user demand scenario, we
python tools/knowledge_init.py
--vector_name: your vector store name (default: default)
--append: append mode; True: append, False: do not append (default: False)
```
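For example, a typical invocation might look like this (the store name my_kb is just a placeholder):
```
python tools/knowledge_init.py --vector_name my_kb --append True
```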

View File

@@ -8,4 +8,82 @@ To use multiple models, modify the LLM_MODEL parameter in the .env configuration
Note: you can create a .env file from .env.template with a command like this:
```
cp .env.template .env
```

Then set the LLM_MODEL and MODEL_SERVER parameters in .env:
```
LLM_MODEL=vicuna-13b
MODEL_SERVER=http://127.0.0.1:8000
```
Currently supported models: vicuna-13b, vicuna-7b, chatglm-6b, flan-t5-base, guanaco-33b-merged, falcon-40b, and gorilla-7b.
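For example, to switch to chatglm-6b, set LLM_MODEL accordingly (assuming the model server still runs at the default address):
```
LLM_MODEL=chatglm-6b
MODEL_SERVER=http://127.0.0.1:8000
```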
DB-GPT provides a model load adapter and a chat adapter. The load adapter lets you load different LLM models by inheriting BaseLLMAdaper: you only need to implement the match() and loader() methods.
Vicuna LLM load adapter:
```
from transformers import AutoModelForCausalLM, AutoTokenizer


class VicunaLLMAdapater(BaseLLMAdaper):
    """Vicuna Adapter"""

    def match(self, model_path: str):
        return "vicuna" in model_path

    def loader(self, model_path: str, from_pretrained_kwagrs: dict):
        tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
        model = AutoModelForCausalLM.from_pretrained(
            model_path, low_cpu_mem_usage=True, **from_pretrained_kwagrs
        )
        return model, tokenizer
```
ChatGLM load adapter:
```
from transformers import AutoModel, AutoTokenizer

# DEVICE comes from the project's model config (e.g. "cuda" or "cpu").


class ChatGLMAdapater(BaseLLMAdaper):
    """LLM Adapter for THUDM/chatglm-6b"""

    def match(self, model_path: str):
        return "chatglm" in model_path

    def loader(self, model_path: str, from_pretrained_kwargs: dict):
        tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
        if DEVICE != "cuda":
            # Non-CUDA fallback: keep the model in full float32 precision.
            model = AutoModel.from_pretrained(
                model_path, trust_remote_code=True, **from_pretrained_kwargs
            ).float()
            return model, tokenizer
        else:
            # On CUDA, load in half precision and move to the GPU.
            model = (
                AutoModel.from_pretrained(
                    model_path, trust_remote_code=True, **from_pretrained_kwargs
                )
                .half()
                .cuda()
            )
            return model, tokenizer
```
The chat adapter lets you chat with different LLM models by inheriting BaseChatAdpter: you only need to implement the match() and get_generate_stream_func() methods.
Vicuna LLM chat adapter:
```
class VicunaChatAdapter(BaseChatAdpter):
    """Model chat Adapter for vicuna"""

    def match(self, model_path: str):
        return "vicuna" in model_path

    def get_generate_stream_func(self):
        # generate_stream is the project's default streaming generation function.
        return generate_stream
```
ChatGLM LLM chat adapter:
```
class ChatGLMChatAdapter(BaseChatAdpter):
    """Model chat Adapter for ChatGLM"""

    def match(self, model_path: str):
        return "chatglm" in model_path

    def get_generate_stream_func(self):
        from pilot.model.llm_out.chatglm_llm import chatglm_generate_stream

        return chatglm_generate_stream
```
If you want to integrate your own model, you just need to inherit BaseLLMAdaper and BaseChatAdpter and implement these methods, as in the sketch below.
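Here is a minimal sketch of a custom adapter pair for a hypothetical my-llm model; the import paths for BaseLLMAdaper, BaseChatAdpter, and generate_stream are assumptions based on the project layout:
```
from transformers import AutoModelForCausalLM, AutoTokenizer

from pilot.model.adapter import BaseLLMAdaper  # assumed import path
from pilot.server.chat_adapter import BaseChatAdpter  # assumed import path


class MyLLMAdapter(BaseLLMAdaper):
    """Load adapter for a hypothetical my-llm model."""

    def match(self, model_path: str):
        return "my-llm" in model_path

    def loader(self, model_path: str, from_pretrained_kwargs: dict):
        tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
        model = AutoModelForCausalLM.from_pretrained(
            model_path, low_cpu_mem_usage=True, **from_pretrained_kwargs
        )
        return model, tokenizer


class MyLLMChatAdapter(BaseChatAdpter):
    """Chat adapter for the same hypothetical model."""

    def match(self, model_path: str):
        return "my-llm" in model_path

    def get_generate_stream_func(self):
        # Reusing the default streaming function is an assumption;
        # supply your own if the model needs special handling.
        from pilot.model.inference import generate_stream  # assumed import path

        return generate_stream
```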

View File

@@ -3,7 +3,7 @@
Chatting with your own knowledge is a very interesting thing. In the usage scenarios of this chapter, we will introduce how to build your own knowledge base through the knowledge base API. Currently, a knowledge store can be initialized by executing "python tools/knowledge_init.py" to load the content of your own knowledge base, as introduced in the previous knowledge base module. Of course, you can also call the provided knowledge embedding API to store knowledge.
-We currently support four document formats: txt, pdf, url, and md.
+We currently support many document formats: txt, pdf, md, html, doc, ppt, and url.
```
vector_store_config = {
"vector_store_name": name
@@ -11,7 +11,7 @@ vector_store_config = {
file_path = "your file path"
-knowledge_embedding_client = KnowledgeEmbedding(file_path=file_path, model_name=LLM_MODEL_CONFIG["text2vec"], local_persist=False, vector_store_config=vector_store_config)
+knowledge_embedding_client = KnowledgeEmbedding(file_path=file_path, model_name=LLM_MODEL_CONFIG["text2vec"], vector_store_config=vector_store_config)
knowledge_embedding_client.knowledge_embedding()
@@ -37,7 +37,7 @@ vector_store_config = {
query = "your query"
knowledge_embedding_client = KnowledgeEmbedding(file_path="", model_name=LLM_MODEL_CONFIG["text2vec"], local_persist=False, vector_store_config=vector_store_config)
knowledge_embedding_client = KnowledgeEmbedding(file_path="", model_name=LLM_MODEL_CONFIG["text2vec"], vector_store_config=vector_store_config)
knowledge_embedding_client.similar_search(query, 10)
```
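Putting the fragments above together, here is a minimal end-to-end sketch; the import paths and the my_kb store name are assumptions for illustration:
```
from pilot.configs.model_config import LLM_MODEL_CONFIG  # assumed import path
from pilot.source_embedding.knowledge_embedding import KnowledgeEmbedding  # assumed import path

vector_store_config = {"vector_store_name": "my_kb"}  # hypothetical store name

# Embed a local document into the vector store.
embedding_client = KnowledgeEmbedding(
    file_path="/path/to/your/document.md",
    model_name=LLM_MODEL_CONFIG["text2vec"],
    vector_store_config=vector_store_config,
)
embedding_client.knowledge_embedding()

# Query the same store for the 10 most similar chunks.
search_client = KnowledgeEmbedding(
    file_path="",
    model_name=LLM_MODEL_CONFIG["text2vec"],
    vector_store_config=vector_store_config,
)
docs = search_client.similar_search("your query", 10)
```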