doc(ChatKnowledge): Add documents for knowledge command line

FangYin Cheng
2023-09-28 12:50:37 +08:00
parent 5dfe611478
commit 5b9a0fa7c0
3 changed files with 361 additions and 65 deletions

@@ -91,4 +91,207 @@ Prompt Argument
#### WEAVIATE
* WEAVIATE_URL=https://kt-region-m8hcy0wc.weaviate.network
```
## KBQA command line
### Load your local documents to DB-GPT
```bash
dbgpt knowledge load --space_name my_kbqa_space --local_doc_path ./pilot/datasets --vector_store_type Chroma
```
- `--space_name`: Your knowledge space name, default: `default`
- `--local_doc_path`: Your document directory or document file path, default: `./pilot/datasets`
- `--vector_store_type`: Vector store type, default: `Chroma`
**View the help for `dbgpt knowledge load`**
```bash
dbgpt knowledge load --help
```
Here you can see the parameters:
```
Usage: dbgpt knowledge load [OPTIONS]

  Load your local knowledge to DB-GPT

Options:
  --space_name TEXT         Your knowledge space name  [default: default]
  --vector_store_type TEXT  Vector store type.  [default: Chroma]
  --local_doc_path TEXT     Your document directory or document file path.
                            [default: ./pilot/datasets]
  --skip_wrong_doc          Skip wrong document.
  --overwrite               Overwrite existing document(they has same name).
  --max_workers INTEGER     The maximum number of threads that can be used to
                            upload document.
  --pre_separator TEXT      Preseparator, this separator is used for pre-
                            splitting before the document is actually split by
                            the text splitter. Preseparator are not included
                            in the vectorized text.
  --separator TEXT          This is the document separator. Currently, only
                            one separator is supported.
  --chunk_size INTEGER      Maximum size of chunks to split.
  --chunk_overlap INTEGER   Overlap in characters between chunks.
  --help                    Show this message and exit.
```
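The splitting options listed in the help text can be combined in a single call. The following is a minimal sketch rather than a recommended configuration: the helper name `load_docs` and the values `500` and `50` are illustrative assumptions, and only the flags themselves come from the help output above.

```bash
# Hypothetical helper: load one path into one knowledge space with explicit
# splitting options. Every flag is taken from `dbgpt knowledge load --help`;
# the numeric values are placeholders, not tuned recommendations.
load_docs() {
  local space_name="$1"   # target knowledge space
  local doc_path="$2"     # document file or directory to load
  dbgpt knowledge load \
    --space_name "$space_name" \
    --local_doc_path "$doc_path" \
    --vector_store_type Chroma \
    --chunk_size 500 \
    --chunk_overlap 50 \
    --skip_wrong_doc
}
```

With the helper defined, `load_docs my_kbqa_space ./pilot/datasets` would load the default dataset directory into `my_kbqa_space`.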
### List knowledge spaces, documents, and chunks
#### List all knowledge spaces
```bash
dbgpt knowledge list
```
Output should look something like the following:
```
+------------------------------------------------------------------+
|                       All knowledge spaces                       |
+----------+-------------+-------------+-------------+-------------+
| Space ID |  Space Name | Vector Type |    Owner    | Description |
+----------+-------------+-------------+-------------+-------------+
|    6     |      n1     |    Chroma   |    DB-GPT   |  DB-GPT cli |
|    5     |  default_2  |    Chroma   |    DB-GPT   |  DB-GPT cli |
|    4     |  default_1  |    Chroma   |    DB-GPT   |  DB-GPT cli |
|    3     |   default   |    Chroma   |    DB-GPT   |  DB-GPT cli |
+----------+-------------+-------------+-------------+-------------+
```
#### List documents in knowledge space
```bash
dbgpt knowledge list --space_name default
```
Output should look something like the following:
```
+------------------------------------------------------------------------+
|                       Space default description                        |
+------------+-----------------+--------------+--------------+-----------+
| Space Name | Total Documents | Current Page | Current Size | Page Size |
+------------+-----------------+--------------+--------------+-----------+
|  default   |        1        |      1       |      1       |     20    |
+------------+-----------------+--------------+--------------+-----------+
+-----------------------------------------------------------------------------------------------------------------------------------+
|                                                     Documents of space default                                                    |
+------------+-------------+---------------+----------+--------+----------------------------+----------+----------------------------+
| Space Name | Document ID | Document Name |   Type   | Chunks |         Last Sync          |  Status  |           Result           |
+------------+-------------+---------------+----------+--------+----------------------------+----------+----------------------------+
|  default   |      61     | Knowledge.pdf | DOCUMENT |  745   | 2023-09-28T03:25:39.065762 | FINISHED | document embedding success |
+------------+-------------+---------------+----------+--------+----------------------------+----------+----------------------------+
```
#### List chunks of document in space `default`
```bash
dbgpt knowledge list --space_name default --doc_id 61 --page_size 5
```
Output should look something like the following:
```
+-----------------------------------------------------------------------------------+
|                         Document 61 in default description                        |
+------------+-------------+--------------+--------------+--------------+-----------+
| Space Name | Document ID | Total Chunks | Current Page | Current Size | Page Size |
+------------+-------------+--------------+--------------+--------------+-----------+
|  default   |      61     |     745      |      1       |      5       |     5     |
+------------+-------------+--------------+--------------+--------------+-----------+
+-----------------------------------------------------------------------------------------------------------------------+
|                                       chunks of document id 61 in space default                                       |
+------------+-------------+---------------+----------+-----------------------------------------------------------------+
| Space Name | Document ID | Document Name | Content  |                            Meta Data                            |
+------------+-------------+---------------+----------+-----------------------------------------------------------------+
|  default   |      61     | Knowledge.pdf | [Hidden] | {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 10} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 9} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 9} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 8} |
|  default   |      61     | Knowledge.pdf | [Hidden] |  {'source': '/app/pilot/data/default/Knowledge.pdf', 'page': 8} |
+------------+-------------+---------------+----------+-----------------------------------------------------------------+
```
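The summary table reports `Total Chunks` and `Page Size`, which is enough to work out how many pages a document spans. As a small worked example (the helper name `page_count` is made up for illustration, and it assumes every page except the last is full), 745 chunks at a page size of 5 come to 149 pages:

```bash
# Hypothetical helper: number of pages needed to list `total` items at
# `page_size` items per page, using integer ceiling division.
page_count() {
  local total="$1" page_size="$2"
  echo $(( (total + page_size - 1) / page_size ))
}

page_count 745 5    # prints 149
page_count 745 20   # prints 38
```

The last page number computed this way is the largest value worth passing to `--page` for that document.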
#### More list usage
```bash
dbgpt knowledge list --help
```
```
Usage: dbgpt knowledge list [OPTIONS]

  List knowledge space

Options:
  --space_name TEXT               Your knowledge space name. If None, list all
                                  spaces
  --doc_id INTEGER                Your document id in knowledge space. If Not
                                  None, list all chunks in current document
  --page INTEGER                  The page for every query  [default: 1]
  --page_size INTEGER             The page size for every query  [default: 20]
  --show_content                  Query the document content of chunks
  --output [text|html|csv|latex|json]
                                  The output format
  --help                          Show this message and exit.
```
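The paging and output options can be combined to pull one specific page of chunks in a machine-readable format. This is a sketch under assumptions: the helper name `list_chunks_page` is hypothetical, the page size of 5 is arbitrary, and the exact layout of the `csv` output is not documented here, so inspect it before building on it.

```bash
# Hypothetical helper: print one page of chunks for a document as CSV.
# All flags come from `dbgpt knowledge list --help`; the page size is a
# placeholder value.
list_chunks_page() {
  local space_name="$1"   # knowledge space that holds the document
  local doc_id="$2"       # document ID, as shown by `dbgpt knowledge list`
  local page="$3"         # page number, starting at 1
  dbgpt knowledge list \
    --space_name "$space_name" \
    --doc_id "$doc_id" \
    --page "$page" \
    --page_size 5 \
    --output csv
}
```

For example, `list_chunks_page default 61 2 > chunks_page_2.csv` would save the second page of chunks for document 61 to a file.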
### Delete your knowledge space or document in space
#### Delete your knowledge space
```bash
dbgpt knowledge delete --space_name default
```
#### Delete your document in space
```bash
dbgpt knowledge delete --space_name default --doc_name Knowledge.pdf
```
#### More delete usage
```bash
dbgpt knowledge delete --help
```
```
Usage: dbgpt knowledge delete [OPTIONS]

  Delete your knowledge space or document in space

Options:
  --space_name TEXT  Your knowledge space name  [default: default]
  --doc_name TEXT    The document name you want to delete. If doc_name is
                     None, this command will delete the whole space.
  -y                 Confirm your choice
  --help             Show this message and exit.
```
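Because omitting `--doc_name` deletes the whole space, a script that deletes single documents may want a guard against an empty name. The sketch below assumes that `-y` answers the confirmation in advance, which is how the help text reads; the helper name `delete_doc` is hypothetical.

```bash
# Hypothetical helper: delete one document from a space without prompting.
delete_doc() {
  local space_name="$1" doc_name="$2"
  # Refuse an empty document name: per the help text, a delete without
  # --doc_name removes the whole space.
  if [ -z "$doc_name" ]; then
    echo "delete_doc: document name is required" >&2
    return 1
  fi
  dbgpt knowledge delete --space_name "$space_name" --doc_name "$doc_name" -y
}
```

Calling `delete_doc default Knowledge.pdf` would then remove only that document, while `delete_doc default ""` fails before `dbgpt` is invoked.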
### More knowledge usage
```bash
dbgpt knowledge --help
```
```
Usage: dbgpt knowledge [OPTIONS] COMMAND [ARGS]...

  Knowledge command line tool

Options:
  --address TEXT  Address of the Api server(If not set, try to read from
                  environment variable: API_ADDRESS).  [default:
                  http://127.0.0.1:5000]
  --help          Show this message and exit.

Commands:
  delete  Delete your knowledge space or document in space
  list    List knowledge space
  load    Load your local documents to DB-GPT
```
```
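The usage line `dbgpt knowledge [OPTIONS] COMMAND [ARGS]...` means that `--address` is an option of the `knowledge` group and is written before the subcommand. A minimal sketch follows; the helper name `list_spaces_on` is hypothetical.

```bash
# Hypothetical helper: list the knowledge spaces served by a given API server.
# `--address` belongs to the `knowledge` group, so it is placed before the
# subcommand (`list`) rather than after it.
list_spaces_on() {
  local address="$1"   # e.g. http://127.0.0.1:5000, the documented default
  dbgpt knowledge --address "$address" list
}
```

For example, `list_spaces_on http://127.0.0.1:5000` targets the default server. According to the help text, setting the `API_ADDRESS` environment variable is the alternative when `--address` is not given.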