From e74a11119c79cef28de3b3b7f4664bbd7ae6a9b1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Iv=C3=A1n=20Mart=C3=ADnez?=
Date: Sat, 20 May 2023 12:15:13 +0200
Subject: [PATCH] Show ingestion logs in readme

---
 README.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/README.md b/README.md
index f2632d54..5719ebfe 100644
--- a/README.md
+++ b/README.md
@@ -56,6 +56,19 @@ Run the following command to ingest all the data.
 python ingest.py
 ```
 
+Output should look like this:
+
+```shell
+Creating new vectorstore
+Loading documents from source_documents
+Loading new documents: 100%|██████████████████████| 1/1 [00:01<00:00, 1.73s/it]
+Loaded 1 new documents from source_documents
+Split into 90 chunks of text (max. 500 tokens each)
+Creating embeddings. May take some minutes...
+Using embedded DuckDB with persistence: data will be stored in: db
+Ingestion complete! You can now run privateGPT.py to query your documents
+```
+
 It will create a `db` folder containing the local vectorstore. Will take 20-30 seconds per document, depending on the size of the document.
 You can ingest as many documents as you want, and all will be accumulated in the local embeddings database.
 If you want to start from an empty database, delete the `db` folder.