333 Commits

Author SHA1 Message Date
Iván Martínez
a86641cdec Readme small fixes following review and formatting 2023-05-20 11:22:45 +02:00
Iván Martínez
fc50eb1b89 Merge branch 'abhiruka-main' 2023-05-20 11:21:35 +02:00
jiangzhuo
cb7c96b31d Add progress bar to load_documents function
Enhanced the load_documents() function by adding a progress bar using the tqdm library. This change improves user experience by providing real-time feedback on the progress of document loading. Now, users can easily track the progress of this operation, especially when loading a large number of documents.
2023-05-20 11:16:13 +02:00
jiangzhuo
e3b769d33a Optimize load_documents function with multiprocessing 2023-05-20 11:16:13 +02:00
MDW
04f6706bbb Make scripts executeable, add basic pre-commit setup 2023-05-20 11:15:58 +02:00
Iván Martínez
20554a7c9d
Merge pull request #292 from jiangzhuo/feature/multiprocessing-for-document-loading
Optimize load_documents function with multiprocessing
2023-05-20 10:57:42 +02:00
Iván Martínez
b30cd52136
Merge pull request #271 from mdeweerd/executable_python
Make scripts executable, add basic pre-commit setup
2023-05-20 10:49:20 +02:00
Abhiruka
be1bcbca37
Merge branch 'imartinez:main' into main 2023-05-20 07:42:26 +08:00
abhiruka
f8805c80f8 Update as per the feedback.
- moved args parser inside main
- assigned empty list to docs.
- Updated README.md.
2023-05-20 07:40:05 +08:00
MDW
7f918a9fa1 Make scripts executeable, add basic pre-commit setup 2023-05-19 23:21:39 +02:00
Iván Martínez
22945bc91d
Merge pull request #299 from mdeweerd/elm_extended
Add fallback for plain elm #294 #290
2023-05-19 21:40:42 +02:00
abhiruka
9fb7f07e3c "Refactored main function to take hide_source and mute_stream parameters for controlling output. Added argparse for command-line argument parsing. StreamingStdOutCallbackHandler and source document display are now optional based on user input. Introduced parse_arguments function to handle command-line arguments. Also, updated README.md to reflect these changes." 2023-05-19 23:18:31 +08:00
MDW
4cda348cf8 Fix #294 (tested) 2023-05-19 16:23:09 +02:00
jiangzhuo
ba0dbe8d1c Add progress bar to load_documents function
Enhanced the load_documents() function by adding a progress bar using the tqdm library. This change improves user experience by providing real-time feedback on the progress of document loading. Now, users can easily track the progress of this operation, especially when loading a large number of documents.
2023-05-19 10:59:38 +09:00
jiangzhuo
81b221bccb Optimize load_documents function with multiprocessing 2023-05-19 10:58:28 +09:00
MDW
a862ff2be6 Add fallback for plain elm #294 #290 2023-05-19 01:04:42 +02:00
Iván Martínez
ad64589c8f
Merge pull request #231 from milescattini/patch-1
Add fix for clang install of non m1 mac
2023-05-18 23:51:36 +02:00
Iván Martínez
b9f8dc312f
Merge pull request #254 from Fabio3rs/formatOffice97-2003
Add .doc .ppt (Word and PowerPoint 97/2003 formats)
2023-05-18 23:49:40 +02:00
Iván Martínez
1590c5890f Update requirements 2023-05-18 23:23:11 +02:00
impulsivus
7844553ca1
Implement a way of ingesting more documents
Move environment variables to the global scope
Add a better check for vectorstore existence
Introduced a new function for better readability
Co-authored-by: Pulp <51127079+PulpCattel@users.noreply.github.com>
2023-05-18 17:45:38 +03:00
Iván Martínez
42046c5ec0
Merge pull request #268 from vilaca/dotenv-called-twice
remove duplicate call 'load_dotenv()' in ingester.py
2023-05-18 15:15:17 +02:00
milescattini
2360728fab
Fix Typo in Mac on Intel 2023-05-18 18:02:54 +10:00
Fabio Rossini Sluzala
ec126b51d8
Fix loader mapping order 2023-05-17 22:38:30 -03:00
vilaca
79a3c00313 remove duplicate 2023-05-17 23:45:27 +01:00
Fabio Rossini Sluzala
652401cf29
Add the formats to the README.md 2023-05-17 13:53:46 -03:00
Fabio Rossini Sluzala
66a9f9cde0
Add .doc .ppt (Word and PowerPoint 97/2003 formats) 2023-05-17 12:04:16 -03:00
Iván Martínez
355b4be7c0
Merge pull request #224 from imartinez/feature/sentence-transformers-embeddings
Feature/sentence transformers embeddings
2023-05-17 10:56:34 +02:00
Iván Martínez
83797ec08b
Merge pull request #240 from zishon89us/patch-1
pypandoc-binary replacing pandoc-binary
2023-05-17 09:25:14 +02:00
Zeeshan Hassan Memon
dd144bba16
pypandoc-binary replacing pandoc-binary 2023-05-17 11:27:43 +05:00
milescattini
380b119581
Add fix for clang install of non m1 mac 2023-05-17 11:48:35 +10:00
Iván Martínez
90798f1986 Merge branch 'main' into feature/sentence-transformers-embeddings 2023-05-17 01:00:13 +02:00
Iván Martínez
bf3bddfbb6 More loaders, generic method
- Update the README with extra formats
- Add Powerpoint, requested in #138
- Add ePub requested in #138 comment - https://github.com/imartinez/privateGPT/pull/138#issuecomment-1549564535
- Update requirements
2023-05-17 00:55:21 +02:00
Iván Martínez
fdb45741e5
Merge pull request #211 from mdeweerd/extra_loaders
More loaders, generic method
2023-05-17 00:39:37 +02:00
Iván Martínez
23d24c88e9 Update code to use sentence-transformers through huggingfaceembeddings 2023-05-17 00:32:41 +02:00
Iván Martínez
8a5b2f453b Use faster and better embeddings: sentenceTransformers 2023-05-17 00:19:21 +02:00
Iván Martínez
2217b5f0e3 More loaders, generic method
- Update the README with extra formats
- Add Powerpoint, requested in #138
- Add ePub requested in #138 comment - https://github.com/imartinez/privateGPT/pull/138#issuecomment-1549564535
- Update requirements
2023-05-16 23:58:58 +02:00
Iván Martínez
b6f007dbb8
Update issue templates 2023-05-16 20:44:30 +02:00
Iván Martínez
9e94a3cd40
Update issue templates 2023-05-16 20:12:34 +02:00
Iván Martínez
f42d3e0ce2
Merge pull request #168 from andreakiro/fix/requirements
Add python-dotenv to requirements
2023-05-16 19:32:11 +02:00
Andrea Pinto
7ae80e6629 add python-dotenv to requirements 2023-05-15 19:19:10 +02:00
Iván Martínez
5a695e9767
Merge pull request #93 from katojunichi893/main
Update README.md
2023-05-14 10:55:12 +02:00
Iván Martínez
a061270bf0
Merge pull request #105 from koushkv/patch-1
fixed a typo
2023-05-14 10:42:25 +02:00
Iván Martínez
7612193031
Merge pull request #64 from FluffyDietEngine/main
added library for parsing PDFs
2023-05-14 10:39:38 +02:00
katojunichi893
9c3832c156 Update README.md 2023-05-14 17:36:40 +09:00
Koushik
2dac62c5aa
fixed a typo 2023-05-14 10:26:13 +05:30
ひかる
24e464f51b
Update README.md 2023-05-14 04:18:17 +09:00
Iván Martínez
b76a240714
Merge pull request #74 from andreakiro/fix/load-documents
Ingest unlimited number of documents
2023-05-13 10:36:57 +02:00
Andrea Pinto
d0aa57178a ingest unlimited number of documents 2023-05-12 15:36:20 +02:00
Iván Martínez
271673ffcc
Merge pull request #68 from andreakiro/readme/updates
Note on instructions for .env
2023-05-12 11:33:51 +02:00
Iván Martínez
034fde4c3e
Merge pull request #67 from andreakiro/fix/persist-dir
Fix persist db directory at ingestion
2023-05-12 11:31:53 +02:00