349 Commits

Author SHA1 Message Date
Iván Martínez
8a5b2f453b Use faster and better embeddings: sentenceTransformers 2023-05-17 00:19:21 +02:00
Iván Martínez
2217b5f0e3 More loaders, generic method
- Update the README with extra formats
- Add Powerpoint, requested in #138
- Add ePub, requested in a comment on #138: https://github.com/imartinez/privateGPT/pull/138#issuecomment-1549564535
- Update requirements
2023-05-16 23:58:58 +02:00
Iván Martínez
b6f007dbb8
Update issue templates 2023-05-16 20:44:30 +02:00
Iván Martínez
9e94a3cd40
Update issue templates 2023-05-16 20:12:34 +02:00
Iván Martínez
f42d3e0ce2
Merge pull request #168 from andreakiro/fix/requirements
Add python-dotenv to requirements
2023-05-16 19:32:11 +02:00
Andrea Pinto
7ae80e6629 add python-dotenv to requirements 2023-05-15 19:19:10 +02:00
Iván Martínez
5a695e9767
Merge pull request #93 from katojunichi893/main
Update README.md
2023-05-14 10:55:12 +02:00
Iván Martínez
a061270bf0
Merge pull request #105 from koushkv/patch-1
fixed a typo
2023-05-14 10:42:25 +02:00
Iván Martínez
7612193031
Merge pull request #64 from FluffyDietEngine/main
added library for parsing PDFs
2023-05-14 10:39:38 +02:00
katojunichi893
9c3832c156 Update README.md 2023-05-14 17:36:40 +09:00
Koushik
2dac62c5aa
fixed a typo 2023-05-14 10:26:13 +05:30
ひかる
24e464f51b
Update README.md 2023-05-14 04:18:17 +09:00
Iván Martínez
b76a240714
Merge pull request #74 from andreakiro/fix/load-documents
Ingest unlimited number of documents
2023-05-13 10:36:57 +02:00
Andrea Pinto
d0aa57178a ingest unlimited number of documents 2023-05-12 15:36:20 +02:00
Iván Martínez
271673ffcc
Merge pull request #68 from andreakiro/readme/updates
Note on instructions for .env
2023-05-12 11:33:51 +02:00
Iván Martínez
034fde4c3e
Merge pull request #67 from andreakiro/fix/persist-dir
Fix persist db directory at ingestion
2023-05-12 11:31:53 +02:00
Andrea Pinto
718b67715c note on instructions for .env 2023-05-12 11:15:51 +02:00
Andrea Pinto
01f55441e7 fix persist db directory at ingestion 2023-05-12 10:37:10 +02:00
Santhosh Solomon
6419d0aa1c
added library for parsing PDFs
pdfminer.six==20221105
2023-05-12 09:33:05 +05:30
Iván Martínez
39df61ca07
Merge pull request #58 from sorin/sorin-fix-env
Load .env file
2023-05-12 00:37:05 +02:00
Sorin Neacsu
544ddd9631
load .env 2023-05-11 15:34:17 -07:00
Sorin Neacsu
e947ca1d0f
load .env 2023-05-11 15:33:56 -07:00
Iván Martínez
bc7ce4395b
Merge pull request #53 from alxspiker/main
.env + LlamaCpp + PDF/CSV + Ingest All
2023-05-11 23:22:27 +02:00
alxspiker
39d00b840d
Update README.md 2023-05-11 15:05:07 -06:00
alxspiker
9722ef4356
Update README.md 2023-05-11 15:01:57 -06:00
alxspiker
51f01d850a
Update README.md 2023-05-11 14:53:10 -06:00
alxspiker
f60dbb520e
Merge branch 'main' into main 2023-05-11 14:34:13 -06:00
alxspiker
52ae6c0866 .env + LlamaCpp + PDF/CSV + Ingest All
.env

Added an env file to make configuration easier

LlamaCpp

Added support for LlamaCpp in .env (MODEL_TYPE=LlamaCpp)

PDF/CSV

Added support for PDF and CSV files.

Ingest All

When running ingest, all files in source_documents are now stored in the vector store automatically, based on their file type. A path argument is no longer needed.
2023-05-11 14:24:39 -06:00
Iván Martínez
56c1be36ad
Merge pull request #44 from R-Y-M-R/Fix/DisableChromaTelemetry
Disable chroma telemetry. Extract constants.
2023-05-11 19:38:43 +02:00
Iván Martínez
9c0321235b
Merge pull request #39 from R-Y-M-R/Update/Requirements
Update langchain and llama versions
2023-05-11 19:35:31 +02:00
R-Y-M-R
85528db743 Update langchain to 0.0.166
Tested.

Release: https://github.com/hwchase17/langchain/releases/tag/v0.0.166
2023-05-11 12:37:00 -04:00
R-Y-M-R
f12ea568e5 Use constants.py file 2023-05-11 10:29:07 -04:00
R-Y-M-R
8c6a81a07f Fix: Disable Chroma Telemetry
Opts out of the anonymized telemetry tracked by Chroma.

See: https://docs.trychroma.com/telemetry
2023-05-11 10:17:18 -04:00
R-Y-M-R
918b384e38 Update langchain and llama versions
Bumped versions in requirements.txt, tested OK.

langchain 0.0.165 release: https://github.com/hwchase17/langchain/releases/tag/v0.0.165

llama 0.1.48 release: https://github.com/abetlen/llama-cpp-python/releases/tag/v0.1.48
2023-05-11 09:50:40 -04:00
Iván Martínez
60225698b6
Merge pull request #35 from R-Y-M-R/Fix/urllib3
Add urllib3 fix to requirements.txt
2023-05-11 14:32:28 +02:00
R-Y-M-R
54d14a6cb6 Resolve #17: Add urllib3 fix to requirements.txt
Applied fix from @abereghici to requirements.txt
2023-05-11 06:26:04 -04:00
Iván Martínez
2841fe45e1
Merge pull request #22 from 0mlml/patch-1
Fix typo in README.md
2023-05-10 14:52:11 +02:00
Max
e3769a060e
Fix typo in README.md 2023-05-10 08:17:39 -04:00
Iván Martínez
026b9f895c Use RecursiveCharacterTextSplitter to avoid llama_tokenize: too many tokens error during ingestion 2023-05-09 00:21:02 +02:00
Iván Martínez
75a1141743
Update README.md
Reflect the updated execution flow
2023-05-08 23:49:54 +02:00
Iván Martínez
34cb82c784
Update README.md 2023-05-08 23:47:09 +02:00
Iván Martínez
ab30465be7
Update README.md
Add demo screenshot
2023-05-08 23:44:43 +02:00
Iván Martínez
bdd8c8748b Update dependencies. Remove custom gpt4all_j wrapper. 2023-05-08 23:41:57 +02:00
Iván Martínez
92244a90b4 Use a different text splitter to improve results. Ingest takes an argument pointing to the doc to ingest. 2023-05-05 17:32:31 +02:00
Iván Martínez
a05186b598
Merge pull request #3 from mkinney/main
pin pygptj
2023-05-04 08:33:15 +02:00
Mike Kinney
5128704a8e pin pygptj 2023-05-03 23:29:31 -07:00
Iván martínez
77447e50c0 Complete readme. Fixed reference in gpt4all_j wrapper 2023-05-02 20:33:16 +02:00
Iván martínez
55338b8f6e End-to-end working version 2023-05-02 20:32:28 +02:00
Iván Martínez
51dae80058
Initial commit 2023-05-02 11:15:31 +02:00