Iván Martínez
bf3bddfbb6
More loaders, generic method
- Update the README with extra formats
- Add PowerPoint, requested in #138
- Add ePub, requested in a comment on #138: https://github.com/imartinez/privateGPT/pull/138#issuecomment-1549564535
- Update requirements
2023-05-17 00:55:21 +02:00
Iván Martínez
23d24c88e9
Update code to use sentence-transformers through HuggingFaceEmbeddings
2023-05-17 00:32:41 +02:00
Andrea Pinto
d0aa57178a
Ingest an unlimited number of documents
2023-05-12 15:36:20 +02:00
Andrea Pinto
01f55441e7
Fix persist DB directory at ingestion
2023-05-12 10:37:10 +02:00
Sorin Neacsu
544ddd9631
Load .env
2023-05-11 15:34:17 -07:00
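As a rough illustration of what loading a .env file involves, here is a minimal standard-library sketch. It is not the code from this commit; the function name `load_env_file` and its `override` parameter are hypothetical, and a real project would more likely rely on a dedicated library.

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: Path, override: bool = False) -> Dict[str, str]:
    """Parse KEY=VALUE lines from a file and export them to os.environ.

    Hypothetical helper for illustration only; it handles comments,
    blank lines, and simple quoting, nothing more.
    """
    loaded: Dict[str, str] = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop surrounding quotes
        loaded[key] = value
        # Existing environment variables win unless override is requested.
        if override or key not in os.environ:
            os.environ[key] = value
    return loaded
```

Calling `load_env_file(Path(".env"))` before reading configuration would make a setting such as MODEL_TYPE available through `os.environ`.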
alxspiker
f60dbb520e
Merge branch 'main' into main
2023-05-11 14:34:13 -06:00
alxspiker
52ae6c0866
.env + LlamaCpp + PDF/CSV + Ingest All
.env
Added an env file to make configuration easier
LlamaCpp
Added support for LlamaCpp in .env (MODEL_TYPE=LlamaCpp)
PDF/CSV
Added support for PDF and CSV files.
Ingest All
When ingest runs, all files in source_documents are automatically stored in the vector store based on their file type; a path argument is no longer needed.
2023-05-11 14:24:39 -06:00
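The "Ingest All" behavior described in this commit can be pictured as a lookup from file extension to loader. The sketch below is an assumption about the general shape, not the commit's code: `LOADERS`, `load_text`, `load_csv`, and `load_all` are hypothetical names, and it returns plain strings rather than writing to a vector store.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List


def load_text(path: Path) -> str:
    """Read a plain-text file as one document."""
    return path.read_text(encoding="utf-8")


def load_csv(path: Path) -> str:
    """Flatten a CSV file into one document, one row per line."""
    with path.open(newline="", encoding="utf-8") as handle:
        return "\n".join(", ".join(row) for row in csv.reader(handle))


# Hypothetical registry: file extension -> loader function.
LOADERS: Dict[str, Callable[[Path], str]] = {
    ".txt": load_text,
    ".csv": load_csv,
}


def load_all(source_dir: Path) -> List[str]:
    """Load every supported file under source_dir, skipping unknown types."""
    documents: List[str] = []
    for path in sorted(source_dir.rglob("*")):
        loader = LOADERS.get(path.suffix.lower())
        if path.is_file() and loader is not None:
            documents.append(loader(path))
    return documents
```

With a registry like this, supporting another format means adding one entry to the mapping, which is why no path argument is needed at the call site.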
R-Y-M-R
f12ea568e5
Use constants.py file
2023-05-11 10:29:07 -04:00
R-Y-M-R
8c6a81a07f
Fix: Disable Chroma Telemetry
Opts out of anonymized telemetry tracking in Chroma.
See: https://docs.trychroma.com/telemetry
2023-05-11 10:17:18 -04:00
Iván Martínez
026b9f895c
Use RecursiveCharacterTextSplitter to avoid the "llama_tokenize: too many tokens" error during ingestion
2023-05-09 00:21:02 +02:00
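To show why a recursive character splitter keeps chunks under a size limit, here is a simplified standard-library sketch of the idea. It is not the library's RecursiveCharacterTextSplitter: the function name `recursive_split` and the default separator list are assumptions, sizes are counted in characters rather than tokens, and chunk overlap is omitted.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = ("\n\n", "\n", " ", ""),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Tries the coarsest separator first and falls back to finer ones
    only for pieces that are still too long.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character offsets.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    separator, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece is still too long: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks
```

Because every returned chunk is bounded by `chunk_size`, a downstream tokenizer receives bounded inputs, provided the limit is chosen conservatively relative to the model's token limit.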
Iván Martínez
92244a90b4
Use a different text splitter to improve results. Ingest takes an argument pointing to the doc to ingest.
2023-05-05 17:32:31 +02:00
Iván Martínez
55338b8f6e
End-to-end working version
2023-05-02 20:32:28 +02:00