feat(ingest): Created a faster ingestion mode - pipeline (#1750)

* Unify pgvector and postgres connection settings

* Remove local changes

* Update file pgvector->postgres

* postgresql should be postgres

* Adding pipeline ingestion mode

* disable hugging face parallelism.  Continue on file to doc transform failure

* Semaphore to limit docq async workers. ETA reporting
This commit is contained in:
Brett England
2024-03-19 16:24:46 -04:00
committed by GitHub
parent 1efac6a3fe
commit 134fc54d7d
5 changed files with 301 additions and 2 deletions

View File

@@ -155,13 +155,14 @@ class HuggingFaceSettings(BaseModel):
class EmbeddingSettings(BaseModel):
mode: Literal["huggingface", "openai", "azopenai", "sagemaker", "ollama", "mock"]
ingest_mode: Literal["simple", "batch", "parallel"] = Field(
ingest_mode: Literal["simple", "batch", "parallel", "pipeline"] = Field(
"simple",
description=(
"The ingest mode to use for the embedding engine:\n"
"If `simple` - ingest files sequentially and one by one. It is the historic behaviour.\n"
"If `batch` - if multiple files, parse all the files in parallel, "
"and send them in batch to the embedding model.\n"
"In `pipeline` - The Embedding engine is kept as busy as possible\n"
"If `parallel` - parse the files in parallel using multiple cores, and embedd them in parallel.\n"
"`parallel` is the fastest mode for local setup, as it parallelize IO RW in the index.\n"
"For modes that leverage parallelization, you can specify the number of "
@@ -174,6 +175,7 @@ class EmbeddingSettings(BaseModel):
"The number of workers to use for file ingestion.\n"
"In `batch` mode, this is the number of workers used to parse the files.\n"
"In `parallel` mode, this is the number of workers used to parse the files and embed them.\n"
"In `pipeline` mode, this is the number of workers that can perform embeddings.\n"
"This is only used if `ingest_mode` is not `simple`.\n"
"Do not go too high with this number, as it might cause memory issues. (especially in `parallel` mode)\n"
"Do not set it higher than your number of threads of your CPU."