privateGPT/private_gpt/server/chunks/chunks_service.py
Pablo Orgaz 51cc638758
Next version of PrivateGPT (#1077)
* Dockerize private-gpt

* Use port 8001 for local development

* Add setup script

* Add CUDA Dockerfile

* Create README.md

* Make the API use OpenAI response format

* Truncate prompt

* refactor: add models and __pycache__ to .gitignore

* Better naming

* Update readme

* Move models ignore to it's folder

* Add scaffolding

* Apply formatting

* Fix tests

* Working sagemaker custom llm

* Fix linting

* Fix linting

* Enable streaming

* Allow all 3.11 python versions

* Use llama 2 prompt format and fix completion

* Restructure (#3)

Co-authored-by: Pablo Orgaz <pablo@Pablos-MacBook-Pro.local>

* Fix Dockerfile

* Use a specific build stage

* Cleanup

* Add FastAPI skeleton

* Cleanup openai package

* Fix DI and tests

* Split tests and tests with coverage

* Remove old scaffolding

* Add settings logic (#4)

* Add settings logic

* Add settings for sagemaker

---------

Co-authored-by: Pablo Orgaz <pablo@Pablos-MacBook-Pro.local>

* Local LLM (#5)

* Add settings logic

* Add settings for sagemaker

* Add settings-local-example.yaml

* Delete terraform files

* Refactor tests to use fixtures

* Join deltas

* Add local model support

---------

Co-authored-by: Pablo Orgaz <pablo@Pablos-MacBook-Pro.local>

* Update README.md

* Fix tests

* Version bump

* Enable simple llamaindex observability (#6)

* Enable simple llamaindex observability

* Improve code through linting

* Update README.md

* Move to async (#7)

* Migrate implementation to use asyncio

* Formatting

* Cleanup

* Linting

---------

Co-authored-by: Pablo Orgaz <pablo@Pablos-MacBook-Pro.local>

* Query Docs and gradio UI

* Remove unnecessary files

* Git ignore chromadb folder

* Async migration + DI Cleanup

* Fix tests

* Add integration test

* Use fastapi responses

* Retrieval service with partial implementation

* Cleanup

* Run formatter

* Fix types

* Fetch nodes asynchronously

* Install local dependencies in tests

* Install ui dependencies in tests

* Install dependencies for llama-cpp

* Fix sudo

* Attempt to fix cuda issues

* Attempt to fix cuda issues

* Try to reclaim some space from ubuntu machine

* Retrieval with context

* Fix lint and imports

* Fix mypy

* Make retrieval API a POST

* Make Completions body a dataclass

* Fix LLM chat message order

* Add Query Chunks to Gradio UI

* Improve rag query prompt

* Rollback CI Changes

* Move to sync code

* Using Llamaindex abstraction for query retrieval

* Fix types

* Default to CONDENSED chat mode for contextualized chat

* Rename route function

* Add Chat endpoint

* Remove webhooks

* Add IntelliJ run config to gitignore

* .gitignore applied

* Sync chat completion

* Refactor total

* Typo in context_files.py

* Add embeddings component and service

* Remove wrong dataclass from IngestService

* Filter by context file id implementation

* Fix typing

* Implement context_filter and separate from the bool use_context in the API

* Change chunks api to avoid conceptual class of the context concept

* Deprecate completions and fix tests

* Remove remaining dataclasses

* Use embedding component in ingest service

* Fix ingestion to have multipart and local upload

* Fix ingestion API

* Add chunk tests

* Add configurable paths

* Cleaning up

* Add more docs

* IngestResponse includes a list of IngestedDocs

* Use IngestedDoc in the Chunk document reference

* Rename ingest routes to ingest_router.py

* Fix test working directory for intellij

* Set testpaths for pytest

* Remove unused as_chat_engine

* Add .fleet ide to gitignore

* Make LLM and Embedding model configurable

* Fix imports and checks

* Let local_data folder exist empty in the repository

* Don't use certain metadata in LLM

* Remove long lines

* Fix windows installation

* Typos

* Update poetry.lock

* Add TODO for linux

* Script and first version of docs

* No jekill build

* Fix relative url to openapi json

* Change default docs values

* Move chromadb dependency to the general group

* Fix tests to use separate local_data

* Create CNAME

* Update CNAME

* Fix openapi.json relative path

* PrivateGPT logo

* WIP OpenAPI documentation metadata

* Add ingest script (#11)

* Add ingest script

* Fix broken name refactor

* Add ingest docs and Makefile script

* Linting

* Move transformers to main dependency

* Move torch to main dependencies

* Don't load HuggingFaceEmbedding in tests

* Fix lint

---------

Co-authored-by: Pablo Orgaz <pablo@Pablos-MacBook-Pro.local>

* Rename file to camel_case

* Commit settings-local.yaml

* Move documentation to public docs

* Fix docker image for linux

* Installation and Running the Server documentation

* Move back to docs folder, as it is the only supported by github pages

* Delete CNAME

* Create CNAME

* Delete CNAME

* Create CNAME

* Improved API documentation

* Fix lint

* Completions documentation

* Updated openapi scheme

* Ingestion API doc

* Minor doc changes

* Updated openapi scheme

* Chunks API documentation

* Embeddings and Health API, and homogeneous responses

* Revamp README with new skeleton of content

* More docs

* PrivateGPT logo

* Improve UI

* Update ingestion docu

* Update README with new sections

* Use context window in the retriever

* Gradio Documentation

* Add logo to UI

* Include Contributing and Community sections to README

* Update links to resources in the README

* Small README.md updates

* Wrap lines of README.md

* Don't put health under /v1

* Add copy button to Chat

* Architecture documentation

* Updated openapi.json

* Updated openapi.json

* Updated openapi.json

* Change UI label

* Update documentation

* Add releases link to README.md

* Gradio avatar and stop debug

* Readme update

* Clean old files

* Remove unused terraform checks

* Update twitter link.

* Disable minimum coverage

* Clean install message in README.md

---------

Co-authored-by: Pablo Orgaz <pablo@Pablos-MacBook-Pro.local>
Co-authored-by: Iván Martínez <ivanmartit@gmail.com>
Co-authored-by: RubenGuerrero <ruben.guerrero@boopos.com>
Co-authored-by: Daniel Gallego Vico <daniel.gallego@bq.com>
2023-10-19 16:04:35 +02:00

120 lines
4.3 KiB
Python

from typing import TYPE_CHECKING
from injector import inject, singleton
from llama_index import ServiceContext, StorageContext, VectorStoreIndex
from llama_index.schema import NodeWithScore
from pydantic import BaseModel, Field
from private_gpt.components.embedding.embedding_component import EmbeddingComponent
from private_gpt.components.llm.llm_component import LLMComponent
from private_gpt.components.node_store.node_store_component import NodeStoreComponent
from private_gpt.components.vector_store.vector_store_component import (
VectorStoreComponent,
)
from private_gpt.open_ai.extensions.context_filter import ContextFilter
from private_gpt.server.ingest.ingest_service import IngestedDoc
if TYPE_CHECKING:
from llama_index.schema import RelatedNodeInfo
class Chunk(BaseModel):
object: str = Field(enum=["context.chunk"])
score: float = Field(examples=[0.023])
document: IngestedDoc
text: str = Field(examples=["Outbound sales increased 20%, driven by new leads."])
previous_texts: list[str] | None = Field(
examples=[["SALES REPORT 2023", "Inbound didn't show major changes."]]
)
next_texts: list[str] | None = Field(
examples=[
[
"New leads came from Google Ads campaign.",
"The campaign was run by the Marketing Department",
]
]
)
@singleton
class ChunksService:
@inject
def __init__(
self,
llm_component: LLMComponent,
vector_store_component: VectorStoreComponent,
embedding_component: EmbeddingComponent,
node_store_component: NodeStoreComponent,
) -> None:
self.vector_store_component = vector_store_component
self.storage_context = StorageContext.from_defaults(
vector_store=vector_store_component.vector_store,
docstore=node_store_component.doc_store,
index_store=node_store_component.index_store,
)
self.query_service_context = ServiceContext.from_defaults(
llm=llm_component.llm, embed_model=embedding_component.embedding_model
)
def _get_sibling_nodes_text(
self, node_with_score: NodeWithScore, related_number: int, forward: bool = True
) -> list[str]:
explored_nodes_texts = []
current_node = node_with_score.node
for _ in range(related_number):
explored_node_info: RelatedNodeInfo | None = (
current_node.next_node if forward else current_node.prev_node
)
if explored_node_info is None:
break
explored_node = self.storage_context.docstore.get_node(
explored_node_info.node_id
)
explored_nodes_texts.append(explored_node.get_content())
current_node = explored_node
return explored_nodes_texts
def retrieve_relevant(
self,
text: str,
context_filter: ContextFilter | None = None,
limit: int = 10,
prev_next_chunks: int = 0,
) -> list[Chunk]:
index = VectorStoreIndex.from_vector_store(
self.vector_store_component.vector_store,
storage_context=self.storage_context,
service_context=self.query_service_context,
show_progress=True,
)
vector_index_retriever = self.vector_store_component.get_retriever(
index=index, context_filter=context_filter, similarity_top_k=limit
)
nodes = vector_index_retriever.retrieve(text)
nodes.sort(key=lambda n: n.score or 0.0, reverse=True)
retrieved_nodes = []
for node in nodes:
doc_id = node.node.ref_doc_id if node.node.ref_doc_id is not None else "-"
retrieved_nodes.append(
Chunk(
object="context.chunk",
score=node.score or 0.0,
document=IngestedDoc(
object="ingest.document",
doc_id=doc_id,
doc_metadata=node.metadata,
),
text=node.get_content(),
previous_texts=self._get_sibling_nodes_text(
node, prev_next_chunks, False
),
next_texts=self._get_sibling_nodes_text(node, prev_next_chunks),
)
)
return retrieved_nodes