🔒 PrivateGPT 📑

[Demo]

PrivateGPT is a production-ready AI project that lets you ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private: no data leaves your execution environment at any point.

The project provides an API offering all the primitives required to build private, context-aware AI applications. It follows and extends the OpenAI API standard, and supports both normal and streaming responses.

The API is divided into two logical blocks:

High-level API, which abstracts all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:

  • Ingestion of documents: internally managing document parsing, splitting, metadata extraction, embedding generation and storage.
  • Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt engineering and the response generation.
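
As a rough illustration, ingesting a document and then chatting over it could look like the following. This is a minimal sketch, not an authoritative reference: it assumes the server listens on localhost:8001 and exposes /v1/ingest and /v1/chat/completions endpoints with a use_context flag; check the API documentation for the exact paths and schemas.

    import requests

    BASE = "http://localhost:8001"  # assumed local development address

    # Ingest a local file; parsing, splitting, embedding and storage
    # all happen server-side as described above.
    with open("my_report.pdf", "rb") as f:
        requests.post(f"{BASE}/v1/ingest", files={"file": f}).raise_for_status()

    # Ask a question grounded in the ingested documents.
    reply = requests.post(
        f"{BASE}/v1/chat/completions",
        json={
            "messages": [{"role": "user", "content": "What does the report conclude?"}],
            "use_context": True,  # assumed flag enabling retrieval over ingested docs
        },
    )
    print(reply.json())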

Low-level API, which allows advanced users to implement their own complex pipelines:

  • Embeddings generation: based on a piece of text.
  • Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.
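
The low-level primitives can be called in the same way. Again a hedged sketch (the /v1/embeddings and /v1/chunks paths and field names are assumptions; the generated OpenAPI schema is the source of truth):

    import requests

    BASE = "http://localhost:8001"  # assumed local development address

    # Embeddings generation for a piece of text.
    emb = requests.post(f"{BASE}/v1/embeddings", json={"input": "a piece of text"})
    print(emb.json())

    # Contextual chunks retrieval: the most relevant chunks for a query.
    chunks = requests.post(f"{BASE}/v1/chunks", json={"text": "quarterly revenue"})
    print(chunks.json())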

In addition, a working Gradio UI client is provided to test the API, together with a set of useful tools such as a bulk model download script, an ingestion script, a documents folder watch, etc.

👂 Need help applying PrivateGPT to your specific use case? Let us know more about it and we'll try to help! We are refining PrivateGPT through your feedback.

🎞️ Overview

DISCLAIMER: This README is not updated as frequently as the documentation. Please check the docs for the latest updates!

Motivation behind PrivateGPT

Generative AI is a game changer for our society, but adoption by companies of all sizes and in data-sensitive domains like healthcare or legal is limited by a clear concern: privacy. Not being able to ensure that your data stays fully under your control when using third-party AI tools is a risk those industries cannot take.

Primordial version

The first version of PrivateGPT was launched in May 2023 as a novel approach to addressing privacy concerns by using LLMs in a completely offline way. It was built by leveraging existing technologies developed by the thriving Open Source AI community: LangChain, LlamaIndex, GPT4All, LlamaCpp, Chroma and SentenceTransformers.

That version, which rapidly became a go-to project for privacy-sensitive setups and served as the seed for thousands of local-focused generative AI projects, was the foundation of what PrivateGPT is becoming today. It remains a simpler, more educational implementation for understanding the basic concepts required to build a fully local (and therefore private) chatGPT-like tool.

If you want to keep experimenting with it, we have saved it in the primordial branch of the project.

If you come from the previous, primordial version, it is strongly recommended that you do a clean clone and install of this new version of PrivateGPT.

Present and Future of PrivateGPT

PrivateGPT is now evolving toward becoming a gateway to generative AI models and primitives, including completions, document ingestion, RAG pipelines and other low-level building blocks. We want to make it easier for any developer to build AI applications and experiences, and to provide an architecture extensive enough for the community to keep contributing.

Stay tuned to our releases to check out all the new features and changes included.

📄 Documentation

Full documentation on installation, dependencies, configuration, running the server, deployment options, ingesting local documents, API details and UI features can be found here: https://docs.privategpt.dev/

🧩 Architecture

Conceptually, PrivateGPT is an API that wraps a RAG pipeline and exposes its primitives.

The design of PrivateGPT makes it easy to extend and adapt both the API and the RAG implementation. Some key architectural decisions are:

  • Dependency Injection, decoupling the different components and layers.
  • Usage of LlamaIndex abstractions such as LLM, BaseEmbedding or VectorStore, making it straightforward to swap in different implementations of those abstractions.
  • Simplicity, adding as few layers and new abstractions as possible.
  • Ready to use, providing a full implementation of the API and RAG pipeline.

Main building blocks:

  • APIs are defined in private_gpt:server:<api>. Each package contains an <api>_router.py (FastAPI layer) and an <api>_service.py (the service implementation). Each Service uses LlamaIndex base abstractions instead of specific implementations, decoupling the actual implementation from its usage.
  • Components are placed in private_gpt:components:<component>. Each Component is in charge of providing actual implementations of the base abstractions used in the Services - for example, LLMComponent provides an actual implementation of an LLM (such as LlamaCPP or OpenAI).
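
To make the layering concrete, here is a minimal, hypothetical sketch of the pattern. The class names mirror the conventions above, but the stand-in BaseLLM abstraction and all signatures are illustrative, not the project's actual code:

    from abc import ABC, abstractmethod
    from injector import Injector, inject, singleton

    class BaseLLM(ABC):
        """Stand-in for a LlamaIndex-style LLM base abstraction."""
        @abstractmethod
        def complete(self, prompt: str) -> str: ...

    class EchoLLM(BaseLLM):
        """Dummy implementation; a real component would wrap e.g. LlamaCPP."""
        def complete(self, prompt: str) -> str:
            return f"echo: {prompt}"

    @singleton
    class LLMComponent:
        """Component layer: chooses and holds the concrete implementation."""
        def __init__(self) -> None:
            self.llm: BaseLLM = EchoLLM()

    class CompletionsService:
        """Service layer: depends on the component, never the implementation."""
        @inject
        def __init__(self, llm_component: LLMComponent) -> None:
            self._llm = llm_component.llm

        def complete(self, prompt: str) -> str:
            return self._llm.complete(prompt)

    # Wiring happens via dependency injection rather than manual construction.
    service = Injector().get(CompletionsService)
    print(service.complete("hello"))

Swapping LlamaCPP for OpenAI (or for a mock in tests) then only touches the component, never the service or the router.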

💡 Contributing

Contributions are welcome! To ensure code quality we have enabled several format and typing checks; just run make check before committing to make sure your code is OK. Remember to test your code! You'll find a tests folder with helpers, and you can run the tests using the make test command.

Interested in contributing to PrivateGPT? These are the challenges ahead of us, in case you want to give a hand:

Improvements

  • Better RAG pipeline implementation (improvements to both indexing and querying stages)
  • Code documentation
  • Expose execution parameters such as top_p, temperature, max_tokens... in Completions and Chat Completions
  • Expose chunk size in Ingest API
  • Implement Update and Delete document in Ingest API
  • Add information about token consumption to each response
  • Add to the Completion APIs (chat and completion) the context docs used to answer the question
  • Return the actual LLM or Embeddings model name used in the “model” field

Features

  • Implement a concurrency lock to avoid errors when several calls are made to the local LlamaCPP model
  • API key-based request control to the API
  • CORS support
  • Support for Sagemaker
  • Support Function calling
  • Add md5 hashing to detect files already ingested
  • Select a document to query in the UI
  • Better observability of the RAG pipeline

Project Infrastructure

  • Create a “wipe” shortcut in make to remove all contents of local_data folder except .gitignore
  • Packaged version as a local desktop app (Windows executable, Mac app, Linux app)
  • Dockerize the application for platforms outside Linux (Docker Desktop for Mac and Windows)
  • Document how to deploy to AWS, GCP and Azure

💬 Community

Join the conversation around PrivateGPT on our Twitter and Discord channels.

📖 Citation

Reference to cite if you use PrivateGPT in a paper:

@software{PrivateGPT_2023,
  author = {Martinez, I. and Gallego, D. and Orgaz, P.},
  month = {5},
  title = {PrivateGPT},
  url = {https://github.com/imartinez/privateGPT},
  year = {2023}
}