privateGPT/fern/docs/pages/manual/llms.mdx
lopagela 36f69eed0f
Refactor documentation architecture (#1264)
Co-authored-by: dannysheridan <danny@buildwithfern.com>

2023-11-19 18:46:09 +01:00


## Running the Server
PrivateGPT supports running with different LLMs & setups.
### Local models
Both the LLM and the Embeddings model will run locally.
Make sure you have followed the *Local LLM requirements* section before moving on.
The following command starts PrivateGPT using `settings.yaml` (the default profile) together with the `settings-local.yaml`
configuration file. By default, it enables both the API and the Gradio UI. Run:
```bash
PGPT_PROFILES=local make run
```
or
```bash
PGPT_PROFILES=local poetry run python -m private_gpt
```
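The mapping from `PGPT_PROFILES` to configuration files can be sketched as follows. This is only an illustration of the naming convention described above, not PrivateGPT's actual loader: the function name `profile_files` is hypothetical, and support for a comma-separated list of profiles is an assumption.

```python
def profile_files(pgpt_profiles: str) -> list[str]:
    """Return the settings files implied by a PGPT_PROFILES value."""
    # The default profile (settings.yaml) is always loaded first.
    files = ["settings.yaml"]
    # Assumption: several profiles may be given as a comma-separated list.
    for name in pgpt_profiles.split(","):
        name = name.strip()
        if name:
            files.append(f"settings-{name}.yaml")
    return files


print(profile_files("local"))  # ['settings.yaml', 'settings-local.yaml']
```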
Once the server has started, it logs *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI, or to http://localhost:8001/docs (API section) to try the API
using Swagger UI.
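If you start the server from a script, you may want to wait until it is reachable before sending requests. A minimal sketch using only the Python standard library is shown here. The helper name `wait_for_server` and its parameters are hypothetical, and it only assumes that the `/docs` URL mentioned above answers with HTTP 200 once the server is up.

```python
import time
import urllib.request


def wait_for_server(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection errors and urllib's URLError/HTTPError:
            # the server is not ready yet, so retry after a short pause.
            pass
        time.sleep(interval)
    return False
```

For example, `wait_for_server("http://localhost:8001/docs")` returns `True` as soon as the Swagger UI page is served, or `False` if nothing answers within 60 seconds.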
### Using OpenAI
If you cannot run a local model (for example, because you don't have a GPU), or you only want to test, you can
run PrivateGPT using OpenAI for both the LLM and the Embeddings model.
To do so, create a profile `settings-openai.yaml` with the following contents:
```yaml
llm:
  mode: openai
openai:
  api_key: <your_openai_api_key>  # You could skip this configuration and use the OPENAI_API_KEY env var instead
```
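The choice between the `api_key` setting and the `OPENAI_API_KEY` environment variable can be pictured with the sketch below. It is not PrivateGPT's code: the function `resolve_openai_api_key` is hypothetical, the profile is assumed to parse into a nested mapping with a top-level `openai` section, and the precedence (profile value first, environment variable second) is an assumption made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_openai_api_key(
    settings: Mapping, environ: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the API key from the profile, else from OPENAI_API_KEY, else None."""
    # Assumption: a key written in the profile wins over the environment.
    configured = (settings.get("openai") or {}).get("api_key")
    if configured:
        return configured
    # Fall back to the environment variable named in the comment above.
    return environ.get("OPENAI_API_KEY") or None
```

A result of `None` would mean that neither source provides a key, which is worth checking before starting the server with the `openai` profile.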
Then run PrivateGPT with the profile you just created:
`PGPT_PROFILES=openai make run`
or
`PGPT_PROFILES=openai poetry run python -m private_gpt`
Once the server has started, it logs *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI, or to http://localhost:8001/docs (API section) to try the API.
You will likely notice that responses are faster and of higher quality, since the heavy
computation runs on OpenAI's servers.
### Using AWS SageMaker
For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using SageMaker.
Note: how to deploy models on SageMaker is outside the scope of this documentation.
To use this setup, create a profile `settings-sagemaker.yaml` with the following contents (remember to
replace the values of `llm_endpoint_name` and `embedding_endpoint_name` with your own endpoint names):
```yaml
llm:
  mode: sagemaker
sagemaker:
  llm_endpoint_name: huggingface-pytorch-tgi-inference-2023-09-25-19-53-32-140
  embedding_endpoint_name: huggingface-pytorch-inference-2023-11-03-07-41-36-479
```
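Since both endpoint names must be replaced with your own, a quick sanity check on the profile can catch a missing value before the server starts. The sketch below works on an already-parsed profile and assumes it is a nested mapping with top-level `llm` and `sagemaker` sections. The function `check_sagemaker_profile` and the example endpoint names are hypothetical and not part of PrivateGPT.

```python
REQUIRED_SAGEMAKER_KEYS = ("llm_endpoint_name", "embedding_endpoint_name")


def check_sagemaker_profile(settings: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the profile looks complete."""
    problems = []
    if (settings.get("llm") or {}).get("mode") != "sagemaker":
        problems.append("llm.mode is not 'sagemaker'")
    section = settings.get("sagemaker") or {}
    for key in REQUIRED_SAGEMAKER_KEYS:
        if not section.get(key):
            problems.append(f"sagemaker.{key} is missing or empty")
    return problems


# Hypothetical endpoint names, for illustration only.
profile = {
    "llm": {"mode": "sagemaker"},
    "sagemaker": {"llm_endpoint_name": "my-llm-endpoint"},
}
print(check_sagemaker_profile(profile))
# ['sagemaker.embedding_endpoint_name is missing or empty']
```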
Then run PrivateGPT with the profile you just created:
`PGPT_PROFILES=sagemaker make run`
or
`PGPT_PROFILES=sagemaker poetry run python -m private_gpt`
Once the server has started, it logs *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI, or to http://localhost:8001/docs (API section) to try the API.