templates: Add NVIDIA Canonical RAG example chain (#15758)

- **Description:** Adds a RAG template that uses NVIDIA AI playground
and embedding models, along with Milvus vector store

- **Dependencies:** This template depends on the AI playground service
in NVIDIA NGC. API keys with a significant trial compute are available
(10k queries at the time of writing). This template also depends on the
Milvus Vector store which is publicly available.

Note: [A quick link to get a
key](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/codellama-13b/api)
when you have an NGC account. Generate Key button at the top right of
the code window.

---------

Co-authored-by: Sagar B Manjunath <sbogadimanju@nvidia.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
This commit is contained in:
Sagar B Manjunath 2024-01-11 08:09:16 +05:30 committed by GitHub
parent 38523d7c57
commit e6240fecab
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 2562 additions and 0 deletions

View File

@ -0,0 +1,21 @@
MIT License
Copyright (c) 2023 LangChain, Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@ -0,0 +1,121 @@
# nvidia-rag-canonical
This template performs RAG using Milvus Vector Store and NVIDIA Models (Embedding and Chat).
## Environment Setup
You should export your NVIDIA API Key as an environment variable.
If you do not have an NVIDIA API Key, you can create one by following these steps:
1. Create a free account with the [NVIDIA GPU Cloud](https://catalog.ngc.nvidia.com/) service, which hosts AI solution catalogs, containers, models, etc.
2. Navigate to `Catalog > AI Foundation Models > (Model with API endpoint)`.
3. Select the `API` option and click `Generate Key`.
4. Save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints.
```shell
export NVIDIA_API_KEY=...
```
For instructions on hosting the Milvus Vector Store, refer to the section at the bottom.
## Usage
To use this package, you should first have the LangChain CLI installed:
```shell
pip install -U langchain-cli
```
To use the NVIDIA models, install the Langchain NVIDIA AI Endpoints package:
```shell
pip install -U langchain_nvidia_aiplay
```
To create a new LangChain project and install this as the only package, you can do:
```shell
langchain app new my-app --package nvidia-rag-canonical
```
If you want to add this to an existing project, you can just run:
```shell
langchain app add nvidia-rag-canonical
```
And add the following code to your `server.py` file:
```python
from nvidia_rag_canonical import chain as rag_nvidia_chain
add_routes(app, rag_nvidia_chain, path="/nvidia-rag")
```
If you want to set up an ingestion pipeline, you can add the following code to your `server.py` file:
```python
from rag_nvidia_canonical import ingest as rag_nvidia_ingest
add_routes(app, rag_nvidia_ingest, path="/nvidia-rag-ingest")
```
Note that for files ingested by the ingestion API, the server will need to be restarted for the newly ingested files to be accessible by the retriever.
(Optional) Let's now configure LangSmith.
LangSmith will help us trace, monitor and debug LangChain applications.
LangSmith is currently in private beta, you can sign up [here](https://smith.langchain.com/).
If you don't have access, you can skip this section
```shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # if not specified, defaults to "default"
```
If you DO NOT already have a Milvus Vector Store you want to connect to, see `Milvus Setup` section below before proceeding.
If you DO have a Milvus Vector Store you want to connect to, edit the connection details in `nvidia_rag_canonical/chain.py`
If you are inside this directory, then you can spin up a LangServe instance directly by:
```shell
langchain serve
```
This will start the FastAPI app with a server is running locally at
[http://localhost:8000](http://localhost:8000)
We can see all templates at [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs)
We can access the playground at [http://127.0.0.1:8000/nvidia-rag/playground](http://127.0.0.1:8000/nvidia-rag/playground)
We can access the template from code with:
```python
from langserve.client import RemoteRunnable
runnable = RemoteRunnable("http://localhost:8000/nvidia-rag")
```
## Milvus Setup
Use this step if you need to create a Milvus Vector Store and ingest data.
We will first follow the standard Milvus setup instructions [here](https://milvus.io/docs/install_standalone-docker.md).
1. Download the Docker Compose YAML file.
```shell
wget https://github.com/milvus-io/milvus/releases/download/v2.3.3/milvus-standalone-docker-compose.yml -O docker-compose.yml
```
2. Start the Milvus Vector Store container
```shell
sudo docker compose up -d
```
3. Install the PyMilvus package to interact with the Milvus container.
```shell
pip install pymilvus
```
4. Let's now ingest some data! We can do that by moving into this directory and running the code in `ingest.py`, eg:
```shell
python ingest.py
```
Note that you can (and should!) change this to ingest data of your choice.

View File

@ -0,0 +1,39 @@
import getpass
import os
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.milvus import Milvus
from langchain_nvidia_aiplay import NVIDIAEmbeddings
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
else:
nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
os.environ["NVIDIA_API_KEY"] = nvapi_key
# Note: if you change this, you should also change it in `nvidia_rag_canonical/chain.py`
EMBEDDING_MODEL = "nvolveqa_40k"
HOST = "127.0.0.1"
PORT = "19530"
COLLECTION_NAME = "test"
embeddings = NVIDIAEmbeddings(model=EMBEDDING_MODEL)
if __name__ == "__main__":
# Load docs
loader = PyPDFLoader("https://www.ssa.gov/news/press/factsheets/basicfact-alt.pdf")
data = loader.load()
# Split docs
text_splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=100)
docs = text_splitter.split_documents(data)
# Insert the documents in Milvus Vector Store
vector_db = Milvus.from_documents(
docs,
embeddings,
collection_name=COLLECTION_NAME,
connection_args={"host": HOST, "port": PORT},
)

View File

@ -0,0 +1,3 @@
from nvidia_rag_canonical.chain import chain, ingest
__all__ = ["chain", "ingest"]

View File

@ -0,0 +1,91 @@
import getpass
import os
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Milvus
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.runnables import (
RunnableLambda,
RunnableParallel,
RunnablePassthrough,
)
from langchain_nvidia_aiplay import ChatNVIDIA, NVIDIAEmbeddings
EMBEDDING_MODEL = "nvolveqa_40k"
CHAT_MODEL = "llama2_13b"
HOST = "127.0.0.1"
PORT = "19530"
COLLECTION_NAME = "test"
INGESTION_CHUNK_SIZE = 500
INGESTION_CHUNK_OVERLAP = 0
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
else:
nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
os.environ["NVIDIA_API_KEY"] = nvapi_key
# Read from Milvus Vector Store
embeddings = NVIDIAEmbeddings(model=EMBEDDING_MODEL)
vectorstore = Milvus(
connection_args={"host": HOST, "port": PORT},
collection_name=COLLECTION_NAME,
embedding_function=embeddings,
)
retriever = vectorstore.as_retriever()
# RAG prompt
template = """<s>[INST] <<SYS>>
Use the following context to answer the user's question. If you don't know the answer,
just say that you don't know, don't try to make up an answer.
<</SYS>>
<s>[INST] Context: {context} Question: {question} Only return the helpful
answer below and nothing else. Helpful answer:[/INST]"
"""
prompt = ChatPromptTemplate.from_template(template)
# RAG
model = ChatNVIDIA(model=CHAT_MODEL)
chain = (
RunnableParallel({"context": retriever, "question": RunnablePassthrough()})
| prompt
| model
| StrOutputParser()
)
# Add typing for input
class Question(BaseModel):
__root__: str
chain = chain.with_types(input_type=Question)
def _ingest(url: str) -> dict:
"""Load and ingest the PDF file from the URL"""
loader = PyPDFLoader(url)
data = loader.load()
# Split docs
text_splitter = CharacterTextSplitter(
chunk_size=INGESTION_CHUNK_SIZE, chunk_overlap=INGESTION_CHUNK_OVERLAP
)
docs = text_splitter.split_documents(data)
# Insert the documents in Milvus Vector Store
_ = Milvus.from_documents(
documents=docs,
embedding=embeddings,
collection_name=COLLECTION_NAME,
connection_args={"host": HOST, "port": PORT},
)
return {}
ingest = RunnableLambda(_ingest)

2258
templates/nvidia-rag-canonical/poetry.lock generated Normal file

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,29 @@
[tool.poetry]
name = "nvidia-rag-canonical"
version = "0.1.0"
description = "RAG with NVIDIA"
authors = ["Sagar Bogadi Manjunath <sbogadimanju@nvidia.com>"]
readme = "README.md"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = "^0.1"
pymilvus = ">=2.3.0"
langchain-nvidia-aiplay = "^0.0.2"
[tool.poetry.group.dev.dependencies]
langchain-cli = ">=0.0.20"
[tool.langserve]
export_module = "nvidia_rag_canonical"
export_attr = "chain"
[tool.templates-hub]
use-case = "rag"
author = "LangChain"
integrations = ["Milvus", "NVIDIA"]
tags = ["vectordbs"]
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"