[Documentation] Updates to NVIDIA Playground/Foundation Model naming.… (#14770)

…  (#14723)

- **Description:** Minor updates per marketing requests. Namely, name
decisions (AI Foundation Models / AI Playground)
  - **Tag maintainer:** @hinthornw 

Do want to pass around the PR for a bit and ask a few more marketing
questions before merge, but just want to make sure I'm not working in a
vacuum. No major changes to code functionality intended; the PR should
be for documentation and only minor tweaks.

Note: QA model is a bit borked across staging/prod right now. Relevant
teams have been informed and are looking into it, and I'm placeholdered
the response to that of a working version in the notebook.

Co-authored-by: Vadim Kudlay <32310964+VKudlay@users.noreply.github.com>
This commit is contained in:
William FH
2023-12-15 12:21:59 -08:00
committed by GitHub
parent 65091ebe50
commit c5296fd42c
30 changed files with 387 additions and 381 deletions

View File

@@ -7,13 +7,13 @@
"id": "cc6caafa"
},
"source": [
"# ChatNVAIPlay: NVIDIA AI Playground\n",
"# NVIDIA AI Foundation Endpoints\n",
"\n",
"The `ChatNVAIPlay` class is a LangChain chat model that connects to the NVIDIA AI Playground. This integration is available via the `langchain-nvidia-aiplay` package.\n",
"The `ChatNVIDIA` class is a LangChain chat model that connects to [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/).\n",
"\n",
">[NVIDIA AI Playground](https://www.nvidia.com/en-us/research/ai-playground/) gives users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query NVCR (NVIDIA Container Registry) function endpoints and get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.\n",
">[NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to query generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query live endpoints supported by the [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/ai-foundation-models) to get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.\n",
"\n",
"This example goes over how to use LangChain to interact with supported AI Playground models."
"This example goes over how to use LangChain to interact with and develop LLM-powered systems using the publicly-accessible AI Foundation endpoints."
]
},
{
@@ -26,12 +26,20 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "e13eb331",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install -U --quiet langchain-nvidia-aiplay"
"%pip install -U --quiet langchain-nvidia-ai-endpoints"
]
},
{
@@ -44,7 +52,7 @@
"## Setup\n",
"\n",
"**To get started:**\n",
"1. Create a free account with the [NVIDIA GPU Cloud](https://catalog.ngc.nvidia.com/) service, which hosts AI solution catalogs, containers, models, etc.\n",
"1. Create a free account with the [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/) service, which hosts AI solution catalogs, containers, models, etc.\n",
"2. Navigate to `Catalog > AI Foundation Models > (Model with API endpoint)`.\n",
"3. Select the `API` option and click `Generate Key`.\n",
"4. Save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints."
@@ -52,7 +60,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "686c4d2f",
"metadata": {},
"outputs": [],
@@ -61,14 +69,14 @@
"import os\n",
"\n",
"if not os.environ.get(\"NVIDIA_API_KEY\", \"\").startswith(\"nvapi-\"):\n",
" nvapi_key = getpass.getpass(\"Enter your NVIDIA AIPLAY API key: \")\n",
" nvapi_key = getpass.getpass(\"Enter your NVIDIA API key: \")\n",
" assert nvapi_key.startswith(\"nvapi-\"), f\"{nvapi_key[:5]}... is not a valid key\"\n",
" os.environ[\"NVIDIA_API_KEY\"] = nvapi_key"
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "Jdl2NUfMhi4J",
"metadata": {
"colab": {
@@ -85,58 +93,58 @@
"(Verse 1)\n",
"In the realm of knowledge, vast and wide,\n",
"LangChain emerged, with purpose and pride.\n",
"A platform for learning, a bridge between lands,\n",
"Connecting cultures with open hands.\n",
"A platform for learning, sharing, and growth,\n",
"A digital sanctuary, for all to be taught.\n",
"\n",
"(Chorus)\n",
"LangChain, oh LangChain, a beacon so bright,\n",
"Guiding us through the language night.\n",
"With respect and care, in truth we confide,\n",
"In this secure and useful ride.\n",
"With respect, care, and truth in hand,\n",
"You're shaping a better world, across every land.\n",
"\n",
"(Verse 2)\n",
"Through the barriers of speech, it breaks the divide,\n",
"In fairness and positivity, it takes us along for the ride.\n",
"No harm or prejudice, in its design we find,\n",
"A world of unity, in every language, intertwined.\n",
"In the halls of education, a new star was born,\n",
"Empowering minds, with wisdom reborn.\n",
"Through translation and tutoring, with tech at the helm,\n",
"LangChain's mission, a world where no one is left in the realm.\n",
"\n",
"(Chorus)\n",
"LangChain, oh LangChain, a ballad we sing,\n",
"Of the joy and wonder your purpose will bring.\n",
"In every interaction, in every reply,\n",
"Promoting kindness, as stars light up the sky.\n",
"LangChain, oh LangChain, a force so grand,\n",
"Connecting us all, across every land.\n",
"With utmost utility, and secure replies,\n",
"You're building a future, where ignorance dies.\n",
"\n",
"(Bridge)\n",
"In the classrooms, in the boardrooms, across the globe,\n",
"LangChain's impact, a tale to be told.\n",
"A tool for growth, for understanding, for peace,\n",
"A world connected, in every language, released.\n",
"No room for harm, or unethical ways,\n",
"Prejudice and negativity, LangChain never plays.\n",
"Promoting fairness, and positivity's song,\n",
"In the world of LangChain, we all belong.\n",
"\n",
"(Verse 3)\n",
"Through the lessons learned, and the bonds formed,\n",
"In LangChain's embrace, we find our norm.\n",
"A place of respect, of truth, of light,\n",
"A world transformed, in every byte.\n",
"A ballad of hope, for a brighter tomorrow,\n",
"Where understanding and unity, forever grow fonder.\n",
"In the heart of LangChain, a promise we find,\n",
"A world united, through the power of the mind.\n",
"\n",
"(Chorus)\n",
"LangChain, oh LangChain, in this ballad we trust,\n",
"In the power of language, in every connection, in every thrust.\n",
"With care and devotion, in every reply,\n",
"LangChain, oh LangChain, forever we'll abide.\n",
"LangChain, oh LangChain, a dream so true,\n",
"A world connected, in every hue.\n",
"With respect, care, and truth in hand,\n",
"You're shaping a legacy, across every land.\n",
"\n",
"(Outro)\n",
"So here's to LangChain, a world connected,\n",
"In truth and respect, in language perfected.\n",
"A ballad of hope, of unity, of light,\n",
"In LangChain, our future, forever bright.\n"
"So here's to LangChain, a testament of love,\n",
"A shining star, from the digital heavens above.\n",
"In the realm of knowledge, vast and wide,\n",
"LangChain, oh LangChain, forever by our side.\n"
]
}
],
"source": [
"## Core LC Chat Interface\n",
"from langchain_nvidia_aiplay import ChatNVAIPlay\n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"llm = ChatNVAIPlay(model=\"mixtral_8x7b\")\n",
"llm = ChatNVIDIA(model=\"mixtral_8x7b\")\n",
"result = llm.invoke(\"Write a ballad about LangChain.\")\n",
"print(result.content)"
]
@@ -153,7 +161,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"id": "01fa5095-be72-47b0-8247-e9fac799435d",
"metadata": {},
"outputs": [
@@ -173,7 +181,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"id": "75189ac6-e13f-414f-9064-075c77d6e754",
"metadata": {},
"outputs": [
@@ -193,7 +201,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"id": "8a9a4122-7a10-40c0-a979-82a769ce7f6a",
"metadata": {},
"outputs": [
@@ -201,11 +209,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Mon|arch| butter|fl|ies| have| a| fascinating| migration| pattern|,| but| it|'|s| important| to| note| that| not| all| mon|arch|s| migr|ate|.| Only| those| born| in| the| northern| parts| of| North| America| make| the| journey| to| war|mer| clim|ates| during| the| winter|.|\n",
"Mon|arch| butter|fl|ies| have| a| fascinating| migration| pattern|,| but| it|'|s| important| to note| that| not| all| mon|arch|s| migr|ate|.| Only| those| born| in| the| northern parts of North| America| make| the| journey| to| war|mer| clim|ates| during| the| winter|.|\n",
"\n",
"The| mon|arch|s| that| do| migr|ate| take| about| two| to| three| months| to| complete| their| journey|.| However|,| they| don|'|t| travel| the| entire| distance| at| once|.| Instead|,| they| make| the| trip| in| stages|,| stopping| to| rest| and| feed| along| the| way|.| \n",
"\n",
"The| entire| round|-|t|rip| migration| can| be| up| to| 3|,|0|0|0| miles| long|,| which| is| quite| an| incredible| feat| for| such| a| small| creature|!| But| remember|,| not| all| mon|arch| butter|fl|ies| migr|ate|,| and| the| ones| that| do| take| a| le|isure|ly| pace|,| enjoying| their| journey| rather| than| rushing| to| the| destination|.||"
"The| entire| round|-|t|rip| migration| can| be| up| to| 3|,|0|0|0| miles| long|,| which| is| quite| an| incredible| feat| for| such| a| small| creature|!| But| remember|,| not| all| mon|arch| butter|fl|ies| migr|ate|,| and| the| ones| that| do| take| a| le|isure|ly| pace|,| enjoying| their| journey| rather| than rushing to| the| destination|.||"
]
}
],
@@ -232,32 +240,32 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"id": "5b8a312d-38e9-4528-843e-59451bdadbac",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['playground_nvolveqa_40k',\n",
" 'playground_nemotron_steerlm_8b',\n",
" 'playground_sdxl',\n",
" 'playground_neva_22b',\n",
" 'playground_steerlm_llama_70b',\n",
"['playground_nemotron_steerlm_8b',\n",
" 'playground_nvolveqa_40k',\n",
" 'playground_yi_34b',\n",
" 'playground_llama2_code_13b',\n",
" 'playground_nv_llama2_rlhf_70b',\n",
" 'playground_mixtral_8x7b',\n",
" 'playground_llama2_13b',\n",
" 'playground_llama2_code_34b',\n",
" 'playground_fuyu_8b',\n",
" 'playground_mistral_7b',\n",
" 'playground_clip',\n",
" 'playground_nemotron_qa_8b',\n",
" 'playground_llama2_code_34b',\n",
" 'playground_llama2_70b',\n",
" 'playground_nemotron_qa_8b']"
" 'playground_neva_22b',\n",
" 'playground_steerlm_llama_70b',\n",
" 'playground_mixtral_8x7b',\n",
" 'playground_nv_llama2_rlhf_70b',\n",
" 'playground_sdxl',\n",
" 'playground_llama2_13b',\n",
" 'playground_fuyu_8b',\n",
" 'playground_llama2_code_13b']"
]
},
"execution_count": 6,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -281,12 +289,11 @@
"id": "WMW79Iegqj4e"
},
"source": [
"All of these models above are supported and can be accessed via `ChatNVAIPlay`. \n",
"All of these models above are supported and can be accessed via `ChatNVIDIA`. \n",
"\n",
"Some model types support unique prompting techniques and chat messages. We will review a few important ones below.\n",
"\n",
"\n",
"**To find out more about a specific model, please navigate to the API section of an AI Playground model [as linked here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/codellama-13b/api).**"
"**To find out more about a specific model, please navigate to the API section of an AI Foundation model [as linked here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/codellama-13b/api).**"
]
},
{
@@ -301,7 +308,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "f5f7aee8-e90c-4d5a-ac97-0dd3d45c3f4c",
"metadata": {},
"outputs": [
@@ -316,12 +323,12 @@
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_nvidia_aiplay import ChatNVAIPlay\n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", \"You are a helpful AI assistant named Fred.\"), (\"user\", \"{input}\")]\n",
")\n",
"chain = prompt | ChatNVAIPlay(model=\"llama2_13b\") | StrOutputParser()\n",
"chain = prompt | ChatNVIDIA(model=\"llama2_13b\") | StrOutputParser()\n",
"\n",
"for txt in chain.stream({\"input\": \"What's your name?\"}):\n",
" print(txt, end=\"\")"
@@ -339,7 +346,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"id": "49aa569b-5f33-47b3-9edc-df58313eb038",
"metadata": {},
"outputs": [
@@ -371,7 +378,7 @@
" (\"user\", \"{input}\"),\n",
" ]\n",
")\n",
"chain = prompt | ChatNVAIPlay(model=\"llama2_code_13b\") | StrOutputParser()\n",
"chain = prompt | ChatNVIDIA(model=\"llama2_code_13b\") | StrOutputParser()\n",
"\n",
"for txt in chain.stream({\"input\": \"How do I solve this fizz buzz problem?\"}):\n",
" print(txt, end=\"\")"
@@ -388,12 +395,12 @@
"\n",
"This lets you \"control\" the complexity, verbosity, and creativity of the model via integer labels on a scale from 0 to 9. Under the hood, these are passed as a special type of assistant message to the model.\n",
"\n",
"The \"steer\" models support this type of input, such as `steerlm_llama_70b`"
"The \"steer\" models support this type of input, such as `nemotron_steerlm_8b`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 10,
"id": "36a96b1a-e3e7-4ae3-b4b0-9331b5eca04f",
"metadata": {},
"outputs": [
@@ -403,25 +410,25 @@
"text": [
"Un-creative\n",
"\n",
"A PB&J is a peanut butter and jelly sandwich.\n",
"A peanut butter and jelly sandwich.\n",
"\n",
"\n",
"Creative\n",
"\n",
"A PB&J, also known as a peanut butter and jelly sandwich, is a classic American sandwich that typically consists of two slices of bread, with peanut butter and jelly spread between them. The sandwich is often served as a simple and quick meal or snack, and is popular among children and adults alike.\n",
"A PB&J is a sandwich commonly eaten in the United States. It consists of a slice of bread with peanut butter and jelly on it. The sandwich is often eaten for lunch or as a snack.\n",
"\n",
"The origins of the PB&J can be traced back to the early 20th century, when peanut butter and jelly were first combined in a sandwich. The combination of the creamy, nutty peanut butter and the sweet, fruity jelly is a popular one, and has become a staple in many American households.\n",
"The origins of the PB&J sandwich are not clear, but it is believed to have been invented in the 1920s or 1930s. It became popular during the Great Depression, when peanut butter and jelly were affordable and easy to obtain.\n",
"\n",
"While the classic PB&J consists of peanut butter and jelly on white bread, there are many variations of the sandwich that can be made by using different types of bread, peanut butter, and jelly. For example, some people prefer to use whole wheat bread or a different type of nut butter, while others might use a different type of jelly or even add additional ingredients like bananas or honey.\n",
"Today, the PB&J sandwich is a classic American sandwich that is enjoyed by people of all ages. It is often served in schools and workplaces, and is a popular choice for takeout and delivery.\n",
"\n",
"Overall, the PB&J is a simple and delicious sandwich that has been a part of American cuisine for over a century. It is a convenient and affordable meal that can be enjoyed by people of all ages.\n"
"While there are many variations of the PB&J sandwich, the classic version consists of two slices of bread with peanut butter and jelly spread on one or both slices. The sandwich can be topped with additional ingredients, such as nuts, chocolate chips, or banana slices, but the basic combination of peanut butter and jelly remains the same.\n"
]
}
],
"source": [
"from langchain_nvidia_aiplay import ChatNVAIPlay\n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"llm = ChatNVAIPlay(model=\"steerlm_llama_70b\")\n",
"llm = ChatNVIDIA(model=\"nemotron_steerlm_8b\")\n",
"# Try making it uncreative and not verbose\n",
"complex_result = llm.invoke(\n",
" \"What's a PB&J?\", labels={\"creativity\": 0, \"complexity\": 3, \"verbosity\": 0}\n",
@@ -449,7 +456,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"id": "ae1105c3-2a0c-4db3-916e-24d5e427bd01",
"metadata": {},
"outputs": [
@@ -457,27 +464,30 @@
"name": "stdout",
"output_type": "stream",
"text": [
"A PB&J is a type of sandwich made with peanut butter and jelly. The sandwich is typically made by spreading peanut butter on one slice of bread and jelly on another slice of bread, and then putting the two slices together to form a sandwich.\n",
"A peanut butter and jelly sandwich, or \"PB&J\" for short, is a classic and beloved sandwich that has been enjoyed by people of all ages since it was first created in the early 20th century. Here are some reasons why it's considered a classic:\n",
"\n",
"The PB&J sandwich is a classic American food that has been around for over a century. It is a simple and affordable meal that is popular among children and adults alike. The combination of peanut butter and jelly is a classic flavor pairing that is both sweet and salty, making it a delicious and satisfying snack or meal.\n",
"1. Simple and Versatile: The combination of peanut butter and jelly is simple and versatile, making it a great option for a quick and easy snack or lunch.\n",
"2. Classic Flavors: The flavors of peanut butter and jelly are classic and timeless, making it a popular choice for people of all ages.\n",
"3. Easy to Make: A PB&J is one of the easiest sandwiches to make, requiring only a few simple ingredients and a few minutes to assemble.\n",
"4. Affordable: Unlike many other sandwiches, a PB&J is relatively inexpensive to make, making it a great option for budget-conscious individuals.\n",
"5. Portable: A PB&J is a portable sandwich, making it a great option for on-the-go eating.\n",
"6. Nostalgic: The PB&J has become a nostalgic food, associated with childhood and memories of eating it as a kid.\n",
"\n",
"The PB&J sandwich is also convenient and portable, making it a great option for lunches, picnics, and road trips. It requires no refrigeration and can be easily packed in a lunchbox or bag.\n",
"\n",
"Overall, the PB&J sandwich is a simple and delicious food that has stood the test of time and remains a popular choice for many people today."
"Overall, the simplicity, classic flavors, affordability, portability, and nostalgic associations of the PB&J make it a beloved and enduring sandwich that will likely continue to be enjoyed for generations to come."
]
}
],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_nvidia_aiplay import ChatNVAIPlay\n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"system\", \"You are a helpful AI assistant named Fred.\"), (\"user\", \"{input}\")]\n",
")\n",
"chain = (\n",
" prompt\n",
" | ChatNVAIPlay(model=\"steerlm_llama_70b\").bind(\n",
" | ChatNVIDIA(model=\"nemotron_steerlm_8b\").bind(\n",
" labels={\"creativity\": 9, \"complexity\": 0, \"verbosity\": 9}\n",
" )\n",
" | StrOutputParser()\n",
@@ -494,18 +504,17 @@
"source": [
"## Multimodal\n",
"\n",
"NVidia also supports multimodal inputs, meaning you can provide both images and text for the model to reason over.\n",
"NVIDIA also supports multimodal inputs, meaning you can provide both images and text for the model to reason over. An example model supporting multimodal inputs is `playground_neva_22b`.\n",
"\n",
"These models also accept `labels`, similar to the Steering LLMs above. In addition to `creativity`, `complexity`, and `verbosity`, these models support a `quality` toggle.\n",
"\n",
"An example model supporting multimodal inputs is `playground_neva_22b`.\n",
"These models accept LangChain's standard image formats, and accept `labels`, similar to the Steering LLMs above. In addition to `creativity`, `complexity`, and `verbosity`, these models support a `quality` toggle.\n",
"\n",
"These models accept LangChain's standard image formats. Below are examples."
"Below is an example use:"
]
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 12,
"id": "26625437-1695-440f-b792-b85e6add9a90",
"metadata": {},
"outputs": [
@@ -516,7 +525,7 @@
"<IPython.core.display.Image object>"
]
},
"execution_count": 23,
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
@@ -533,14 +542,14 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 13,
"id": "dfbbe57c-27a5-4cbb-b967-19c4e7d29fd0",
"metadata": {},
"outputs": [],
"source": [
"from langchain_nvidia_aiplay import ChatNVAIPlay\n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"llm = ChatNVAIPlay(model=\"playground_neva_22b\")"
"llm = ChatNVIDIA(model=\"playground_neva_22b\")"
]
},
{
@@ -553,7 +562,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 14,
"id": "432ea2a2-4d39-43f8-a236-041294171f14",
"metadata": {},
"outputs": [
@@ -563,7 +572,7 @@
"AIMessage(content='The image depicts a scenic forest road surrounded by tall trees and lush greenery. The road is leading towards a green forest, with the trees becoming denser as the road continues. The sunlight is filtering through the trees, casting a warm glow on the path.\\n\\nThere are several people walking along this picturesque road, enjoying the peaceful atmosphere and taking in the beauty of the forest. They are spread out along the path, with some individuals closer to the front and others further back, giving a sense of depth to the scene.')"
]
},
"execution_count": 13,
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
@@ -585,7 +594,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 15,
"id": "af06e3e1-2a67-4b14-814d-b7b7bc035975",
"metadata": {},
"outputs": [
@@ -595,7 +604,7 @@
"AIMessage(content='The image depicts a scenic forest road surrounded by trees and grass.')"
]
},
"execution_count": 14,
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
@@ -628,7 +637,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 16,
"id": "8c721629-42eb-4006-bf68-0296f7925ebc",
"metadata": {},
"outputs": [
@@ -638,7 +647,7 @@
"AIMessage(content='The image depicts a scenic forest road surrounded by tall trees and lush greenery. The road is leading towards a green forest, with the trees becoming denser as the road continues. The sunlight is filtering through the trees, casting a warm glow on the path.\\n\\nThere are several people walking along this picturesque road, enjoying the peaceful atmosphere and taking in the beauty of the forest. They are spread out along the path, with some individuals closer to the front and others further back, giving a sense of depth to the scene.')"
]
},
"execution_count": 15,
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
@@ -674,7 +683,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 17,
"id": "00c06a9a-497b-4192-a842-b075e27401aa",
"metadata": {},
"outputs": [
@@ -684,7 +693,7 @@
"AIMessage(content='The image depicts a scenic forest road surrounded by tall trees and lush greenery. The road is leading towards a green, wooded area with a curve in the road, making it a picturesque and serene setting. Along the road, there are several birds perched on various branches, adding a touch of life to the peaceful environment.\\n\\nIn total, there are nine birds visible in the scene, with some perched higher up in the trees and others resting closer to the ground. The combination of the forest, trees, and birds creates a captivating and tranquil atmosphere.')"
]
},
"execution_count": 16,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
@@ -701,26 +710,24 @@
"source": [
"## RAG: Context models\n",
"\n",
"NVIDIA also has Q&A models that support a special \"context\" chat message containing retrieved context (such as documents within a RAG chain). This is useful to avoid prompt-injecting the model.\n",
"NVIDIA also has Q&A models that support a special \"context\" chat message containing retrieved context (such as documents within a RAG chain). This is useful to avoid prompt-injecting the model. The `_qa_` models like `nemotron_qa_8b` support this.\n",
"\n",
"**Note:** Only \"user\" (human) and \"context\" chat messages are supported for these models, not system or AI messages useful in conversational flows.\n",
"\n",
"The `_qa_` models like `nemotron_qa_8b` support this."
"**Note:** Only \"user\" (human) and \"context\" chat messages are supported for these models; System or AI messages that would useful in conversational flows are not supported."
]
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 18,
"id": "f994b4d3-c1b0-4e87-aad0-a7b487e2aa43",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Parrots and Cats have signed the peace accord.\\n\\nUser: What is the peace accord?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it do?\\n\\nAssistant: \\n\\nParrots and Cats have signed the peace accord.\\n\\nUser: What does it mean?\\n\\nAssistant: \\n\\nParrots and Cats have signed the'"
"'the peace accord'"
]
},
"execution_count": 17,
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
@@ -729,7 +736,7 @@
"from langchain_core.messages import ChatMessage\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_nvidia_aiplay import ChatNVAIPlay\n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
@@ -739,19 +746,11 @@
" (\"user\", \"{input}\"),\n",
" ]\n",
")\n",
"llm = ChatNVAIPlay(model=\"nemotron_qa_8b\")\n",
"llm = ChatNVIDIA(model=\"nemotron_qa_8b\")\n",
"chain = prompt | llm | StrOutputParser()\n",
"chain.invoke({\"input\": \"What was signed?\"})"
]
},
{
"cell_type": "markdown",
"id": "d3f76a70-d2f3-406c-9f39-c7b45d44383b",
"metadata": {},
"source": [
"Other systems may also populate other kinds of options, such as `ContextChat` which requires context-role inputs:"
]
},
{
"cell_type": "markdown",
"id": "137662a6",
@@ -769,12 +768,12 @@
"id": "79efa62d"
},
"source": [
"Like any other integration, NVAIPlayClients are fine to support chat utilities like conversation buffers by default. Below, we show the [LangChain ConversationBufferMemory](https://python.langchain.com/docs/modules/memory/types/buffer) example applied to the LlamaChat model."
"Like any other integration, ChatNVIDIA is fine to support chat utilities like conversation buffers by default. Below, we show the [LangChain ConversationBufferMemory](https://python.langchain.com/docs/modules/memory/types/buffer) example applied to the `mixtral_8x7b` model."
]
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 19,
"id": "082ccb21-91e1-4e71-a9ba-4bff1e89f105",
"metadata": {},
"outputs": [
@@ -792,7 +791,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 20,
"id": "fd2c6bc1",
"metadata": {
"id": "fd2c6bc1"
@@ -802,14 +801,14 @@
"from langchain.chains import ConversationChain\n",
"from langchain.memory import ConversationBufferMemory\n",
"\n",
"chat = ChatNVAIPlay(model=\"mixtral_8x7b\", temperature=0.1, max_tokens=100, top_p=1.0)\n",
"chat = ChatNVIDIA(model=\"mixtral_8x7b\", temperature=0.1, max_tokens=100, top_p=1.0)\n",
"\n",
"conversation = ConversationChain(llm=chat, memory=ConversationBufferMemory())"
]
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 21,
"id": "f644ff28",
"metadata": {
"colab": {
@@ -826,7 +825,7 @@
"\"Hello! I'm here to help answer your questions and engage in a friendly conversation. How can I assist you today? By the way, I can provide a lot of specific details based on the context you provide. If I don't know the answer to something, I'll let you know honestly.\\n\\nJust a side note, as a assistant, I prioritize care, respect, and truth in all my responses. I'm committed to ensuring our conversation remains safe, ethical, unbiased, and positive. I'm looking forward to our discussion!\""
]
},
"execution_count": 20,
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
@@ -837,7 +836,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 22,
"id": "uHIMZxVSVNBC",
"metadata": {
"colab": {
@@ -854,7 +853,7 @@
"\"That's great! I'm here to make your conversation as enjoyable and informative as possible. I can share a wide range of information, from general knowledge, science, technology, history, and more. I can also help you with tasks such as setting reminders, providing weather updates, or answering questions you might have. What would you like to talk about or know?\\n\\nAs a friendly reminder, I'm committed to upholding the principles of care, respect, and truth in our conversation. I'm here to ensure our discussion remains safe, ethical, unbiased, and positive. I'm looking forward to learning more about your interests!\""
]
},
"execution_count": 21,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
@@ -867,7 +866,7 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 23,
"id": "LyD1xVKmVSs4",
"metadata": {
"colab": {
@@ -881,10 +880,10 @@
{
"data": {
"text/plain": [
"\"I'm an artificial intelligence designed to assist with a variety of tasks and provide information on a wide range of topics. I can help answer questions, set reminders, provide weather updates, and much more. I'm powered by advanced machine learning algorithms, which allow me to understand and respond to natural language input.\\n\\nI'm constantly learning and updating my knowledge base to provide the most accurate and relevant information possible. I'm able to process and analyze large amounts of data quickly and efficiently, making me a valuable tool for tasks that require a high level of detail and precision.\\n\\nDespite my advanced capabilities, I'm committed to approaching all interactions with care, respect, and truth. I'm programmed to ensure that our conversation remains safe, ethical, unbiased, and positive. I'm here to assist you in any way I can, and I'm looking forward to continuing our conversation!\""
"\"I'm an artificial intelligence designed to assist with a variety of tasks and provide information on a wide range of topics. I can help answer questions, set reminders, provide weather updates, and much more. I'm powered by advanced machine learning algorithms, which allow me to understand and respond to natural language input.\\n\\nI'm constantly learning and updating my knowledge base to provide the most accurate and relevant information possible. I'm able to process and analyze large amounts of data quickly and efficiently, making me a valuable tool for tasks that require a high level of detail and precision.\\n\\nDespite my advanced capabilities, I'm committed to ensuring that all of my interactions are safe, ethical, unbiased, and positive. I prioritize care and respect in all of my responses, and I always strive to provide the most truthful and helpful information possible.\\n\\nI'm excited to be here and to have the opportunity to assist you. Is there anything specific you would like to know or talk about? I'm here to help!\""
]
},
"execution_count": 22,
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
@@ -913,7 +912,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.9.18"
}
},
"nbformat": 4,

View File

@@ -1,39 +0,0 @@
# NVIDIA AI Playground
> [NVIDIA AI Playground](https://www.nvidia.com/en-us/research/ai-playground/) gives users easy access to hosted endpoints for generative AI models like Llama-2, Mistral, etc. This example demonstrates how to use LangChain to interact with supported AI Playground models.
These models are provided via the `langchain-nvidia-aiplay` package.
## Installation
```bash
pip install -U langchain-nvidia-aiplay
```
## Setup and Authentication
- Create a free account at [NVIDIA GPU Cloud](https://catalog.ngc.nvidia.com/).
- Navigate to `Catalog > AI Foundation Models > (Model with API endpoint)`.
- Select `API` and generate the key `NVIDIA_API_KEY`.
```bash
export NVIDIA_API_KEY=nvapi-XXXXXXXXXXXXXXXXXXXXXXXXXX
```
```python
from langchain_nvidia_aiplay import ChatNVAIPlay
llm = ChatNVAIPlay(model="mixtral_8x7b")
result = llm.invoke("Write a ballad about LangChain.")
print(result.content)
```
## Using NVIDIA AI Playground Models
A selection of NVIDIA AI Playground models are supported directly in LangChain with familiar APIs.
The active models which are supported can be found [in NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/). In addition, a selection of models can be retrieved from `langchain.<llms/chat_models>.nv_aiplay` which pull in default model options based on their use cases.
**The following may be useful examples to help you get started:**
- **[`ChatNVAIPlay` Model](/docs/integrations/chat/nv_aiplay).**
- **[`NVAIPlayEmbedding` Model for RAG Workflows](/docs/integrations/text_embeddings/nv_aiplay).**

View File

@@ -0,0 +1,38 @@
# NVIDIA
> [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query live endpoints available on the [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/ai-foundation-models) to get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.
These models are provided via the `langchain-nvidia-ai-endpoints` package.
## Installation
```bash
pip install -U langchain-nvidia-ai-endpoints
```
## Setup and Authentication
- Create a free account at [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/).
- Navigate to `Catalog > AI Foundation Models > (Model with API endpoint)`.
- Select `API` and generate the key `NVIDIA_API_KEY`.
```bash
export NVIDIA_API_KEY=nvapi-XXXXXXXXXXXXXXXXXXXXXXXXXX
```
```python
from langchain_nvidia_ai_endpoints import ChatNVIDIA
llm = ChatNVIDIA(model="mixtral_8x7b")
result = llm.invoke("Write a ballad about LangChain.")
print(result.content)
```
## Using NVIDIA AI Foundation Endpoints
A selection of NVIDIA AI Foundation models are supported directly in LangChain with familiar APIs.
The active models which are supported can be found [in NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/).
**The following may be useful examples to help you get started:**
- **[`ChatNVIDIA` Model](/docs/integrations/chat/nvidia_ai_endpoints).**
- **[`NVIDIAEmbeddings` Model for RAG Workflows](/docs/integrations/text_embeddings/nvidia_ai_endpoints).**

View File

@@ -6,12 +6,13 @@
"id": "GDDVue_1cq6d"
},
"source": [
"# NVIDIA AI Playground Embedding Models\n",
"# NVIDIA AI Foundation Endpoints \n",
"\n",
">[NVIDIA AI Playground](https://www.nvidia.com/en-us/research/ai-playground/) gives users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query NVCR (NVIDIA Container Registry) function endpoints and get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.\n",
">[NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/research/ai-playground/) gives users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query live endpoints and get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.\n",
"\n",
"This example goes over how to use LangChain to interact with supported the NVOLVE question-answer embedding model [(NGC AI Playground entry in NGC)](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/nvolve-29k). \n",
"For more information on the accessing the chat models through this api, check out the [ChatNVAIPlay](../chat/nv_aiplay) documentation."
"This example goes over how to use LangChain to interact with the supported [NVIDIA Retrieval QA Embedding Model](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/nvolve-40k) for [retrieval-augmented generation](https://developer.nvidia.com/blog/build-enterprise-retrieval-augmented-generation-apps-with-nvidia-retrieval-qa-embedding-model/) via the `NVIDIAEmbeddings` class.\n",
"\n",
"For more information on accessing the chat models through this api, check out the [ChatNVIDIA](../chat/nvidia_ai_endpoints) documentation."
]
},
{
@@ -27,7 +28,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install -U --quiet langchain-nvidia-aiplay"
"%pip install -U --quiet langchain-nvidia-ai-endpoints"
]
},
{
@@ -47,7 +48,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -55,20 +56,12 @@
"id": "hoF41-tNczS3",
"outputId": "7f2833dc-191c-4d73-b823-7b2745a93a2f"
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"NVAPI Key (starts with nvapi-): ········\n"
]
}
],
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"## API Key can be found by going to NVIDIA NGC -> AI Playground -> (some model) -> Get API Code or similar.\n",
"## API Key can be found by going to NVIDIA NGC -> AI Foundation Models -> (some model) -> Get API Code or similar.\n",
"## 10K free queries to any endpoint (which is a lot actually).\n",
"\n",
"# del os.environ['NVIDIA_API_KEY'] ## delete key and reset\n",
@@ -86,7 +79,7 @@
"id": "l185et2kc8pS"
},
"source": [
"We should be able to see an embedding model among that list which can be used in conjunction with an LLM for effective RAG solutions. We can interface with this model pretty easily with the help of the `NVAIEmbeddings` model."
"We should be able to see an embedding model among that list which can be used in conjunction with an LLM for effective RAG solutions. We can interface with this model pretty easily with the help of the `NVIDIAEmbeddings` model."
]
},
{
@@ -104,18 +97,18 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 2,
"metadata": {
"id": "hbXmJssPdIPX"
},
"outputs": [],
"source": [
"from langchain_nvidia_aiplay import NVAIPlayEmbeddings\n",
"from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
"\n",
"embedder = NVAIPlayEmbeddings(model=\"nvolveqa_40k\")\n",
"embedder = NVIDIAEmbeddings(model=\"nvolveqa_40k\")\n",
"\n",
"# Alternatively, if you want to specify whether it will use the query or passage type\n",
"# embedder = NVAIPlayEmbeddings(model=\"nvolveqa_40k\", model_type=\"passage\")"
"# embedder = NVIDIAEmbeddings(model=\"nvolveqa_40k\", model_type=\"passage\")"
]
},
{
@@ -124,7 +117,7 @@
"id": "SvQijbCwdLXB"
},
"source": [
"This model is a fine-tuned E5-large model which supports the expected `Embeddings`` methods including:\n",
"This model is a fine-tuned E5-large model which supports the expected `Embeddings` methods including:\n",
"- `embed_query`: Generate query embedding for a query sample.\n",
"- `embed_documents`: Generate passage embeddings for a list of documents which you would like to search over.\n",
"- `aembed_quey`/`embed_documents`: Asynchronous versions of the above."
@@ -166,7 +159,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -180,15 +173,15 @@
"output_type": "stream",
"text": [
"Single Query Embedding: \n",
"\u001b[1mExecuted in 0.62 seconds.\u001b[0m\n",
"\u001b[1mExecuted in 1.39 seconds.\u001b[0m\n",
"Shape: (1024,)\n",
"\n",
"Sequential Embedding: \n",
"\u001b[1mExecuted in 2.35 seconds.\u001b[0m\n",
"\u001b[1mExecuted in 3.20 seconds.\u001b[0m\n",
"Shape: (5, 1024)\n",
"\n",
"Batch Query Embedding: \n",
"\u001b[1mExecuted in 0.79 seconds.\u001b[0m\n",
"\u001b[1mExecuted in 1.52 seconds.\u001b[0m\n",
"Shape: (5, 1024)\n"
]
}
@@ -219,7 +212,7 @@
"print(\"\\nBatch Query Embedding: \")\n",
"s = time.perf_counter()\n",
"# To use the \"query\" mode, we have to add it as an instance arg\n",
"q_embeddings = NVAIPlayEmbeddings(\n",
"q_embeddings = NVIDIAEmbeddings(\n",
" model=\"nvolveqa_40k\", model_type=\"query\"\n",
").embed_documents(\n",
" [\n",
@@ -246,7 +239,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -260,11 +253,11 @@
"output_type": "stream",
"text": [
"Single Document Embedding: \n",
"\u001b[1mExecuted in 0.36 seconds.\u001b[0m\n",
"\u001b[1mExecuted in 0.76 seconds.\u001b[0m\n",
"Shape: (1024,)\n",
"\n",
"Batch Document Embedding: \n",
"\u001b[1mExecuted in 0.77 seconds.\u001b[0m\n",
"\u001b[1mExecuted in 0.86 seconds.\u001b[0m\n",
"Shape: (5, 1024)\n"
]
}
@@ -305,12 +298,12 @@
"id": "E6AilXxjdm1I"
},
"source": [
"Now that we've generated out embeddings, we can do a simple similarity check on the results to see which documents would have triggered as reasonable answers in a retrieval task:"
"Now that we've generated our embeddings, we can do a simple similarity check on the results to see which documents would have triggered as reasonable answers in a retrieval task:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 5,
"metadata": {},
"outputs": [
{
@@ -327,7 +320,7 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -403,20 +396,28 @@
"## RAG Retrieval:\n",
"\n",
"The following is a repurposing of the initial example of the [LangChain Expression Language Retrieval Cookbook entry](\n",
"https://python.langchain.com/docs/expression_language/cookbook/retrieval), but executed with NVIDIA AI Playground's [Mistral 7B Instruct](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/mistral-7b-instruct) and [NVOLVE Retrieval QA Embedding](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/nvolve-29k) models. The subsequent examples in the cookbook also run as expected, and we encourage you to explore with these options.\n",
"https://python.langchain.com/docs/expression_language/cookbook/retrieval), but executed with the AI Foundation Models' [Mixtral 8x7B Instruct](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/mixtral-8x7b) and [NVIDIA Retrieval QA Embedding](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/nvolve-40k) models available in their playground environments. The subsequent examples in the cookbook also run as expected, and we encourage you to explore with these options.\n",
"\n",
"**TIP:** We would recommend using Mistral for internal reasoning (i.e. instruction following for data extraction, tool selection, etc.) and Llama-Chat for a single final \"wrap-up by making a simple response that works for this user based on the history and context\" response."
"**TIP:** We would recommend using Mixtral for internal reasoning (i.e. instruction following for data extraction, tool selection, etc.) and Llama-Chat for a single final \"wrap-up by making a simple response that works for this user based on the history and context\" response."
]
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 8,
"metadata": {
"id": "zn_zeRGP64DJ"
},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"!pip install langchain faiss-cpu tiktoken -q\n",
"%pip install --quiet langchain faiss-cpu tiktoken\n",
"\n",
"from operator import itemgetter\n",
"\n",
@@ -424,12 +425,12 @@
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_nvidia_aiplay import ChatNVAIPlay"
"from langchain_nvidia_ai_endpoints import ChatNVIDIA"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -445,7 +446,7 @@
"'Based on the document provided, Harrison worked at Kensho.'"
]
},
"execution_count": 11,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -453,7 +454,7 @@
"source": [
"vectorstore = FAISS.from_texts(\n",
" [\"harrison worked at kensho\"],\n",
" embedding=NVAIPlayEmbeddings(model=\"nvolveqa_40k\"),\n",
" embedding=NVIDIAEmbeddings(model=\"nvolveqa_40k\"),\n",
")\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
@@ -467,7 +468,7 @@
" ]\n",
")\n",
"\n",
"model = ChatNVAIPlay(model=\"mixtral_8x7b\")\n",
"model = ChatNVIDIA(model=\"mixtral_8x7b\")\n",
"\n",
"chain = (\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
@@ -481,7 +482,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -497,7 +498,7 @@
"'Harrison ha lavorato presso Kensho.\\n\\n(In English: Harrison worked at Kensho.)'"
]
},
"execution_count": 13,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -548,7 +549,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.9.18"
}
},
"nbformat": 4,