docs: Integrations NVIDIA llm documentation (#26934)
**Description:** Add a notebook for the NVIDIA prompt completion LLM class.

cc: @sumitkbh @mattf @dglogo

Co-authored-by: Erick Friis <erick@langchain.dev>
parent ab4dab9a0c
commit 47142eb6ee
309 docs/docs/integrations/llms/nvidia_ai_endpoints.ipynb (new file)
@@ -0,0 +1,309 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NVIDIA\n",
"\n",
"This will help you get started with NVIDIA [models](/docs/concepts/#llms). For detailed documentation of all `NVIDIA` features and configurations head to the [API reference](https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.llms.NVIDIA.html).\n",
"\n",
"## Overview\n",
"The `langchain-nvidia-ai-endpoints` package contains LangChain integrations for building applications with models on \n",
"the NVIDIA NIM inference microservice. These models are optimized by NVIDIA to deliver the best performance on NVIDIA \n",
"accelerated infrastructure and are deployed as NIMs: easy-to-use, prebuilt containers that deploy anywhere with a single \n",
"command on NVIDIA accelerated infrastructure.\n",
"\n",
"NVIDIA hosted deployments of NIMs are available to test on the [NVIDIA API catalog](https://build.nvidia.com/). After testing, \n",
"NIMs can be exported from NVIDIA’s API catalog using the NVIDIA AI Enterprise license and run on-premises or in the cloud, \n",
"giving enterprises ownership and full control of their IP and AI application.\n",
"\n",
"NIMs are packaged as container images on a per-model basis and are distributed as NGC container images through the NVIDIA NGC Catalog. \n",
"At their core, NIMs provide easy, consistent, and familiar APIs for running inference on an AI model.\n",
"\n",
"This example goes over how to use LangChain to interact with NVIDIA models supported via the `NVIDIA` class.\n",
"\n",
"For more information on accessing the LLM models through this API, check out the [NVIDIA](https://python.langchain.com/docs/integrations/llms/nvidia_ai_endpoints/) documentation.\n",
"\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [NVIDIA](https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.llms.NVIDIA.html) | [langchain_nvidia_ai_endpoints](https://python.langchain.com/api_reference/nvidia_ai_endpoints/index.html) | ✅ | beta | ❌ |  |  |\n",
"\n",
"### Model features\n",
"| JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |\n",
"\n",
"## Setup\n",
"\n",
"**To get started:**\n",
"\n",
"1. Create a free account with [NVIDIA](https://build.nvidia.com/), which hosts NVIDIA AI Foundation models.\n",
"\n",
"2. Click on your model of choice.\n",
"\n",
"3. Under `Input` select the `Python` tab, and click `Get API Key`. Then click `Generate Key`.\n",
"\n",
"4. Copy and save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints.\n",
"\n",
"### Credentials\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"NVIDIA_API_KEY\"):\n",
"    # Note: the API key should start with \"nvapi-\"\n",
"    os.environ[\"NVIDIA_API_KEY\"] = getpass.getpass(\"Enter your NVIDIA API key: \")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"The LangChain NVIDIA AI Endpoints integration lives in the `langchain_nvidia_ai_endpoints` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-nvidia-ai-endpoints"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"See [LLM](/docs/how_to#llms) for full functionality."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from langchain_nvidia_ai_endpoints import NVIDIA"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = NVIDIA().bind(max_tokens=256)\n",
"llm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Invocation"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"prompt = \"# Function that does quicksort written in Rust without comments:\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(llm.invoke(prompt))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Stream, Batch, and Async\n",
"\n",
"These models natively support streaming, and, as is the case with all LangChain LLMs, they expose a batch method to handle concurrent requests, as well as async methods for invoke, stream, and batch. Below are a few examples."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for chunk in llm.stream(prompt):\n",
"    print(chunk, end=\"\", flush=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm.batch([prompt])"
]
},
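{
"cell_type": "markdown",
"metadata": {},
"source": [
"`batch` also accepts a list of several prompts at once. A minimal sketch (the second prompt below is just an illustrative example, and `max_concurrency` is the standard LangChain runnable config option for capping parallel requests):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompts = [\n",
"    prompt,\n",
"    \"# Function that reverses a string written in Rust without comments:\",\n",
"]\n",
"# Cap the number of requests made concurrently to the endpoint\n",
"llm.batch(prompts, config={\"max_concurrency\": 2})"
]
},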
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"await llm.ainvoke(prompt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"async for chunk in llm.astream(prompt):\n",
"    print(chunk, end=\"\", flush=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"await llm.abatch([prompt])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"async for chunk in llm.astream_log(prompt):\n",
"    print(chunk)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = llm.invoke(\n",
"    \"X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.1) #Train a logistic regression model, predict the labels on the test set and compute the accuracy score\"\n",
")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Supported models\n",
"\n",
"Querying `available_models` will give you all of the models offered by your API credentials."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"NVIDIA.get_available_models()\n",
"# llm.get_available_models()"
]
},
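{
"cell_type": "markdown",
"metadata": {},
"source": [
"Any of the returned model ids can be passed to the constructor to work with that model instead of the default. A minimal sketch, assuming the illustrative id below is available to your credentials (substitute one returned by `get_available_models()`):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# \"bigcode/starcoder2-7b\" is an illustrative id; pick one from get_available_models()\n",
"code_llm = NVIDIA(model=\"bigcode/starcoder2-7b\").bind(max_tokens=256)\n",
"print(code_llm.invoke(prompt))"
]
},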
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Chaining\n",
"\n",
"We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate(\n",
"    [\n",
"        (\n",
"            \"system\",\n",
"            \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
"        ),\n",
"        (\"human\", \"{input}\"),\n",
"    ]\n",
")\n",
"\n",
"chain = prompt | llm\n",
"chain.invoke(\n",
"    {\n",
"        \"input_language\": \"English\",\n",
"        \"output_language\": \"German\",\n",
"        \"input\": \"I love programming.\",\n",
"    }\n",
")"
]
},
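{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because this is a completion-style LLM, the chain already returns a plain string; appending `langchain_core`'s `StrOutputParser` simply makes the output type explicit and keeps the chain interchangeable with chat-model chains. A minimal sketch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"\n",
"# Same chain as above, with an explicit string output parser appended\n",
"chain = prompt | llm | StrOutputParser()\n",
"print(\n",
"    chain.invoke(\n",
"        {\n",
"            \"input_language\": \"English\",\n",
"            \"output_language\": \"French\",\n",
"            \"input\": \"I love programming.\",\n",
"        }\n",
"    )\n",
")"
]
},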
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all `NVIDIA` features and configurations head to the API reference: https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.llms.NVIDIA.html"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain-nvidia-ai-endpoints-m0-Y4aGr-py3.10",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}