Mirror of https://github.com/hwchase17/langchain.git (synced 2025-09-16 15:04:13 +00:00)
[Documentation] Updates to NVIDIA Playground/Foundation Model naming.… (#14770)

… (#14723)

- **Description:** Minor updates per marketing requests. Namely, naming decisions (AI Foundation Models / AI Playground)
- **Tag maintainer:** @hinthornw

I do want to pass the PR around for a bit and ask a few more marketing questions before merge, but I just want to make sure I'm not working in a vacuum. No major changes to code functionality are intended; the PR should be documentation-only, with minor tweaks.

Note: The QA model is a bit borked across staging/prod right now. The relevant teams have been informed and are looking into it, and I've placeholdered the response with that of a working version in the notebook.

Co-authored-by: Vadim Kudlay <32310964+VKudlay@users.noreply.github.com>
libs/partners/{nvidia-aiplay → nvidia-ai-endpoints}/Makefile

@@ -12,7 +12,7 @@ test:
 tests:
 	poetry run pytest $(TEST_FILE)
 
-check_imports: $(shell find langchain_nvidia_aiplay -name '*.py')
+check_imports: $(shell find langchain_nvidia_ai_endpoints -name '*.py')
 	poetry run python ./scripts/check_imports.py $^
 
 integration_tests:
@@ -28,7 +28,7 @@ PYTHON_FILES=.
 MYPY_CACHE=.mypy_cache
 lint format: PYTHON_FILES=.
 lint_diff format_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$')
-lint_package: PYTHON_FILES=langchain_nvidia_aiplay
+lint_package: PYTHON_FILES=langchain_nvidia_ai_endpoints
 lint_tests: PYTHON_FILES=tests
 lint_tests: MYPY_CACHE=.mypy_cache_test
libs/partners/{nvidia-aiplay → nvidia-ai-endpoints}/README.md

@@ -1,16 +1,16 @@
-# langchain-nvidia-aiplay
+# langchain-nvidia-ai-endpoints
 
-The `langchain-nvidia-aiplay` package contains LangChain integrations for chat models and embeddings powered by the NVIDIA AI Playground.
+The `langchain-nvidia-ai-endpoints` package contains LangChain integrations for chat models and embeddings powered by the [NVIDIA AI Foundation Model](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) playground environment.
 
->[NVIDIA AI Playground](https://www.nvidia.com/en-us/research/ai-playground/) gives users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query NVCR (NVIDIA Container Registry) function endpoints and get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.
+> [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query live endpoints available on the [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/ai-foundation-models) to get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.
 
-Below is an example of how to use some common chat model functionality.
+Below is an example of how to use some common functionality surrounding text-generative and embedding models.
 
 ## Installation
 
 
 ```python
-%pip install -U --quiet langchain-nvidia-aiplay
+%pip install -U --quiet langchain-nvidia-ai-endpoints
 ```
 
 ## Setup
@@ -35,9 +35,9 @@ if not os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
 ```python
-from langchain_nvidia_aiplay import ChatNVAIPlay
+## Core LC Chat Interface
+from langchain_nvidia_ai_endpoints import ChatNVIDIA
 
-llm = ChatNVAIPlay(model="mixtral_8x7b")
+llm = ChatNVIDIA(model="mixtral_8x7b")
 result = llm.invoke("Write a ballad about LangChain.")
 print(result.content)
 ```
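The same `llm` also supports token streaming through LangChain's standard runnable interface (the chain examples later in this diff call `.stream`). A minimal sketch, assuming the `ChatNVIDIA` instance from the hunk above and a valid `NVIDIA_API_KEY`:

```python
# Minimal streaming sketch (assumes `llm = ChatNVIDIA(model="mixtral_8x7b")` from above).
for chunk in llm.stream("Write a ballad about LangChain."):
    print(chunk.content, end="")
```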
@@ -98,12 +98,12 @@ list(llm.available_models)
 
 ## Model types
 
-All of these models above are supported and can be accessed via `ChatNVAIPlay`.
+All of these models above are supported and can be accessed via `ChatNVIDIA`.
 
 Some model types support unique prompting techniques and chat messages. We will review a few important ones below.
 
-**To find out more about a specific model, please navigate to the API section of an AI Playground model [as linked here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/codellama-13b/api).**
+**To find out more about a specific model, please navigate to the API section of an AI Foundation Model [as linked here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/codellama-13b/api).**
 
 ### General Chat
@@ -111,7 +111,7 @@ Models such as `llama2_13b` and `mixtral_8x7b` are good all-around models that y
 
 ```python
-from langchain_nvidia_aiplay import ChatNVAIPlay
+from langchain_nvidia_ai_endpoints import ChatNVIDIA
 from langchain_core.prompts import ChatPromptTemplate
 from langchain_core.output_parsers import StrOutputParser
@@ -123,7 +123,7 @@ prompt = ChatPromptTemplate.from_messages(
 )
 chain = (
     prompt
-    | ChatNVAIPlay(model="llama2_13b")
+    | ChatNVIDIA(model="llama2_13b")
     | StrOutputParser()
 )
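A chain assembled this way is driven with a dict that fills the prompt's `{input}` variable; later hunks in this diff show both `chain.invoke({"input": ...})` and `chain.stream({"input": ...})`. A short usage sketch for the chain above:

```python
# Sketch: run the prompt | ChatNVIDIA | StrOutputParser chain defined above.
print(chain.invoke({"input": "What's your name?"}))

# Streaming works the same way:
for txt in chain.stream({"input": "What's your name?"}):
    print(txt, end="")
```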
@@ -146,7 +146,7 @@ prompt = ChatPromptTemplate.from_messages(
 )
 chain = (
     prompt
-    | ChatNVAIPlay(model="llama2_code_13b")
+    | ChatNVIDIA(model="llama2_code_13b")
     | StrOutputParser()
 )
@@ -164,9 +164,9 @@ The "steer" models support this type of input, such as `steerlm_llama_70b`
 
 ```python
-from langchain_nvidia_aiplay import ChatNVAIPlay
+from langchain_nvidia_ai_endpoints import ChatNVIDIA
 
-llm = ChatNVAIPlay(model="steerlm_llama_70b")
+llm = ChatNVIDIA(model="steerlm_llama_70b")
 # Try making it uncreative and not verbose
 complex_result = llm.invoke(
     "What's a PB&J?",
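The hunk above is cut off by the diff context before the `labels` argument appears. For illustration only, a hedged sketch of what an invocation-time `labels` argument looks like: the label keys mirror the `bind` example below, while the specific values are assumptions matching the "uncreative and not verbose" comment.

```python
# Hypothetical sketch, not the commit's literal code: steering labels
# passed directly as invocation params.
complex_result = llm.invoke(
    "What's a PB&J?",
    labels={"creativity": 0, "complexity": 3, "verbosity": 0},  # assumed values
)
print(complex_result.content)
```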
@@ -191,7 +191,7 @@ The labels are passed as invocation params. You can `bind` these to the LLM usin
 
 ```python
-from langchain_nvidia_aiplay import ChatNVAIPlay
+from langchain_nvidia_ai_endpoints import ChatNVIDIA
 from langchain_core.prompts import ChatPromptTemplate
 from langchain_core.output_parsers import StrOutputParser
@@ -203,7 +203,7 @@ prompt = ChatPromptTemplate.from_messages(
 )
 chain = (
     prompt
-    | ChatNVAIPlay(model="steerlm_llama_70b").bind(labels={"creativity": 9, "complexity": 0, "verbosity": 9})
+    | ChatNVIDIA(model="steerlm_llama_70b").bind(labels={"creativity": 9, "complexity": 0, "verbosity": 9})
     | StrOutputParser()
 )
@@ -213,7 +213,7 @@ for txt in chain.stream({"input": "Why is a PB&J?"}):
 
 ## Multimodal
 
-NVidia also supports multimodal inputs, meaning you can provide both images and text for the model to reason over.
+NVIDIA also supports multimodal inputs, meaning you can provide both images and text for the model to reason over.
 
 These models also accept `labels`, similar to the Steering LLMs above. In addition to `creativity`, `complexity`, and `verbosity`, these models support a `quality` toggle.
@@ -232,9 +232,9 @@ image_content = requests.get(image_url).content
 Initialize the model like so:
 
 ```python
-from langchain_nvidia_aiplay import ChatNVAIPlay
+from langchain_nvidia_ai_endpoints import ChatNVIDIA
 
-llm = ChatNVAIPlay(model="playground_neva_22b")
+llm = ChatNVIDIA(model="playground_neva_22b")
 ```
 
 #### Passing an image as a URL
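The body of the "Passing an image as a URL" section falls outside this diff's context window. As a hedged sketch of the content-list message format LangChain chat models use for image input (the commit's exact code is not shown here; `image_url` is the variable fetched in the hunk header above):

```python
from langchain_core.messages import HumanMessage

# Sketch: combine text and an image URL in a single multimodal message.
response = llm.invoke(
    [
        HumanMessage(
            content=[
                {"type": "text", "text": "Describe this image:"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]
        )
    ]
)
print(response.content)
```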
@@ -315,7 +315,7 @@ The `_qa_` models like `nemotron_qa_8b` support this.
 
 ```python
-from langchain_nvidia_aiplay import ChatNVAIPlay
+from langchain_nvidia_ai_endpoints import ChatNVIDIA
 from langchain_core.prompts import ChatPromptTemplate
 from langchain_core.output_parsers import StrOutputParser
 from langchain_core.messages import ChatMessage
@@ -325,7 +325,7 @@ prompt = ChatPromptTemplate.from_messages(
         ("user", "{input}")
     ]
 )
-llm = ChatNVAIPlay(model="nemotron_qa_8b")
+llm = ChatNVIDIA(model="nemotron_qa_8b")
 chain = (
     prompt
     | llm
@@ -339,9 +339,9 @@ chain.invoke({"input": "What was signed?"})
 You can also connect to embeddings models through this package. Below is an example:
 
 ```
-from langchain_nvidia_aiplay import NVAIPlayEmbeddings
+from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
 
-embedder = NVAIPlayEmbeddings(model="nvolveqa_40k")
+embedder = NVIDIAEmbeddings(model="nvolveqa_40k")
 embedder.embed_query("What's the temperature today?")
 embedder.embed_documents([
     "The temperature is 42 degrees.",
@@ -352,7 +352,7 @@ embedder.embed_documents([
 By default the embedding model will use the "passage" type for documents and "query" type for queries, but you can fix this on the instance.
 
 ```python
-query_embedder = NVAIPlayEmbeddings(model="nvolveqa_40k", model_type="query")
-doc_embeddder = NVAIPlayEmbeddings(model="nvolveqa_40k", model_type="passage")
+query_embedder = NVIDIAEmbeddings(model="nvolveqa_40k", model_type="query")
+doc_embeddder = NVIDIAEmbeddings(model="nvolveqa_40k", model_type="passage")
 ```
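Asymmetric "query"/"passage" embeddings like these are typically compared with cosine similarity. A hedged usage sketch (not part of the commit), assuming the two embedders defined above:

```python
import numpy as np

# Sketch: rank passages against a query by cosine similarity.
docs = ["The temperature is 42 degrees.", "Class is dismissed at 9 PM."]
doc_vecs = np.array(doc_embeddder.embed_documents(docs))  # spelling follows the diff
query_vec = np.array(query_embedder.embed_query("What's the temperature today?"))

scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(docs[int(np.argmax(scores))])  # best-matching passage
```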
langchain_nvidia_ai_endpoints/__init__.py (new file)

@@ -0,0 +1,45 @@
+"""
+**LangChain NVIDIA AI Foundation Model Playground Integration**
+
+This comprehensive module integrates NVIDIA's state-of-the-art AI Foundation Models, featuring advanced models for conversational AI and semantic embeddings, into the LangChain framework. It provides robust classes for seamless interaction with NVIDIA's AI models, particularly tailored for enriching conversational experiences and enhancing semantic understanding in various applications.
+
+**Features:**
+
+1. **Chat Models (`ChatNVIDIA`):** This class serves as the primary interface for interacting with NVIDIA's Foundation chat models. Users can effortlessly utilize NVIDIA's advanced models like 'Mistral' to engage in rich, context-aware conversations, applicable across diverse domains from customer support to interactive storytelling.
+
+2. **Semantic Embeddings (`NVIDIAEmbeddings`):** The module offers capabilities to generate sophisticated embeddings using NVIDIA's AI models. These embeddings are instrumental for tasks like semantic analysis, text similarity assessments, and contextual understanding, significantly enhancing the depth of NLP applications.
+
+**Installation:**
+
+Install this module easily using pip:
+
+```python
+pip install langchain-nvidia-ai-endpoints
+```
+
+## Utilizing Chat Models:
+
+After setting up the environment, interact with NVIDIA AI Foundation models:
+```python
+from langchain_nvidia_ai_endpoints import ChatNVIDIA
+
+ai_chat_model = ChatNVIDIA(model="llama2_13b")
+response = ai_chat_model.invoke("Tell me about the LangChain integration.")
+```
+
+# Generating Semantic Embeddings:
+
+Use NVIDIA's models for creating embeddings, useful in various NLP tasks:
+
+```python
+from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
+
+embed_model = NVIDIAEmbeddings(model="nvolveqa_40k")
+embedding_output = embed_model.embed_query("Exploring AI capabilities.")
+```
+"""  # noqa: E501
+
+from langchain_nvidia_ai_endpoints.chat_models import ChatNVIDIA
+from langchain_nvidia_ai_endpoints.embeddings import NVIDIAEmbeddings
+
+__all__ = ["ChatNVIDIA", "NVIDIAEmbeddings"]
langchain_nvidia_ai_endpoints/_common.py

@@ -32,14 +32,14 @@ from requests.models import Response
 logger = logging.getLogger(__name__)
 
 
-class NVCRModel(BaseModel):
+class NVEModel(BaseModel):
 
     """
-    Underlying Client for interacting with the AI Playground API.
-    Leveraged by the NVAIPlayBaseModel to provide a simple requests-oriented interface.
+    Underlying Client for interacting with the AI Foundation Model Function API.
+    Leveraged by the NVIDIABaseModel to provide a simple requests-oriented interface.
     Direct abstraction over NGC-recommended streaming/non-streaming Python solutions.
 
-    NOTE: AI Playground does not currently support raw text continuation.
+    NOTE: Models in the playground do not currently support raw text continuation.
     """
 
     ## Core defaults. These probably should not be changed
@@ -50,7 +50,7 @@ class NVCRModel(BaseModel):
 
     nvidia_api_key: SecretStr = Field(
         ...,
-        description="API key for NVIDIA AI Playground. Should start with `nvapi-`",
+        description="API key for NVIDIA Foundation Endpoints. Starts with `nvapi-`",
     )
     is_staging: bool = Field(False, description="Whether to use staging API")
@@ -150,10 +150,10 @@ class NVCRModel(BaseModel):
         return path
 
     ####################################################################################
-    ## Core utilities for posting and getting from NVCR
+    ## Core utilities for posting and getting from NV Endpoints
 
     def _post(self, invoke_url: str, payload: dict = {}) -> Tuple[Response, Any]:
-        """Method for posting to the AI Playground API."""
+        """Method for posting to the AI Foundation Model Function API."""
         call_inputs = {
             "url": invoke_url,
             "headers": self.headers["call"],
@@ -166,7 +166,7 @@ class NVCRModel(BaseModel):
         return response, session
 
     def _get(self, invoke_url: str, payload: dict = {}) -> Tuple[Response, Any]:
-        """Method for getting from the AI Playground API."""
+        """Method for getting from the AI Foundation Model Function API."""
         last_inputs = {
             "url": invoke_url,
             "headers": self.headers["call"],
@@ -208,7 +208,7 @@ class NVCRModel(BaseModel):
         rd = response.__dict__
         rd = rd.get("_content", rd)
         if isinstance(rd, bytes):
-            rd = rd.decode("utf-8")[5:]  ## lop of data: prefix ??
+            rd = rd.decode("utf-8")[5:]  ## remove "data:" prefix
         try:
             rd = json.loads(rd)
         except Exception:
@@ -295,7 +295,7 @@ class NVCRModel(BaseModel):
         invoke_url: Optional[str] = None,
         stop: Optional[Sequence[str]] = None,
     ) -> dict:
-        """Method for an end-to-end post query with NVCR post-processing."""
+        """Method for an end-to-end post query with NVE post-processing."""
         response = self.get_req(model_name, payload, invoke_url)
         output, _ = self.postprocess(response, stop=stop)
         return output
@@ -303,7 +303,7 @@ class NVCRModel(BaseModel):
     def postprocess(
         self, response: Union[str, Response], stop: Optional[Sequence[str]] = None
     ) -> Tuple[dict, bool]:
-        """Parses a response from the AI Playground API.
+        """Parses a response from the AI Foundation Model Function API.
         Strongly assumes that the API will return a single response.
         """
         msg_list = self._process_response(response)
@@ -414,13 +414,13 @@ class NVCRModel(BaseModel):
             break
 
 
-class _NVAIPlayClient(BaseModel):
+class _NVIDIAClient(BaseModel):
     """
-    Higher-Level Client for interacting with AI Playground API with argument defaults.
-    Is subclassed by NVAIPlayLLM/ChatNVAIPlay to provide a simple LangChain interface.
+    Higher-Level AI Foundation Model Function API Client with argument defaults.
+    Is subclassed by ChatNVIDIA to provide a simple LangChain interface.
     """
 
-    client: NVCRModel = Field(NVCRModel)
+    client: NVEModel = Field(NVEModel)
 
     model: str = Field(..., description="Name of the model to invoke")
@@ -434,7 +434,7 @@ class _NVAIPlayClient(BaseModel):
     def validate_client(cls, values: Any) -> Any:
         """Validate and update client arguments, including API key and formatting"""
         if not values.get("client"):
-            values["client"] = NVCRModel(**values)
+            values["client"] = NVEModel(**values)
         return values
 
     @classmethod
@@ -497,7 +497,7 @@ class _NVAIPlayClient(BaseModel):
     def get_payload(
         self, inputs: Sequence[Dict], labels: Optional[dict] = None, **kwargs: Any
     ) -> dict:
-        """Generates payload for the _NVAIPlayClient API to send to service."""
+        """Generates payload for the _NVIDIAClient API to send to service."""
         return {
             **self.preprocess(inputs=inputs, labels=labels),
             **kwargs,
langchain_nvidia_ai_endpoints/chat_models.py

@@ -1,4 +1,4 @@
-"""Chat Model Components Derived from ChatModel/NVAIPlay"""
+"""Chat Model Components Derived from ChatModel/NVIDIA"""
 from __future__ import annotations
 
 import base64
@@ -26,7 +26,7 @@ from langchain_core.language_models.chat_models import SimpleChatModel
 from langchain_core.messages import BaseMessage, ChatMessage, ChatMessageChunk
 from langchain_core.outputs import ChatGenerationChunk
 
-from langchain_nvidia_aiplay import _common as nv_aiplay
+from langchain_nvidia_ai_endpoints import _common as nvidia_ai_endpoints
 
 logger = logging.getLogger(__name__)
@@ -70,22 +70,22 @@ def _url_to_b64_string(image_source: str) -> str:
         raise ValueError(f"Unable to process the provided image source: {e}")
 
 
-class ChatNVAIPlay(nv_aiplay._NVAIPlayClient, SimpleChatModel):
-    """NVAIPlay chat model.
+class ChatNVIDIA(nvidia_ai_endpoints._NVIDIAClient, SimpleChatModel):
+    """NVIDIA chat model.
 
     Example:
         .. code-block:: python
 
-            from langchain_nvidia_aiplay import ChatNVAIPlay
+            from langchain_nvidia_ai_endpoints import ChatNVIDIA
 
-            model = ChatNVAIPlay(model="llama2_13b")
+            model = ChatNVIDIA(model="llama2_13b")
             response = model.invoke("Hello")
     """
 
     @property
     def _llm_type(self) -> str:
-        """Return type of NVIDIA AI Playground Interface."""
+        """Return type of NVIDIA AI Foundation Model Interface."""
         return "chat-nvidia-ai-playground"
 
     def _call(
langchain_nvidia_ai_endpoints/embeddings.py

@@ -1,16 +1,16 @@
-"""Embeddings Components Derived from ChatModel/NVAIPlay"""
+"""Embeddings Components Derived from NVEModel/Embeddings"""
 from typing import Any, List, Literal, Optional
 
 from langchain_core.embeddings import Embeddings
 from langchain_core.pydantic_v1 import BaseModel, Field, root_validator
 
-import langchain_nvidia_aiplay._common as nvaiplay_common
+import langchain_nvidia_ai_endpoints._common as nvai_common
 
 
-class NVAIPlayEmbeddings(BaseModel, Embeddings):
-    """NVIDIA's AI Playground NVOLVE Question-Answer Asymmetric Model."""
+class NVIDIAEmbeddings(BaseModel, Embeddings):
+    """NVIDIA's AI Foundation Retriever Question-Answering Asymmetric Model."""
 
-    client: nvaiplay_common.NVCRModel = Field(nvaiplay_common.NVCRModel)
+    client: nvai_common.NVEModel = Field(nvai_common.NVEModel)
     model: str = Field(
         ..., description="The embedding model to use. Example: nvolveqa_40k"
     )
@@ -23,7 +23,7 @@ class NVAIPlayEmbeddings(BaseModel, Embeddings):
     @root_validator(pre=True)
     def _validate_client(cls, values: Any) -> Any:
        if "client" not in values:
-            values["client"] = nvaiplay_common.NVCRModel()
+            values["client"] = nvai_common.NVEModel()
         return values
 
     @property
poetry.lock

@@ -458,7 +458,7 @@ files = [
 
 [[package]]
 name = "langchain-core"
-version = "0.1.0"
+version = "0.1.1"
 description = "Building applications with LLMs through composability"
 optional = false
 python-versions = ">=3.8.1,<4.0"
pyproject.toml

@@ -1,10 +1,10 @@
 [tool.poetry]
-name = "langchain-nvidia-aiplay"
+name = "langchain-nvidia-ai-endpoints"
 version = "0.0.1"
-description = "An integration package connecting NVidia AIPlay and LangChain"
+description = "An integration package connecting NVIDIA AI Endpoints and LangChain"
 authors = []
 readme = "README.md"
-repository = "https://github.com/langchain-ai/langchain/tree/master/libs/partners/nvidia-aiplay"
+repository = "https://github.com/langchain-ai/langchain/tree/master/libs/partners/nvidia-ai-endpoints"
 
 [tool.poetry.dependencies]
 python = ">=3.8.1,<4.0"
tests/integration_tests/test_chat_models.py

@@ -1,12 +1,12 @@
-"""Test ChatNVAIPlay chat model."""
+"""Test ChatNVIDIA chat model."""
 from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage
 
-from langchain_nvidia_aiplay.chat_models import ChatNVAIPlay
+from langchain_nvidia_ai_endpoints.chat_models import ChatNVIDIA
 
 
-def test_chat_aiplay() -> None:
-    """Test ChatNVAIPlay wrapper."""
-    chat = ChatNVAIPlay(
+def test_chat_ai_endpoints() -> None:
+    """Test ChatNVIDIA wrapper."""
+    chat = ChatNVIDIA(
         model="llama2_13b",
         temperature=0.7,
     )
@@ -16,15 +16,15 @@ def test_chat_aiplay() -> None:
     assert isinstance(response.content, str)
 
 
-def test_chat_aiplay_model() -> None:
-    """Test GeneralChat wrapper handles model."""
-    chat = ChatNVAIPlay(model="mistral")
+def test_chat_ai_endpoints_model() -> None:
+    """Test wrapper handles model."""
+    chat = ChatNVIDIA(model="mistral")
     assert chat.model == "mistral"
 
 
-def test_chat_aiplay_system_message() -> None:
-    """Test GeneralChat wrapper with system message."""
-    chat = ChatNVAIPlay(model="llama2_13b", max_tokens=36)
+def test_chat_ai_endpoints_system_message() -> None:
+    """Test wrapper with system message."""
+    chat = ChatNVIDIA(model="llama2_13b", max_tokens=36)
     system_message = SystemMessage(content="You are to chat with the user.")
     human_message = HumanMessage(content="Hello")
     response = chat([system_message, human_message])
|
||||
## TODO: Not sure if we want to support the n syntax. Trash or keep test
|
||||
|
||||
|
||||
def test_aiplay_streaming() -> None:
|
||||
"""Test streaming tokens from aiplay."""
|
||||
llm = ChatNVAIPlay(model="llama2_13b", max_tokens=36)
|
||||
def test_ai_endpoints_streaming() -> None:
|
||||
"""Test streaming tokens from ai endpoints."""
|
||||
llm = ChatNVIDIA(model="llama2_13b", max_tokens=36)
|
||||
|
||||
for token in llm.stream("I'm Pickle Rick"):
|
||||
assert isinstance(token.content, str)
|
||||
|
||||
|
||||
async def test_aiplay_astream() -> None:
|
||||
"""Test streaming tokens from aiplay."""
|
||||
llm = ChatNVAIPlay(model="llama2_13b", max_tokens=35)
|
||||
async def test_ai_endpoints_astream() -> None:
|
||||
"""Test streaming tokens from ai endpoints."""
|
||||
llm = ChatNVIDIA(model="llama2_13b", max_tokens=35)
|
||||
|
||||
async for token in llm.astream("I'm Pickle Rick"):
|
||||
assert isinstance(token.content, str)
|
||||
|
||||
|
||||
async def test_aiplay_abatch() -> None:
|
||||
"""Test streaming tokens from GeneralChat."""
|
||||
llm = ChatNVAIPlay(model="llama2_13b", max_tokens=36)
|
||||
async def test_ai_endpoints_abatch() -> None:
|
||||
"""Test streaming tokens."""
|
||||
llm = ChatNVIDIA(model="llama2_13b", max_tokens=36)
|
||||
|
||||
result = await llm.abatch(["I'm Pickle Rick", "I'm not Pickle Rick"])
|
||||
for token in result:
|
||||
assert isinstance(token.content, str)
|
||||
|
||||
|
||||
async def test_aiplay_abatch_tags() -> None:
|
||||
"""Test batch tokens from GeneralChat."""
|
||||
llm = ChatNVAIPlay(model="llama2_13b", max_tokens=55)
|
||||
async def test_ai_endpoints_abatch_tags() -> None:
|
||||
"""Test batch tokens."""
|
||||
llm = ChatNVIDIA(model="llama2_13b", max_tokens=55)
|
||||
|
||||
result = await llm.abatch(
|
||||
["I'm Pickle Rick", "I'm not Pickle Rick"], config={"tags": ["foo"]}
|
||||
@@ -71,26 +71,26 @@ async def test_aiplay_abatch_tags() -> None:
|
||||
assert isinstance(token.content, str)
|
||||
|
||||
|
||||
def test_aiplay_batch() -> None:
|
||||
"""Test batch tokens from GeneralChat."""
|
||||
llm = ChatNVAIPlay(model="llama2_13b", max_tokens=60)
|
||||
def test_ai_endpoints_batch() -> None:
|
||||
"""Test batch tokens."""
|
||||
llm = ChatNVIDIA(model="llama2_13b", max_tokens=60)
|
||||
|
||||
result = llm.batch(["I'm Pickle Rick", "I'm not Pickle Rick"])
|
||||
for token in result:
|
||||
assert isinstance(token.content, str)
|
||||
|
||||
|
||||
async def test_aiplay_ainvoke() -> None:
|
||||
"""Test invoke tokens from GeneralChat."""
|
||||
llm = ChatNVAIPlay(model="llama2_13b", max_tokens=60)
|
||||
async def test_ai_endpoints_ainvoke() -> None:
|
||||
"""Test invoke tokens."""
|
||||
llm = ChatNVIDIA(model="llama2_13b", max_tokens=60)
|
||||
|
||||
result = await llm.ainvoke("I'm Pickle Rick", config={"tags": ["foo"]})
|
||||
assert isinstance(result.content, str)
|
||||
|
||||
|
||||
def test_aiplay_invoke() -> None:
|
||||
"""Test invoke tokens from GeneralChat."""
|
||||
llm = ChatNVAIPlay(model="llama2_13b", max_tokens=60)
|
||||
def test_ai_endpoints_invoke() -> None:
|
||||
"""Test invoke tokens."""
|
||||
llm = ChatNVIDIA(model="llama2_13b", max_tokens=60)
|
||||
|
||||
result = llm.invoke("I'm Pickle Rick", config=dict(tags=["foo"]))
|
||||
assert isinstance(result.content, str)
|
tests/integration_tests/test_embeddings.py

@@ -1,48 +1,48 @@
-"""Test NVIDIA AI Playground Embeddings.
+"""Test NVIDIA AI Foundation Model Embeddings.
 
-Note: These tests are designed to validate the functionality of NVAIPlayEmbeddings.
+Note: These tests are designed to validate the functionality of NVIDIAEmbeddings.
 """
-from langchain_nvidia_aiplay import NVAIPlayEmbeddings
+from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
 
 
 def test_nvai_play_embedding_documents() -> None:
-    """Test NVAIPlay embeddings for documents."""
+    """Test NVIDIA embeddings for documents."""
     documents = ["foo bar"]
-    embedding = NVAIPlayEmbeddings(model="nvolveqa_40k")
+    embedding = NVIDIAEmbeddings(model="nvolveqa_40k")
     output = embedding.embed_documents(documents)
     assert len(output) == 1
     assert len(output[0]) == 1024  # Assuming embedding size is 1024
 
 
 def test_nvai_play_embedding_documents_multiple() -> None:
-    """Test NVAIPlay embeddings for multiple documents."""
+    """Test NVIDIA embeddings for multiple documents."""
     documents = ["foo bar", "bar foo", "foo"]
-    embedding = NVAIPlayEmbeddings(model="nvolveqa_40k")
+    embedding = NVIDIAEmbeddings(model="nvolveqa_40k")
     output = embedding.embed_documents(documents)
     assert len(output) == 3
     assert all(len(doc) == 1024 for doc in output)
 
 
 def test_nvai_play_embedding_query() -> None:
-    """Test NVAIPlay embeddings for a single query."""
+    """Test NVIDIA embeddings for a single query."""
     query = "foo bar"
-    embedding = NVAIPlayEmbeddings(model="nvolveqa_40k")
+    embedding = NVIDIAEmbeddings(model="nvolveqa_40k")
     output = embedding.embed_query(query)
     assert len(output) == 1024
 
 
 async def test_nvai_play_embedding_async_query() -> None:
-    """Test NVAIPlay async embeddings for a single query."""
+    """Test NVIDIA async embeddings for a single query."""
     query = "foo bar"
-    embedding = NVAIPlayEmbeddings(model="nvolveqa_40k")
+    embedding = NVIDIAEmbeddings(model="nvolveqa_40k")
     output = await embedding.aembed_query(query)
     assert len(output) == 1024
 
 
 async def test_nvai_play_embedding_async_documents() -> None:
-    """Test NVAIPlay async embeddings for multiple documents."""
+    """Test NVIDIA async embeddings for multiple documents."""
    documents = ["foo bar", "bar foo", "foo"]
-    embedding = NVAIPlayEmbeddings(model="nvolveqa_40k")
+    embedding = NVIDIAEmbeddings(model="nvolveqa_40k")
     output = await embedding.aembed_documents(documents)
     assert len(output) == 3
     assert all(len(doc) == 1024 for doc in output)
tests/unit_tests/test_chat_models.py

@@ -1,16 +1,16 @@
 """Test chat model integration."""
 
 
-from langchain_nvidia_aiplay.chat_models import ChatNVAIPlay
+from langchain_nvidia_ai_endpoints.chat_models import ChatNVIDIA
 
 
 def test_integration_initialization() -> None:
     """Test chat model initialization."""
-    ChatNVAIPlay(
+    ChatNVIDIA(
         model="llama2_13b",
         nvidia_api_key="nvapi-...",
         temperature=0.5,
         top_p=0.9,
         max_tokens=50,
     )
-    ChatNVAIPlay(model="mistral", nvidia_api_key="nvapi-...")
+    ChatNVIDIA(model="mistral", nvidia_api_key="nvapi-...")
tests/unit_tests/test_imports.py (new file)

@@ -0,0 +1,7 @@
+from langchain_nvidia_ai_endpoints import __all__
+
+EXPECTED_ALL = ["ChatNVIDIA", "NVIDIAEmbeddings"]
+
+
+def test_all_imports() -> None:
+    assert sorted(EXPECTED_ALL) == sorted(__all__)
langchain_nvidia_aiplay/__init__.py (deleted)

@@ -1,45 +0,0 @@
-"""
-**LangChain NVIDIA AI Playground Integration**
-
-This comprehensive module integrates NVIDIA's state-of-the-art AI Playground, featuring advanced models for conversational AI and semantic embeddings, into the LangChain framework. It provides robust classes for seamless interaction with NVIDIA's AI models, particularly tailored for enriching conversational experiences and enhancing semantic understanding in various applications.
-
-**Features:**
-
-1. **Chat Models (`ChatNVAIPlay`):** This class serves as the primary interface for interacting with NVIDIA AI Playground's chat models. Users can effortlessly utilize NVIDIA's advanced models like 'Mistral' to engage in rich, context-aware conversations, applicable across diverse domains from customer support to interactive storytelling.
-
-2. **Semantic Embeddings (`NVAIPlayEmbeddings`):** The module offers capabilities to generate sophisticated embeddings using NVIDIA's AI models. These embeddings are instrumental for tasks like semantic analysis, text similarity assessments, and contextual understanding, significantly enhancing the depth of NLP applications.
-
-**Installation:**
-
-Install this module easily using pip:
-
-```python
-pip install langchain-nvidia-aiplay
-```
-
-## Utilizing Chat Models:
-
-After setting up the environment, interact with NVIDIA AI Playground models:
-```python
-from langchain_nvidia_aiplay import ChatNVAIPlay
-
-ai_chat_model = ChatNVAIPlay(model="llama2_13b")
-response = ai_chat_model.invoke("Tell me about the LangChain integration.")
-```
-
-# Generating Semantic Embeddings:
-
-Use NVIDIA's models for creating embeddings, useful in various NLP tasks:
-
-```python
-from langchain_nvidia_aiplay import NVAIPlayEmbeddings
-
-embed_model = NVAIPlayEmbeddings(model="nvolveqa_40k")
-embedding_output = embed_model.embed_query("Exploring AI capabilities.")
-```
-"""  # noqa: E501
-
-from langchain_nvidia_aiplay.chat_models import ChatNVAIPlay
-from langchain_nvidia_aiplay.embeddings import NVAIPlayEmbeddings
-
-__all__ = ["ChatNVAIPlay", "NVAIPlayEmbeddings"]
langchain_nvidia_aiplay tests/unit_tests/test_imports.py (deleted)

@@ -1,7 +0,0 @@
-from langchain_nvidia_aiplay import __all__
-
-EXPECTED_ALL = ["ChatNVAIPlay", "NVAIPlayEmbeddings"]
-
-
-def test_all_imports() -> None:
-    assert sorted(EXPECTED_ALL) == sorted(__all__)