community[minor]: integrate chat models with Yuan2.0 (#16575)

1. integrate chat models with [`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. add a new doc for [Yuan2.0 integration](docs/docs/integrations/llms/yuan2.ipynb)

Yuan2.0 is a new-generation fundamental large language model developed by IEIT System. Three models have been published: Yuan 2.0-102B, Yuan 2.0-51B, and Yuan 2.0-2B.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2025-08-16 08:06:14 +00:00 · 2024-02-14 02:55:14 +08:00 · 2024-02-14 02:55:14 +08:00 · 5d06797905
commit 5d06797905
parent 15baffc484
6 changed files with 1168 additions and 0 deletions
--- a/docs/docs/integrations/chat/yuan2.ipynb
+++ b/docs/docs/integrations/chat/yuan2.ipynb
@@ -0,0 +1,463 @@
+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "source": [
+    "---\n",
+    "sidebar_label: YUAN2\n",
+    "---"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%% raw\n"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "# YUAN2.0\n",
+    "\n",
+    "This notebook shows how to use the [YUAN2 API](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/docs/inference_server.md) in LangChain through `langchain_community.chat_models.ChatYuan2`.\n",
+    "\n",
+    "[*Yuan2.0*](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) is a new-generation fundamental large language model developed by IEIT System. Three models have been published: Yuan 2.0-102B, Yuan 2.0-51B, and Yuan 2.0-2B. Scripts for pretraining, fine-tuning, and inference services are provided for other developers. Yuan2.0 builds on Yuan1.0 and uses a wider range of high-quality pretraining data and instruction fine-tuning datasets to improve the model's understanding of semantics, mathematics, reasoning, code, and knowledge."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "## Getting started\n",
+    "### Installation\n",
+    "Yuan2.0 provides an OpenAI-compatible API, and ChatYuan2 is integrated into LangChain as a chat model that uses the OpenAI client.\n",
+    "Make sure the `openai` package is installed in your Python environment by running the following command:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%pip install --upgrade --quiet openai"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "### Importing the Required Modules\n",
+    "After installation, import the necessary modules into your Python script:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "is_executing": true,
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from langchain_community.chat_models import ChatYuan2\n",
+    "from langchain_core.messages import AIMessage, HumanMessage, SystemMessage"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "### Setting Up Your API server\n",
+    "Set up your OpenAI-compatible API server by following the [yuan2 openai api server](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) instructions.\n",
+    "If you deployed the API server locally, you can set `yuan2_api_key=\"EMPTY\"` or any other value.\n",
+    "Just make sure that `yuan2_api_base` is set correctly."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "yuan2_api_key = \"your_api_key\"\n",
+    "yuan2_api_base = \"http://127.0.0.1:8001/v1\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "### Initialize the ChatYuan2 Model\n",
+    "Here's how to initialize the chat model:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "is_executing": true,
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "chat = ChatYuan2(\n",
+    "    yuan2_api_base=\"http://127.0.0.1:8001/v1\",\n",
+    "    temperature=1.0,\n",
+    "    model_name=\"yuan2\",\n",
+    "    max_retries=3,\n",
+    "    streaming=False,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "### Basic Usage\n",
+    "Invoke the model with system and human messages like this:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    },
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "messages = [\n",
+    "    SystemMessage(content=\"你是一个人工智能助手。\"),\n",
+    "    HumanMessage(content=\"你好，你是谁？\"),\n",
+    "]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "is_executing": true,
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(chat(messages))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "### Basic Usage with streaming\n",
+    "To receive the response piece by piece as it is generated, enable streaming:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
+    "\n",
+    "chat = ChatYuan2(\n",
+    "    yuan2_api_base=\"http://127.0.0.1:8001/v1\",\n",
+    "    temperature=1.0,\n",
+    "    model_name=\"yuan2\",\n",
+    "    max_retries=3,\n",
+    "    streaming=True,\n",
+    "    callbacks=[StreamingStdOutCallbackHandler()],\n",
+    ")\n",
+    "messages = [\n",
+    "    SystemMessage(content=\"你是个旅游小助手。\"),\n",
+    "    HumanMessage(content=\"给我介绍一下北京有哪些好玩的。\"),\n",
+    "]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "pycharm": {
+     "is_executing": true,
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "chat(messages)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "## Advanced Features\n",
+    "### Usage with async calls\n",
+    "\n",
+    "Invoke the model with non-blocking calls, like this:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "async def basic_agenerate():\n",
+    "    chat = ChatYuan2(\n",
+    "        yuan2_api_base=\"http://127.0.0.1:8001/v1\",\n",
+    "        temperature=1.0,\n",
+    "        model_name=\"yuan2\",\n",
+    "        max_retries=3,\n",
+    "    )\n",
+    "    messages = [\n",
+    "        [\n",
+    "            SystemMessage(content=\"你是个旅游小助手。\"),\n",
+    "            HumanMessage(content=\"给我介绍一下北京有哪些好玩的。\"),\n",
+    "        ]\n",
+    "    ]\n",
+    "\n",
+    "    result = await chat.agenerate(messages)\n",
+    "    print(result)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "pycharm": {
+     "is_executing": true,
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import asyncio\n",
+    "\n",
+    "asyncio.run(basic_agenerate())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "### Usage with prompt template\n",
+    "\n",
+    "Invoke the model with non-blocking calls and a chat prompt template, like this:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "async def ainvoke_with_prompt_template():\n",
+    "    from langchain.prompts.chat import (\n",
+    "        ChatPromptTemplate,\n",
+    "    )\n",
+    "\n",
+    "    chat = ChatYuan2(\n",
+    "        yuan2_api_base=\"http://127.0.0.1:8001/v1\",\n",
+    "        temperature=1.0,\n",
+    "        model_name=\"yuan2\",\n",
+    "        max_retries=3,\n",
+    "    )\n",
+    "    prompt = ChatPromptTemplate.from_messages(\n",
+    "        [\n",
+    "            (\"system\", \"你是一个诗人，擅长写诗。\"),\n",
+    "            (\"human\", \"给我写首诗，主题是{theme}。\"),\n",
+    "        ]\n",
+    "    )\n",
+    "    chain = prompt | chat\n",
+    "    result = await chain.ainvoke({\"theme\": \"明月\"})\n",
+    "    print(f\"type(result): {type(result)}; {result}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "is_executing": true,
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "asyncio.run(ainvoke_with_prompt_template())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "### Usage with async calls in streaming\n",
+    "For non-blocking calls with streaming output, use the astream method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "async def basic_astream():\n",
+    "    chat = ChatYuan2(\n",
+    "        yuan2_api_base=\"http://127.0.0.1:8001/v1\",\n",
+    "        temperature=1.0,\n",
+    "        model_name=\"yuan2\",\n",
+    "        max_retries=3,\n",
+    "    )\n",
+    "    messages = [\n",
+    "        SystemMessage(content=\"你是个旅游小助手。\"),\n",
+    "        HumanMessage(content=\"给我介绍一下北京有哪些好玩的。\"),\n",
+    "    ]\n",
+    "    result = chat.astream(messages)\n",
+    "    async for chunk in result:\n",
+    "        print(chunk.content, end=\"\", flush=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "is_executing": true,
+     "name": "#%%\n"
+    },
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": [
+    "import asyncio\n",
+    "\n",
+    "asyncio.run(basic_astream())"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
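
The async cells in the notebook above call `asyncio.run(...)`, which suits a plain Python script. Inside a running Jupyter kernel an event loop is typically already active, so awaiting the coroutine directly in the cell is the usual alternative. The following is a minimal sketch of the chunk-by-chunk consumption pattern used with `astream`. It does not contact a Yuan2.0 server: `fake_astream` and `collect_stream` are hypothetical stand-ins introduced only for illustration.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_astream(text: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for `chat.astream(messages)`: yields text pieces."""
    for piece in text.split(" "):
        # Hand control back to the event loop, as real network I/O would.
        await asyncio.sleep(0)
        yield piece + " "


async def collect_stream(text: str) -> str:
    """Consume the stream piece by piece and return the joined reply."""
    pieces: List[str] = []
    async for piece in fake_astream(text):
        pieces.append(piece)
    return "".join(pieces).strip()


if __name__ == "__main__":
    # In a plain script, asyncio.run() starts the event loop.
    # In a notebook cell, `await collect_stream(...)` is the usual equivalent.
    print(asyncio.run(collect_stream("Beijing has many sights")))
```

With a real server, the stand-in generator would be replaced by `chat.astream(messages)` and each piece would be read from `chunk.content`, as in the notebook's `basic_astream` cell.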
--- a/libs/community/langchain_community/chat_models/__init__.py
+++ b/libs/community/langchain_community/chat_models/__init__.py
@@ -54,6 +54,7 @@ from langchain_community.chat_models.tongyi import ChatTongyi
 from langchain_community.chat_models.vertexai import ChatVertexAI
 from langchain_community.chat_models.volcengine_maas import VolcEngineMaasChat
 from langchain_community.chat_models.yandex import ChatYandexGPT
+from langchain_community.chat_models.yuan2 import ChatYuan2
 from langchain_community.chat_models.zhipuai import ChatZhipuAI

 __all__ = [
@@ -94,5 +95,6 @@ __all__ = [
    "ChatSparkLLM",
    "VolcEngineMaasChat",
    "GPTRouter",
+    "ChatYuan2",
    "ChatZhipuAI",
 ]
--- a/libs/community/langchain_community/chat_models/yuan2.py
+++ b/libs/community/langchain_community/chat_models/yuan2.py
@@ -0,0 +1,486 @@
+"""ChatYuan2 wrapper."""
+from __future__ import annotations
+
+import logging
+from typing import (
+    TYPE_CHECKING,
+    Any,
+    AsyncIterator,
+    Callable,
+    Dict,
+    Iterator,
+    List,
+    Mapping,
+    Optional,
+    Tuple,
+    Type,
+    Union,
+)
+
+from langchain_core.callbacks import (
+    AsyncCallbackManagerForLLMRun,
+    CallbackManagerForLLMRun,
+)
+from langchain_core.language_models.chat_models import (
+    BaseChatModel,
+    agenerate_from_stream,
+    generate_from_stream,
+)
+from langchain_core.messages import (
+    AIMessage,
+    AIMessageChunk,
+    BaseMessage,
+    BaseMessageChunk,
+    ChatMessage,
+    ChatMessageChunk,
+    FunctionMessage,
+    HumanMessage,
+    HumanMessageChunk,
+    SystemMessage,
+    SystemMessageChunk,
+)
+from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
+from langchain_core.pydantic_v1 import Field, root_validator
+from langchain_core.utils import (
+    get_from_dict_or_env,
+    get_pydantic_field_names,
+)
+from tenacity import (
+    before_sleep_log,
+    retry,
+    retry_if_exception_type,
+    stop_after_attempt,
+    wait_exponential,
+)
+
+if TYPE_CHECKING:
+    from openai.types.chat import ChatCompletion, ChatCompletionMessage
+
+logger = logging.getLogger(__name__)
+
+
+class ChatYuan2(BaseChatModel):
+    """`Yuan2.0` Chat models API.
+
+    To use, you should have the ``openai`` Python package installed; if it is
+    not installed, run ``pip install openai``. Set the environment variable
+    ``YUAN2_API_KEY`` to your API key. If it is not set, the API is accessed
+    without a real key (the default is ``EMPTY``).
+
+    Any parameter that is valid for the OpenAI client's ``create`` call can be
+    passed in, even if it is not explicitly declared on this class.
+
+    Example:
+        .. code-block:: python
+
+            from langchain_community.chat_models import ChatYuan2
+
+            chat = ChatYuan2()
+    """
+
+    client: Any  #: :meta private:
+    async_client: Any = Field(default=None, exclude=True)  #: :meta private:
+
+    model_name: str = Field(default="yuan2", alias="model")
+    """Model name to use."""
+
+    model_kwargs: Dict[str, Any] = Field(default_factory=dict)
+    """Holds any model parameters valid for `create` call not explicitly specified."""
+
+    yuan2_api_key: Optional[str] = Field(default="EMPTY", alias="api_key")
+    """Automatically inferred from env var `YUAN2_API_KEY` if not provided."""
+
+    yuan2_api_base: Optional[str] = Field(
+        default="http://127.0.0.1:8000", alias="base_url"
+    )
+    """Base URL path for API requests, an OpenAI compatible API server."""
+
+    request_timeout: Optional[Union[float, Tuple[float, float]]] = None
+    """Timeout for requests to yuan2 completion API. Default is 600 seconds."""
+
+    max_retries: int = 6
+    """Maximum number of retries to make when generating."""
+
+    streaming: bool = False
+    """Whether to stream the results or not."""
+
+    max_tokens: Optional[int] = None
+    """Maximum number of tokens to generate."""
+
+    temperature: float = 1.0
+    """What sampling temperature to use."""
+
+    top_p: Optional[float] = 0.9
+    """The top-p value to use for sampling."""
+
+    stop: Optional[List[str]] = ["<eod>"]
+    """A list of strings to stop generation when encountered."""
+
+    repeat_last_n: Optional[int] = 64
+    """Last n tokens to penalize."""
+
+    repeat_penalty: Optional[float] = 1.18
+    """The penalty to apply to repeated tokens."""
+
+    class Config:
+        """Configuration for this pydantic object."""
+
+        allow_population_by_field_name = True
+
+    @property
+    def lc_secrets(self) -> Dict[str, str]:
+        return {"yuan2_api_key": "YUAN2_API_KEY"}
+
+    @property
+    def lc_attributes(self) -> Dict[str, Any]:
+        attributes: Dict[str, Any] = {}
+
+        if self.yuan2_api_base:
+            attributes["yuan2_api_base"] = self.yuan2_api_base
+
+        if self.yuan2_api_key:
+            attributes["yuan2_api_key"] = self.yuan2_api_key
+
+        return attributes
+
+    @root_validator(pre=True)
+    def build_extra(cls, values: Dict[str, Any]) -> Dict[str, Any]:
+        """Build extra kwargs from additional params that were passed in."""
+        all_required_field_names = get_pydantic_field_names(cls)
+        extra = values.get("model_kwargs", {})
+        for field_name in list(values):
+            if field_name in extra:
+                raise ValueError(f"Found {field_name} supplied twice.")
+            if field_name not in all_required_field_names:
+                logger.warning(
+                    f"""WARNING! {field_name} is not a default parameter.
+                    {field_name} was transferred to model_kwargs.
+                    Please confirm that {field_name} is what you intended."""
+                )
+                extra[field_name] = values.pop(field_name)
+
+        invalid_model_kwargs = all_required_field_names.intersection(extra.keys())
+        if invalid_model_kwargs:
+            raise ValueError(
+                f"Parameters {invalid_model_kwargs} should be specified explicitly. "
+                f"Instead they were passed in as part of `model_kwargs` parameter."
+            )
+
+        values["model_kwargs"] = extra
+        return values
+
+    @root_validator()
+    def validate_environment(cls, values: Dict) -> Dict:
+        """Validate that the API key and Python package exist in the environment."""
+        values["yuan2_api_key"] = get_from_dict_or_env(
+            values, "yuan2_api_key", "YUAN2_API_KEY"
+        )
+
+        try:
+            import openai
+
+        except ImportError:
+            raise ValueError(
+                "Could not import openai python package. "
+                "Please install it with `pip install openai`."
+            )
+        client_params = {
+            "api_key": values["yuan2_api_key"],
+            "base_url": values["yuan2_api_base"],
+            "timeout": values["request_timeout"],
+            "max_retries": values["max_retries"],
+        }
+
+        # generate client and async_client
+        if not values.get("client"):
+            values["client"] = openai.OpenAI(**client_params).chat.completions
+        if not values.get("async_client"):
+            values["async_client"] = openai.AsyncOpenAI(
+                **client_params
+            ).chat.completions
+
+        return values
+
+    @property
+    def _default_params(self) -> Dict[str, Any]:
+        """Get the default parameters for calling yuan2 API."""
+        params = {
+            "model": self.model_name,
+            "stream": self.streaming,
+            "temperature": self.temperature,
+            "top_p": self.top_p,
+            **self.model_kwargs,
+        }
+        if self.max_tokens is not None:
+            params["max_tokens"] = self.max_tokens
+        if self.request_timeout is not None:
+            params["timeout"] = self.request_timeout
+        return params
+
+    def completion_with_retry(self, **kwargs: Any) -> Any:
+        """Use tenacity to retry the completion call."""
+        retry_decorator = _create_retry_decorator(self)
+
+        @retry_decorator
+        def _completion_with_retry(**kwargs: Any) -> Any:
+            return self.client.create(**kwargs)
+
+        return _completion_with_retry(**kwargs)
+
+    def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
+        overall_token_usage: dict = {}
+        logger.debug(
+            f"type(llm_outputs): {type(llm_outputs)}; llm_outputs: {llm_outputs}"
+        )
+        for output in llm_outputs:
+            if output is None:
+                # Happens in streaming
+                continue
+            token_usage = output["token_usage"]
+            for k, v in token_usage.__dict__.items():
+                if k in overall_token_usage:
+                    overall_token_usage[k] += v
+                else:
+                    overall_token_usage[k] = v
+        return {"token_usage": overall_token_usage, "model_name": self.model_name}
+
+    def _stream(
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        **kwargs: Any,
+    ) -> Iterator[ChatGenerationChunk]:
+        message_dicts, params = self._create_message_dicts(messages, stop)
+        params = {**params, **kwargs, "stream": True}
+
+        default_chunk_class = AIMessageChunk
+        for chunk in self.completion_with_retry(messages=message_dicts, **params):
+            if not isinstance(chunk, dict):
+                chunk = chunk.model_dump()
+            if len(chunk["choices"]) == 0:
+                continue
+            choice = chunk["choices"][0]
+            chunk = _convert_delta_to_message_chunk(
+                choice["delta"], default_chunk_class
+            )
+            finish_reason = choice.get("finish_reason")
+            generation_info = (
+                dict(finish_reason=finish_reason) if finish_reason is not None else None
+            )
+            default_chunk_class = chunk.__class__
+            yield ChatGenerationChunk(
+                message=chunk,
+                generation_info=generation_info,
+            )
+            if run_manager:
+                run_manager.on_llm_new_token(chunk.content)
+
+    def _generate(
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        **kwargs: Any,
+    ) -> ChatResult:
+        if self.streaming:
+            stream_iter = self._stream(
+                messages=messages, stop=stop, run_manager=run_manager, **kwargs
+            )
+            return generate_from_stream(stream_iter)
+
+        message_dicts, params = self._create_message_dicts(messages, stop)
+        params = {**params, **kwargs}
+        response = self.completion_with_retry(messages=message_dicts, **params)
+        return self._create_chat_result(response)
+
+    def _create_message_dicts(
+        self, messages: List[BaseMessage], stop: Optional[List[str]]
+    ) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
+        params = dict(self._invocation_params)
+        if stop is not None:
+            if "stop" in params:
+                raise ValueError("`stop` found in both the input and default params.")
+            params["stop"] = stop
+        message_dicts = [_convert_message_to_dict(m) for m in messages]
+        return message_dicts, params
+
+    def _create_chat_result(self, response: ChatCompletion) -> ChatResult:
+        generations = []
+        logger.debug(f"type(response): {type(response)}; response: {response}")
+        for res in response.choices:
+            message = _convert_dict_to_message(res.message)
+            generation_info = dict(finish_reason=res.finish_reason)
+            if "logprobs" in res:
+                generation_info["logprobs"] = res.logprobs
+            gen = ChatGeneration(
+                message=message,
+                generation_info=generation_info,
+            )
+            generations.append(gen)
+        llm_output = {
+            "token_usage": response.usage,
+            "model_name": self.model_name,
+        }
+        return ChatResult(generations=generations, llm_output=llm_output)
+
+    async def _astream(
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
+        **kwargs: Any,
+    ) -> AsyncIterator[ChatGenerationChunk]:
+        message_dicts, params = self._create_message_dicts(messages, stop)
+        params = {**params, **kwargs, "stream": True}
+
+        default_chunk_class = AIMessageChunk
+        async for chunk in await acompletion_with_retry(
+            self, messages=message_dicts, **params
+        ):
+            if not isinstance(chunk, dict):
+                chunk = chunk.model_dump()
+            if len(chunk["choices"]) == 0:
+                continue
+            choice = chunk["choices"][0]
+            chunk = _convert_delta_to_message_chunk(
+                choice["delta"], default_chunk_class
+            )
+            finish_reason = choice.get("finish_reason")
+            generation_info = (
+                dict(finish_reason=finish_reason) if finish_reason is not None else None
+            )
+            default_chunk_class = chunk.__class__
+            yield ChatGenerationChunk(
+                message=chunk,
+                generation_info=generation_info,
+            )
+            if run_manager:
+                await run_manager.on_llm_new_token(chunk.content)
+
+    async def _agenerate(
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
+        **kwargs: Any,
+    ) -> ChatResult:
+        if self.streaming:
+            stream_iter = self._astream(
+                messages=messages, stop=stop, run_manager=run_manager, **kwargs
+            )
+            return await agenerate_from_stream(stream_iter)
+
+        message_dicts, params = self._create_message_dicts(messages, stop)
+        params = {**params, **kwargs}
+        response = await acompletion_with_retry(self, messages=message_dicts, **params)
+        return self._create_chat_result(response)
+
+    @property
+    def _invocation_params(self) -> Mapping[str, Any]:
+        """Get the parameters used to invoke the model."""
+        yuan2_creds: Dict[str, Any] = {
+            "model": self.model_name,
+        }
+        return {**yuan2_creds, **self._default_params}
+
+    @property
+    def _llm_type(self) -> str:
+        """Return type of chat model."""
+        return "chat-yuan2"
+
+
+def _create_retry_decorator(llm: ChatYuan2) -> Callable[[Any], Any]:
+    import openai
+
+    min_seconds = 1
+    max_seconds = 60
+    # Exponential backoff between retries, with each wait clamped to the
+    # range [min_seconds, max_seconds] (1 to 60 seconds here)
+    return retry(
+        reraise=True,
+        stop=stop_after_attempt(llm.max_retries),
+        wait=wait_exponential(multiplier=1, min=min_seconds, max=max_seconds),
+        retry=(
+            retry_if_exception_type(openai.APITimeoutError)
+            | retry_if_exception_type(openai.APIError)
+            | retry_if_exception_type(openai.APIConnectionError)
+            | retry_if_exception_type(openai.RateLimitError)
+            | retry_if_exception_type(openai.InternalServerError)
+        ),
+        before_sleep=before_sleep_log(logger, logging.WARNING),
+    )
+
+
+async def acompletion_with_retry(llm: ChatYuan2, **kwargs: Any) -> Any:
+    """Use tenacity to retry the async completion call."""
+    retry_decorator = _create_retry_decorator(llm)
+
+    @retry_decorator
+    async def _completion_with_retry(**kwargs: Any) -> Any:
+        # Use OpenAI's async api https://github.com/openai/openai-python#async-api
+        return await llm.async_client.create(**kwargs)
+
+    return await _completion_with_retry(**kwargs)
+
+
+def _convert_delta_to_message_chunk(
+    _dict: ChatCompletionMessage, default_class: Type[BaseMessageChunk]
+) -> BaseMessageChunk:
+    role = _dict.get("role")
+    content = _dict.get("content") or ""
+
+    if role == "user" or default_class == HumanMessageChunk:
+        return HumanMessageChunk(content=content)
+    elif role == "assistant" or default_class == AIMessageChunk:
+        return AIMessageChunk(content=content)
+    elif role == "system" or default_class == SystemMessageChunk:
+        return SystemMessageChunk(content=content)
+    elif role or default_class == ChatMessageChunk:
+        return ChatMessageChunk(content=content, role=role)
+    else:
+        return default_class(content=content)
+
+
+def _convert_dict_to_message(_dict: ChatCompletionMessage) -> BaseMessage:
+    role = _dict.get("role")
+    if role == "user":
+        return HumanMessage(content=_dict.get("content"))
+    elif role == "assistant":
+        content = _dict.get("content") or ""
+        return AIMessage(content=content)
+    elif role == "system":
+        return SystemMessage(content=_dict.get("content"))
+    else:
+        return ChatMessage(content=_dict.get("content"), role=role)
+
+
+def _convert_message_to_dict(message: BaseMessage) -> dict:
+    """Convert a LangChain message to a dictionary.
+
+    Args:
+        message: The LangChain message.
+
+    Returns:
+        The dictionary.
+    """
+    message_dict: Dict[str, Any]
+    if isinstance(message, ChatMessage):
+        message_dict = {"role": message.role, "content": message.content}
+    elif isinstance(message, HumanMessage):
+        message_dict = {"role": "user", "content": message.content}
+    elif isinstance(message, AIMessage):
+        message_dict = {"role": "assistant", "content": message.content}
+    elif isinstance(message, SystemMessage):
+        message_dict = {"role": "system", "content": message.content}
+    elif isinstance(message, FunctionMessage):
+        message_dict = {
+            "role": "function",
+            "name": message.name,
+            "content": message.content,
+        }
+    else:
+        raise ValueError(f"Got unknown type {message}")
+    if "name" in message.additional_kwargs:
+        message_dict["name"] = message.additional_kwargs["name"]
+    return message_dict
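
To make the retry policy in `_create_retry_decorator` concrete, the sketch below approximates the wait schedule produced by `wait_exponential(multiplier=1, min=1, max=60)` combined with `stop_after_attempt(max_retries)`. It is a plain-Python approximation, not tenacity itself: the helper `backoff_delays` is hypothetical, and the formula `multiplier * 2 ** (attempt - 1)` is an assumption that mirrors recent tenacity releases and may differ in older ones.

```python
from typing import List


def backoff_delays(
    max_attempts: int,
    multiplier: float = 1.0,
    min_seconds: float = 1.0,
    max_seconds: float = 60.0,
) -> List[float]:
    """Approximate the sleep (in seconds) before each retry.

    Assumes delay = multiplier * 2 ** (attempt - 1), clamped to
    [min_seconds, max_seconds]. This is an assumption about tenacity's
    behaviour, not a guarantee for every version.
    """
    delays: List[float] = []
    # With N attempts there are at most N - 1 sleeps (none after the last try).
    for attempt in range(1, max_attempts):
        raw = multiplier * 2 ** (attempt - 1)
        delays.append(max(min_seconds, min(raw, max_seconds)))
    return delays


if __name__ == "__main__":
    # Class default max_retries=6, so at most five waits between six attempts.
    print(backoff_delays(6))
```

Under these assumptions the waits double on each retry until they reach the 60-second cap, so raising `max_retries` mainly adds waits at the cap once the cap is reached.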
--- a/libs/community/tests/integration_tests/chat_models/test_yuan2.py
+++ b/libs/community/tests/integration_tests/chat_models/test_yuan2.py
@@ -0,0 +1,152 @@
+"""Test ChatYuan2 wrapper."""
+from typing import List
+
+import pytest
+from langchain_core.callbacks import CallbackManager
+from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage
+from langchain_core.outputs import (
+    ChatGeneration,
+    LLMResult,
+)
+
+from langchain_community.chat_models.yuan2 import ChatYuan2
+from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler
+
+
+@pytest.mark.scheduled
+def test_chat_yuan2() -> None:
+    """Test ChatYuan2 wrapper."""
+    chat = ChatYuan2(
+        yuan2_api_key="EMPTY",
+        yuan2_api_base="http://127.0.0.1:8001/v1",
+        temperature=1.0,
+        model_name="yuan2",
+        max_retries=3,
+        streaming=False,
+    )
+    messages = [
+        HumanMessage(content="Hello"),
+    ]
+    response = chat(messages)
+    assert isinstance(response, BaseMessage)
+    assert isinstance(response.content, str)
+
+
+def test_chat_yuan2_system_message() -> None:
+    """Test ChatYuan2 wrapper with system message."""
+    chat = ChatYuan2(
+        yuan2_api_key="EMPTY",
+        yuan2_api_base="http://127.0.0.1:8001/v1",
+        temperature=1.0,
+        model_name="yuan2",
+        max_retries=3,
+        streaming=False,
+    )
+    messages = [
+        SystemMessage(content="You are an AI assistant."),
+        HumanMessage(content="Hello"),
+    ]
+    response = chat(messages)
+    assert isinstance(response, BaseMessage)
+    assert isinstance(response.content, str)
+
+
+@pytest.mark.scheduled
+def test_chat_yuan2_generate() -> None:
+    """Test ChatYuan2 wrapper with generate."""
+    chat = ChatYuan2(
+        yuan2_api_key="EMPTY",
+        yuan2_api_base="http://127.0.0.1:8001/v1",
+        temperature=1.0,
+        model_name="yuan2",
+        max_retries=3,
+        streaming=False,
+    )
+    messages: List = [
+        HumanMessage(content="Hello"),
+    ]
+    response = chat.generate([messages])
+    assert isinstance(response, LLMResult)
+    assert len(response.generations) == 1
+    assert response.llm_output
+    generation = response.generations[0]
+    for gen in generation:
+        assert isinstance(gen, ChatGeneration)
+        assert isinstance(gen.text, str)
+        assert gen.text == gen.message.content
+
+
+@pytest.mark.scheduled
+def test_chat_yuan2_streaming() -> None:
+    """Test that streaming correctly invokes on_llm_new_token callback."""
+    callback_handler = FakeCallbackHandler()
+    callback_manager = CallbackManager([callback_handler])
+
+    chat = ChatYuan2(
+        yuan2_api_key="EMPTY",
+        yuan2_api_base="http://127.0.0.1:8001/v1",
+        temperature=1.0,
+        model_name="yuan2",
+        max_retries=3,
+        streaming=True,
+        callback_manager=callback_manager,
+    )
+    messages = [
+        HumanMessage(content="Hello"),
+    ]
+    response = chat(messages)
+    assert callback_handler.llm_streams > 0
+    assert isinstance(response, BaseMessage)
+
+
+@pytest.mark.asyncio
+async def test_async_chat_yuan2() -> None:
+    """Test async generation."""
+    chat = ChatYuan2(
+        yuan2_api_key="EMPTY",
+        yuan2_api_base="http://127.0.0.1:8001/v1",
+        temperature=1.0,
+        model_name="yuan2",
+        max_retries=3,
+        streaming=False,
+    )
+    messages: List = [
+        HumanMessage(content="Hello"),
+    ]
+    response = await chat.agenerate([messages])
+    assert isinstance(response, LLMResult)
+    assert len(response.generations) == 1
+    generations = response.generations[0]
+    for generation in generations:
+        assert isinstance(generation, ChatGeneration)
+        assert isinstance(generation.text, str)
+        assert generation.text == generation.message.content
+
+
+@pytest.mark.asyncio
+async def test_async_chat_yuan2_streaming() -> None:
+    """Test that async streaming correctly invokes on_llm_new_token callback."""
+    callback_handler = FakeCallbackHandler()
+    callback_manager = CallbackManager([callback_handler])
+
+    chat = ChatYuan2(
+        yuan2_api_key="EMPTY",
+        yuan2_api_base="http://127.0.0.1:8001/v1",
+        temperature=1.0,
+        model_name="yuan2",
+        max_retries=3,
+        streaming=True,
+        callback_manager=callback_manager,
+    )
+    messages: List = [
+        HumanMessage(content="Hello"),
+    ]
+    response = await chat.agenerate([messages])
+    assert callback_handler.llm_streams > 0
+    assert isinstance(response, LLMResult)
+    assert len(response.generations) == 1
+    generations = response.generations[0]
+    for generation in generations:
+        assert isinstance(generation, ChatGeneration)
+        assert isinstance(generation.text, str)
+        assert generation.text == generation.message.content
--- a/libs/community/tests/unit_tests/chat_models/test_imports.py
+++ b/libs/community/tests/unit_tests/chat_models/test_imports.py
@@ -38,6 +38,7 @@ EXPECTED_ALL = [
    "VolcEngineMaasChat",
    "LlamaEdgeChatService",
    "GPTRouter",
+    "ChatYuan2",
    "ChatZhipuAI",
 ]

--- a/libs/community/tests/unit_tests/chat_models/test_yuan2.py
+++ b/libs/community/tests/unit_tests/chat_models/test_yuan2.py
@@ -0,0 +1,64 @@
+"""Test ChatYuan2 wrapper."""
+
+import pytest
+from langchain_core.messages import (
+    AIMessage,
+    HumanMessage,
+    SystemMessage,
+)
+
+from langchain_community.chat_models.yuan2 import (
+    ChatYuan2,
+    _convert_dict_to_message,
+    _convert_message_to_dict,
+)
+
+
+@pytest.mark.requires("openai")
+def test_yuan2_model_param() -> None:
+    """Both the `model` alias and `model_name` should set `model_name`."""
+    chat = ChatYuan2(model="foo")
+    assert chat.model_name == "foo"
+    chat = ChatYuan2(model_name="foo")
+    assert chat.model_name == "foo"
+
+
+def test__convert_message_to_dict_human() -> None:
+    message = HumanMessage(content="foo")
+    result = _convert_message_to_dict(message)
+    expected_output = {"role": "user", "content": "foo"}
+    assert result == expected_output
+
+
+def test__convert_message_to_dict_ai() -> None:
+    message = AIMessage(content="foo")
+    result = _convert_message_to_dict(message)
+    expected_output = {"role": "assistant", "content": "foo"}
+    assert result == expected_output
+
+
+def test__convert_message_to_dict_system() -> None:
+    message = SystemMessage(content="foo")
+    result = _convert_message_to_dict(message)
+    expected_output = {"role": "system", "content": "foo"}
+    assert result == expected_output
+
+
+def test__convert_dict_to_message_human() -> None:
+    message = {"role": "user", "content": "hello"}
+    result = _convert_dict_to_message(message)
+    expected_output = HumanMessage(content="hello")
+    assert result == expected_output
+
+
+def test__convert_dict_to_message_ai() -> None:
+    message = {"role": "assistant", "content": "hello"}
+    result = _convert_dict_to_message(message)
+    expected_output = AIMessage(content="hello")
+    assert result == expected_output
+
+
+def test__convert_dict_to_message_system() -> None:
+    message = {"role": "system", "content": "hello"}
+    result = _convert_dict_to_message(message)
+    expected_output = SystemMessage(content="hello")
+    assert result == expected_output
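
The unit tests above pin down a simple role mapping: human messages become `"user"`, AI messages become `"assistant"`, and system messages stay `"system"`, with the reverse conversion restoring the original message. The following is a minimal, self-contained sketch of that mapping. It is not the code from `yuan2.py`: `Msg`, `to_dict`, and `from_dict` are hypothetical stand-ins for the LangChain message classes and for `_convert_message_to_dict` / `_convert_dict_to_message`, used here only so the example runs without `langchain_core` installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    """Hypothetical stand-in for a LangChain message (not the real class)."""

    kind: str  # one of "human", "ai", "system"
    content: str


# Role names taken from the expected outputs in the unit tests above.
KIND_TO_ROLE = {"human": "user", "ai": "assistant", "system": "system"}
ROLE_TO_KIND = {role: kind for kind, role in KIND_TO_ROLE.items()}


def to_dict(message: Msg) -> dict:
    """Convert a message into a role/content dict (assumed payload shape)."""
    return {"role": KIND_TO_ROLE[message.kind], "content": message.content}


def from_dict(payload: dict) -> Msg:
    """Convert a role/content dict back into a message."""
    return Msg(kind=ROLE_TO_KIND[payload["role"]], content=payload["content"])


if __name__ == "__main__":
    original = Msg(kind="human", content="foo")
    payload = to_dict(original)
    print(payload)
    # A round trip should give back an equal message.
    print(from_dict(payload) == original)
```

Keeping both directions driven by one table means a new role only has to be added in one place, which is one reason the tests check each direction separately.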