community[patch]: Update model client to support vision model in Tong… (#21474)

- **Description:** Tongyi uses different clients for chat models and
vision models. This PR selects the client based on the model name so
that both chat and vision models are supported. See the [Tongyi
documentation](https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.27404c9a7upm11)
for details.

```
from langchain_core.messages import HumanMessage
from langchain_community.chat_models import ChatTongyi

llm = ChatTongyi(model_name='qwen-vl-max')
image_message = {
    "image": "https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png"
}
text_message = {
    "text": "summarize this picture",
}
message = HumanMessage(content=[text_message, image_message])
llm.invoke([message])
```

- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** None
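
The client choice in the description comes down to a substring check on the model name. A minimal sketch of that rule follows, with the dashscope client classes represented by their names so it runs without the package installed. `select_client_name` is a hypothetical helper for illustration and is not part of `ChatTongyi`.

```python
def select_client_name(model_name: str) -> str:
    """Return the name of the dashscope client the PR's rule would pick.

    Hypothetical helper for illustration; the PR assigns the client class
    itself inside ChatTongyi rather than returning a name.
    """
    # Same check as the diff: any model name containing "vl" is treated as
    # a vision model and routed to the multimodal client.
    if "vl" in model_name:
        return "MultiModalConversation"
    return "Generation"


for name in ("qwen-vl-max", "qwen-vl-plus", "qwen-plus"):
    print(name, "->", select_client_name(name))
```

Because the check is a plain substring match, any model name that happens to contain "vl" would also be routed to the multimodal client.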
Pengcheng Liu 2024-05-22 02:58:27 +08:00 committed by GitHub
parent 98b64f3ae3
commit 4cf523949a
3 changed files with 67 additions and 1 deletions


@@ -265,6 +265,50 @@
"ai_message = chatLLM.bind(**llm_kwargs).invoke(messages)\n",
"ai_message"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tongyi With Vision\n",
"Qwen-VL(qwen-vl-plus/qwen-vl-max) are models that can process images."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=[{'text': 'The image presents a flowchart of an artificial intelligence system. The system is divided into two main components: short-term memory and long-term memory, which are connected to the \"Memory\" box.\\n\\nFrom the \"Memory\" box, there are three branches leading to different functionalities:\\n\\n1. \"Tools\" - This branch represents various tools that the AI system can utilize, including \"Calendar()\", \"Calculator()\", \"CodeInterpreter()\", \"Search()\" and others not explicitly listed.\\n\\n2. \"Action\" - This branch represents the action taken by the AI system based on its processing of information. It\\'s connected to both the \"Tools\" and the \"Agent\" boxes.\\n\\n3. \"Planning\" - This branch represents the planning process of the AI system, which involves reflection, self-critics, chain of thoughts, subgoal decomposition, and other processes not shown.\\n\\nThe central component of the system is the \"Agent\" box, which seems to orchestrate the flow of information between the different components. The \"Agent\" interacts with the \"Tools\" and \"Memory\" boxes, suggesting it plays a crucial role in the AI\\'s decision-making process. \\n\\nOverall, the image depicts a complex and interconnected artificial intelligence system, where different components work together to process information, make decisions, and take actions.'}], response_metadata={'model_name': 'qwen-vl-max', 'finish_reason': 'stop', 'request_id': '6a2b9e90-7c3b-960d-8a10-6a0cf9991ae5', 'token_usage': {'input_tokens': 1262, 'output_tokens': 260, 'image_tokens': 1232}}, id='run-fd030661-c734-4580-b977-b77d42680742-0')"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_community.chat_models import ChatTongyi\n",
"from langchain_core.messages import HumanMessage\n",
"\n",
"chatLLM = ChatTongyi(model_name=\"qwen-vl-max\")\n",
"image_message = {\n",
" \"image\": \"https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png\",\n",
"}\n",
"text_message = {\n",
" \"text\": \"summarize this picture\",\n",
"}\n",
"message = HumanMessage(content=[text_message, image_message])\n",
"chatLLM.invoke([message])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {


@@ -281,6 +281,9 @@ class ChatTongyi(BaseChatModel):
                "Please install it with `pip install dashscope --upgrade`."
            )
        try:
            if "vl" in values["model_name"]:
                values["client"] = dashscope.MultiModalConversation
            else:
                values["client"] = dashscope.Generation
        except AttributeError:
            raise ValueError(


@@ -77,6 +77,25 @@ def test_model() -> None:
    assert isinstance(response.content, str)


def test_vision_model() -> None:
    """Test model kwarg works."""
    chat = ChatTongyi(model="qwen-vl-max")  # type: ignore[call-arg]
    response = chat.invoke(
        [
            HumanMessage(
                content=[
                    {
                        "image": "https://python.langchain.com/v0.1/assets/images/run_details-806f6581cd382d4887a5bc3e8ac62569.png"
                    },
                    {"text": "Summarize the image"},
                ]
            )
        ]
    )
    assert isinstance(response, BaseMessage)
    assert isinstance(response.content, list)


def test_functions_call_thoughts() -> None:
    chat = ChatTongyi(model="qwen-plus")  # type: ignore[call-arg]
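
The tests in this hunk assert different content types: `str` for a chat model and `list` for a vision model, and the notebook output in this commit shows that list holding dicts with a `text` key. A minimal sketch of flattening either shape to plain text follows, assuming that structure. `content_to_text` is a hypothetical helper and is not part of LangChain or this PR.

```python
from typing import Any, List, Union


def content_to_text(content: Union[str, List[Any]]) -> str:
    """Flatten message content to a single string.

    Hypothetical helper: assumes list content is made of plain strings or
    dicts carrying a "text" key, as in the notebook output of this commit.
    """
    if isinstance(content, str):
        # Chat models already return plain text.
        return content
    parts = []
    for part in content:
        if isinstance(part, str):
            parts.append(part)
        elif isinstance(part, dict) and "text" in part:
            parts.append(part["text"])
        # Anything else (for example an image entry) is skipped.
    return "".join(parts)


print(content_to_text("plain reply"))
print(content_to_text([{"text": "The image presents "}, {"text": "a flowchart."}]))
```

Calling code that has to handle both `qwen-plus` and `qwen-vl-max` responses could pass `response.content` through a helper like this instead of branching on the model name.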