Here we accept `ToolMessage`s of the form:
```python
ToolMessage(
    content=<representation of screenshot> (see below),
    tool_call_id="abc123",
    additional_kwargs={"type": "computer_call_output"},
)
```
and translate them to `computer_call_output` items for the Responses
API.
We also propagate `reasoning_content` items from AIMessages.
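
The translation described above can be sketched with plain dictionaries. This is a minimal illustration, not the library's actual implementation: the helper name `make_computer_call_output` is hypothetical, and the `call_id` / `output` layout of the resulting item is an assumption based on the two content forms shown in the example below (a list containing an `input_image` block, or a bare data-URL string).

```python
from typing import Any, Union


def make_computer_call_output(
    content: Union[str, list[dict[str, Any]]],
    tool_call_id: str,
) -> dict[str, Any]:
    """Build a `computer_call_output` item from ToolMessage-style fields.

    Hypothetical helper for illustration; the item layout is an assumption.
    """
    if isinstance(content, list):
        # Use the first `input_image` block found in the content list.
        output = next(
            block for block in content if block.get("type") == "input_image"
        )
    else:
        # A bare string is assumed to be the image data URL itself.
        output = {"type": "input_image", "image_url": content}
    return {
        "type": "computer_call_output",
        "call_id": tool_call_id,
        "output": output,
    }


# Both accepted content forms yield the same kind of item.
item = make_computer_call_output("data:image/png;base64,AAAA", "abc123")
```

Accepting either form keeps the `ToolMessage` convenient to construct while the item sent to the API stays uniform.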
## Example
### Load screenshots
```python
import base64


def load_png_as_base64(file_path):
    """Read a PNG file and return its contents as a base64-encoded string."""
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
    return encoded_string.decode("utf-8")


screenshot_1_base64 = load_png_as_base64("/path/to/screenshot/of/application.png")
screenshot_2_base64 = load_png_as_base64("/path/to/screenshot/of/desktop.png")
```
### Initial message and response
```python
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="computer-use-preview",
    model_kwargs={"truncation": "auto"},
)

# Built-in computer-use tool definition.
tool = {
    "type": "computer_use_preview",
    "display_width": 1024,
    "display_height": 768,
    "environment": "browser",
}
llm_with_tools = llm.bind_tools([tool])

# Instruction text plus the first screenshot.
input_message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": (
                "Click the red X to close and reveal my Desktop. "
                "Proceed, no confirmation needed."
            ),
        },
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_1_base64}",
        },
    ]
)

response = llm_with_tools.invoke(
    [input_message],
    reasoning={
        "generate_summary": "concise",
    },
)
response.additional_kwargs["tool_outputs"]
```
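
To make the next step easier to follow, here is a small sketch of how the entries in `tool_outputs` might be inspected. The sample payload is hypothetical: only the `call_id` key is confirmed by the code in this document, and the `type`, `id`, and `action` fields are assumptions about what a computer call item looks like. The helper `list_computer_calls` is likewise an illustrative name, not part of LangChain.

```python
from typing import Any

# Hypothetical payload standing in for response.additional_kwargs["tool_outputs"].
# Only "call_id" is used elsewhere in this document; other keys are assumed.
sample_tool_outputs: list[dict[str, Any]] = [
    {
        "type": "computer_call",
        "id": "cu_example",
        "call_id": "call_example",
        "action": {"type": "click", "button": "left", "x": 100, "y": 200},
    }
]


def list_computer_calls(tool_outputs: list[dict[str, Any]]) -> list[tuple[str, Any]]:
    """Return (call_id, action) pairs for entries that look like computer calls."""
    return [
        (output["call_id"], output.get("action"))
        for output in tool_outputs
        if output.get("type") == "computer_call"
    ]


calls = list_computer_calls(sample_tool_outputs)
```

Each `call_id` is what the follow-up `ToolMessage` must echo back as its `tool_call_id`.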
### Construct ToolMessage
```python
# Echo back the call ID of the computer call being answered.
tool_call_id = response.additional_kwargs["tool_outputs"][0]["call_id"]

tool_message = ToolMessage(
    content=[
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_2_base64}",
        }
    ],
    # content=f"data:image/png;base64,{screenshot_2_base64}",  # <-- also acceptable
    tool_call_id=tool_call_id,
    additional_kwargs={"type": "computer_call_output"},
)
```
### Invoke again
```python
messages = [
    input_message,
    response,
    tool_message,
]

response_2 = llm_with_tools.invoke(
    messages,
    reasoning={
        "generate_summary": "concise",
    },
)
```