langchain/libs/partners/ollama/tests/unit_tests
Copilot d40fd5a3ce
feat(ollama): warn on empty load responses (#32161)
## Problem

When using `ChatOllama` with `create_react_agent`, agents would
sometimes terminate prematurely with empty responses when Ollama
returned `done_reason: 'load'` responses with no content. This caused
agents to return empty `AIMessage` objects instead of actual generated
text.

```python
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage

llm = ChatOllama(model='qwen2.5:7b', temperature=0)
agent = create_react_agent(model=llm, tools=[])

result = agent.invoke(HumanMessage('Hello'), {"configurable": {"thread_id": "1"}})
# Before fix: AIMessage(content='', response_metadata={'done_reason': 'load'})
# Expected: AIMessage with actual generated content
```

## Root Cause

The `_iterate_over_stream` and `_aiterate_over_stream` methods treated
any response with `done: True` as final, regardless of `done_reason`.
When Ollama returns `done_reason: 'load'` with empty content, it
indicates the model was loaded but no actual generation occurred - this
should not be considered a complete response.

## Solution

Modified the streaming logic to skip responses when:
- `done: True`
- `done_reason: 'load'` 
- Content is empty or contains only whitespace

This ensures agents only receive actual generated content while
preserving backward compatibility for load responses that do contain
content.

## Changes

- **`_iterate_over_stream`**: Skip empty load responses instead of
yielding them
- **`_aiterate_over_stream`**: Apply same fix to async streaming
- **Tests**: Added comprehensive test cases covering all edge cases

## Testing

All scenarios now work correctly:
-  Empty load responses are skipped (fixes original issue)
-  Load responses with actual content are preserved (backward
compatibility)
-  Normal stop responses work unchanged
-  Streaming behavior preserved
-  `create_react_agent` integration fixed

Fixes #31482.

<!-- START COPILOT CODING AGENT TIPS -->
---

💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-07-22 13:21:11 -04:00
..
__init__.py
test_chat_models.py feat(ollama): warn on empty load responses (#32161) 2025-07-22 13:21:11 -04:00
test_embeddings.py fix(ollama): num_gpu parameter not working in async OllamaEmbeddings method (#32074) 2025-07-16 18:42:52 -04:00
test_imports.py
test_llms.py ollama[patch]: fix model validation, ensure per-call reasoning can be set, tests (#31927) 2025-07-08 16:39:41 -04:00