Mirror of https://github.com/hwchase17/langchain.git, synced 2025-09-24 12:01:54 +00:00
fix(core): resolve cache validation error by safely converting Generation to ChatGeneration objects (#32156)
## Problem

ChatLiteLLM encounters a `ValidationError` when using the cache on subsequent calls:

```
ValidationError(model='ChatResult', errors=[{'loc': ('generations', 0, 'type'), 'msg': "unexpected value; permitted: 'ChatGeneration'", 'type': 'value_error.const', 'ctx': {'given': 'Generation', 'permitted': ('ChatGeneration',)}}])
```

This occurs because:

1. The cache stores `Generation` objects (with `type="Generation"`).
2. `ChatResult` expects `ChatGeneration` objects (with `type="ChatGeneration"` and a required `message` field).
3. When cached values are retrieved, validation fails due to the type mismatch.

## Solution

Added graceful handling in both the sync (`_generate_with_cache`) and async (`_agenerate_with_cache`) cache methods to:

1. **Detect** when cached values contain `Generation` objects instead of the expected `ChatGeneration` objects.
2. **Convert** them to `ChatGeneration` objects by wrapping the text content in an `AIMessage`.
3. **Preserve** all original metadata (`generation_info`).
4. **Allow** `ChatResult` creation to succeed without validation errors.

## Example

```python
# Before: the second call would fail with ValidationError
from langchain_community.chat_models import ChatLiteLLM
from langchain_community.cache import SQLiteCache
from langchain.globals import set_llm_cache

set_llm_cache(SQLiteCache(database_path="cache.db"))

llm = ChatLiteLLM(model_name="openai/gpt-4o", cache=True, temperature=0)
print(llm.predict("test"))  # Works fine (cache empty)
print(llm.predict("test"))  # Now works instead of raising ValidationError

# After: seamlessly handles both Generation and ChatGeneration objects
```

## Changes

- **`libs/core/langchain_core/language_models/chat_models.py`**:
  - Added the `Generation` import from `langchain_core.outputs`
  - Enhanced the cache retrieval logic in the `_generate_with_cache` and `_agenerate_with_cache` methods
  - Added conversion from `Generation` to `ChatGeneration` objects when needed
- **`libs/core/tests/unit_tests/language_models/chat_models/test_cache.py`**:
  - Added a test case validating that the conversion logic handles mixed object types

## Impact

- **Backward compatible**: existing code continues to work unchanged
- **Minimal change**: only the cache retrieval path is affected; no API changes
- **Robust**: handles both legacy cached `Generation` objects and new `ChatGeneration` objects
- **Preserves data**: all original content and metadata is maintained during conversion

Fixes #22389.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
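For reference, here is a minimal sketch, using only the `langchain_core` objects named in the diff below, of the conversion the Solution section describes (the literal values are made up for illustration):

```python
from langchain_core.messages import AIMessage
from langchain_core.outputs import ChatGeneration, ChatResult, Generation

# A legacy cache entry: a plain Generation, as older cache data may store it.
cached = Generation(text="Hello!", generation_info={"finish_reason": "stop"})

# ChatResult(generations=[cached]) is what previously raised the
# ValidationError, because ChatResult only permits ChatGeneration entries.
# Wrapping the cached text in an AIMessage yields an equivalent
# ChatGeneration while keeping generation_info intact.
converted = ChatGeneration(
    message=AIMessage(content=cached.text),
    generation_info=cached.generation_info,
)
result = ChatResult(generations=[converted])
print(result.generations[0].message.content)  # -> Hello!
```

This is the same transformation that `_convert_cached_generations` applies to each cached entry in the diff below.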
```diff
@@ -11,22 +11,9 @@ from abc import ABC, abstractmethod
 from collections.abc import AsyncIterator, Iterator, Sequence
 from functools import cached_property
 from operator import itemgetter
-from typing import (
-    TYPE_CHECKING,
-    Any,
-    Callable,
-    Literal,
-    Optional,
-    Union,
-    cast,
-)
+from typing import TYPE_CHECKING, Any, Callable, Literal, Optional, Union, cast
 
-from pydantic import (
-    BaseModel,
-    ConfigDict,
-    Field,
-    model_validator,
-)
+from pydantic import BaseModel, ConfigDict, Field, model_validator
 from typing_extensions import override
 
 from langchain_core._api import deprecated
@@ -63,6 +50,7 @@ from langchain_core.outputs import (
     ChatGeneration,
     ChatGenerationChunk,
     ChatResult,
+    Generation,
     LLMResult,
     RunInfo,
 )
@@ -653,6 +641,34 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
     def _combine_llm_outputs(self, llm_outputs: list[Optional[dict]]) -> dict: # noqa: ARG002
         return {}
 
+    def _convert_cached_generations(self, cache_val: list) -> list[ChatGeneration]:
+        """Convert cached Generation objects to ChatGeneration objects.
+
+        Handle case where cache contains Generation objects instead of
+        ChatGeneration objects. This can happen due to serialization/deserialization
+        issues or legacy cache data (see #22389).
+
+        Args:
+            cache_val: List of cached generation objects.
+
+        Returns:
+            List of ChatGeneration objects.
+        """
+        converted_generations = []
+        for gen in cache_val:
+            if isinstance(gen, Generation) and not isinstance(gen, ChatGeneration):
+                # Convert Generation to ChatGeneration by creating AIMessage
+                # from the text content
+                chat_gen = ChatGeneration(
+                    message=AIMessage(content=gen.text),
+                    generation_info=gen.generation_info,
+                )
+                converted_generations.append(chat_gen)
+            else:
+                # Already a ChatGeneration or other expected type
+                converted_generations.append(gen)
+        return converted_generations
+
     def _get_invocation_params(
         self,
         stop: Optional[list[str]] = None,
@@ -1010,7 +1026,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
                 prompt = dumps(messages)
                 cache_val = llm_cache.lookup(prompt, llm_string)
                 if isinstance(cache_val, list):
-                    return ChatResult(generations=cache_val)
+                    converted_generations = self._convert_cached_generations(cache_val)
+                    return ChatResult(generations=converted_generations)
             elif self.cache is None:
                 pass
             else:
@@ -1082,7 +1099,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
                 prompt = dumps(messages)
                 cache_val = await llm_cache.alookup(prompt, llm_string)
                 if isinstance(cache_val, list):
-                    return ChatResult(generations=cache_val)
+                    converted_generations = self._convert_cached_generations(cache_val)
+                    return ChatResult(generations=converted_generations)
             elif self.cache is None:
                 pass
             else:
```
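The test file diff is not shown above, but the Changes section mentions a test covering mixed object types. A rough sketch of what such a test could look like, assuming a fake chat model from `langchain_core` (the test name and assertions here are illustrative, not the actual contents of `test_cache.py`):

```python
from langchain_core.language_models import FakeListChatModel
from langchain_core.messages import AIMessage
from langchain_core.outputs import ChatGeneration, Generation


def test_convert_cached_generations_handles_mixed_types() -> None:
    # Any concrete chat model exposes the new helper; a fake model keeps
    # the test offline and deterministic.
    model = FakeListChatModel(responses=["unused"])
    mixed = [
        Generation(text="legacy cached text"),             # legacy cache entry
        ChatGeneration(message=AIMessage(content="new")),  # already the right type
    ]
    converted = model._convert_cached_generations(mixed)
    assert all(isinstance(gen, ChatGeneration) for gen in converted)
    assert converted[0].message.content == "legacy cached text"
    assert converted[1] is mixed[1]  # ChatGeneration entries pass through untouched
```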