mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-31 20:19:43 +00:00
feat: prune summary buffer (#4004)
If the library user has to decrease the `max_token_limit`, they would probably want to prune the summary buffer even though they haven't added any new messages. Personally, I need it because I want to serialise the memory buffer object and save it to a database, and when I load it, I may have re-configured my code to have a shorter memory to save on tokens.
This commit is contained in:
parent
bf726f9d8a
commit
bd277b5327
@ -57,7 +57,10 @@ class ConversationSummaryBufferMemory(BaseChatMemory, SummarizerMixin):
|
||||
def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
    """Record one conversation turn in memory, then enforce the token budget.

    Delegates the actual storage of ``inputs``/``outputs`` to the parent
    class, and afterwards trims the buffer so it never exceeds
    ``max_token_limit``.
    """
    super().save_context(inputs, outputs)
    # Keep the buffer within the configured token limit after every write.
    self.prune()
|
||||
|
||||
def prune(self) -> None:
|
||||
"""Prune buffer if it exceeds max token limit"""
|
||||
buffer = self.chat_memory.messages
|
||||
curr_buffer_length = self.llm.get_num_tokens_from_messages(buffer)
|
||||
if curr_buffer_length > self.max_token_limit:
|
||||
|
Loading…
Reference in New Issue
Block a user