perf[core]: remove generations summation from hot loop (#31231)

1. Removes summation of `ChatGenerationChunk` objects from the hot loops
in `stream` and `astream`
2. Removes run ID generation from the loop as well (minor impact)
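
As a rough illustration of item 1, the sketch below contrasts summing inside the loop with collecting chunks and merging once afterwards. `TextChunk`, `merge_chunks`, `consume_summing`, and `consume_collecting` are hypothetical stand-ins written for this example, not `langchain_core` APIs; only the shape of the merge helper is modeled on `merge_chat_generation_chunks` from this commit.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TextChunk:
    """Hypothetical stand-in for ChatGenerationChunk; it only carries text."""

    text: str

    def __add__(self, other: TextChunk | list[TextChunk]) -> TextChunk:
        # Accept a single chunk or a list of chunks, so that a whole tail
        # can be merged in one call instead of one call per chunk.
        if isinstance(other, TextChunk):
            return TextChunk(self.text + other.text)
        if isinstance(other, list):
            return TextChunk(self.text + "".join(c.text for c in other))
        return NotImplemented


def merge_chunks(chunks: list[TextChunk]) -> Optional[TextChunk]:
    """Merge a list of chunks once (same shape as the helper in this commit)."""
    if not chunks:
        return None
    if len(chunks) == 1:
        return chunks[0]
    return chunks[0] + chunks[1:]


def consume_summing(source: Iterable[TextChunk]) -> Optional[TextChunk]:
    """Old shape: build a new running total on every iteration."""
    total: Optional[TextChunk] = None
    for chunk in source:
        total = chunk if total is None else total + chunk
    return total


def consume_collecting(source: Iterable[TextChunk]) -> Optional[TextChunk]:
    """New shape: the loop only appends; merging happens once at the end."""
    collected: list[TextChunk] = []
    for chunk in source:
        collected.append(chunk)
    return merge_chunks(collected)
```

Both functions return the same merged chunk. The second keeps the per-iteration work to a list append, which is the kind of saving the numbers below reflect.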

Again, benchmarked by processing ~200k chunks (a poem about broccoli).

Before: ~4.2s

The blue circle marks the time spent summing generation chunks.

<img width="1345" alt="Screenshot 2025-05-14 at 7 48 33 AM"
src="https://github.com/user-attachments/assets/08a59d78-134d-4cd3-9d54-214de689df51"
/>

After: ~2.3s

The blue circle marks the remaining time spent adding chunks, which can
be reduced in a future PR by optimizing the `merge_content`,
`merge_dicts`, and `merge_lists` utilities.

<img width="1353" alt="Screenshot 2025-05-14 at 7 50 08 AM"
src="https://github.com/user-attachments/assets/df6b3506-929e-4b6d-b198-7c4e992c6d34"
/>
Sydney Runkle
2025-05-14 08:13:05 -07:00
committed by GitHub
parent 17b799860f
commit 263c215112
2 changed files with 42 additions and 22 deletions


@@ -2,17 +2,15 @@
 from __future__ import annotations
-from typing import TYPE_CHECKING, Literal, Union
+from typing import Literal, Union
 from pydantic import model_validator
+from typing_extensions import Self
 from langchain_core.messages import BaseMessage, BaseMessageChunk
 from langchain_core.outputs.generation import Generation
 from langchain_core.utils._merge import merge_dicts
-if TYPE_CHECKING:
-    from typing_extensions import Self
 class ChatGeneration(Generation):
     """A single chat generation output.
@@ -115,3 +113,16 @@ class ChatGenerationChunk(ChatGeneration):
             )
         msg = f"unsupported operand type(s) for +: '{type(self)}' and '{type(other)}'"
         raise TypeError(msg)
+def merge_chat_generation_chunks(
+    chunks: list[ChatGenerationChunk],
+) -> Union[ChatGenerationChunk, None]:
+    """Merge a list of ChatGenerationChunks into a single ChatGenerationChunk."""
+    if not chunks:
+        return None
+    if len(chunks) == 1:
+        return chunks[0]
+    return chunks[0] + chunks[1:]