From c863b92b9eb1a8cc437a989f9bb7a630449aa310 Mon Sep 17 00:00:00 2001
From: Mason Daugherty
Date: Mon, 29 Jun 2026 01:21:12 -0400
Subject: [PATCH] docs(fireworks): clarify prompt-cache session affinity guidance (#38522)

Clarifies the Fireworks chat model documentation around prompt-cache session affinity. The example now focuses on the supported `x-session-affinity` header and presents `prompt_cache_key` as the typed SDK alternative without mixing in multi-turn trajectory guidance.

## Changes

- Tightened the `extra_headers` example so prompt-cache reuse is explained through `x-session-affinity` only.
- Clarified that `prompt_cache_key` is the preferred typed alternative to passing the raw session-affinity header.

AI-agent assistance was used in preparing this contribution.
---
 .../langchain_fireworks/chat_models.py | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/libs/partners/fireworks/langchain_fireworks/chat_models.py b/libs/partners/fireworks/langchain_fireworks/chat_models.py
index 7a4662c66ec..ee94e9ed7a8 100644
--- a/libs/partners/fireworks/langchain_fireworks/chat_models.py
+++ b/libs/partners/fireworks/langchain_fireworks/chat_models.py
@@ -667,24 +667,20 @@ class ChatFireworks(BaseChatModel):
     model = ChatFireworks(model_name="accounts/fireworks/models/gpt-oss-120b")
     ```
 
-    Fireworks request headers can be passed with `extra_headers`, including
-    session-affinity headers for prompt caching and multi-turn trajectories.
-    `x-session-affinity` pins requests to a replica for prompt-cache reuse,
-    while `x-multi-turn-session-id` groups the turns of a single trajectory:
+    Fireworks request headers can be passed with `extra_headers`. For prompt
+    caching, `x-session-affinity` pins requests to a replica so related calls can
+    reuse the same prompt-cache session:
 
     ```python
     model.invoke(
         "Hello",
-        extra_headers={
-            "x-session-affinity": "user-42",
-            "x-multi-turn-session-id": "thread-123",
-        },
+        extra_headers={"x-session-affinity": "user-42"},
     )
     ```
 
-    For prompt-cache session affinity, the Fireworks SDK also accepts a typed
-    `prompt_cache_key` field (passed as a regular keyword argument), which it
-    treats as the preferred alternative to the raw `x-session-affinity` header:
+    The Fireworks SDK also accepts a typed `prompt_cache_key` field (passed as a
+    regular keyword argument), which it treats as the preferred alternative to
+    the raw `x-session-affinity` header:
 
     ```python
     model.invoke("Hello", prompt_cache_key="user-42")