`_sanitize_chat_completions_content` now filters every content-part dict
against an allowlist derived at import time from the `fireworks-ai`
SDK's `ContentUnionMember1` TypedDict, and runs on every message role —
not just `ToolMessage`. Fixes 400s of the form `Extra inputs are not
permitted, field:
'messages[N].content.list[ChatMessageContent][i].<key>'` when
cross-provider history (e.g. an Anthropic-shaped `AIMessage` carrying
the v1 streaming-reassembly `index` marker) is forwarded to a
Fireworks-hosted model.
Closes#37172
---
Bumps `langchain-fireworks` to the rewritten `fireworks-ai` 1.x SDK
(currently 1.2.0a*; Stainless-generated, pure-httpx, no
`grpcio`/`protobuf`/`googleapis-common-protos`).
The motivating bug is a startup crash in self-hosted LangGraph
environments that also import `langchain-google-vertexai`. Importing
`fireworks-ai` 0.19.x eagerly loads vendored grpcio protobuf modules
under `fireworks.control_plane.generated.protos_grpcio.*`, which
register `google/rpc/status.proto`, `google/api/*.proto`, and
`google/longrunning/*.proto` in the default protobuf descriptor pool.
When `langchain-google-vertexai` later triggers
`google.api_core.exceptions` → `grpc_status.rpc_status` →
`google.rpc.status_pb2`, the pool already holds a byte-different
descriptor for `google/rpc/status.proto` and startup dies with:
```
TypeError: Couldn't build proto file into descriptor pool:
duplicate file name google/rpc/status.proto
```
Fleet has been pinning around this by routing Fireworks through
`ChatOpenAI` against the OpenAI-compat endpoint, which works for
inference but means Fireworks `ModelProfile` data never loads — so Kimi
K2.6's ~262k context window goes unrecognized and summarization triggers
below limit.
The 1.x SDK does not vendor protobuf at all. The control-plane gRPC code
path is gone; chat inference goes over httpx. Verified locally that
`import langchain_fireworks` and `from langchain_fireworks import
ChatFireworks` load zero `_pb2` / `google.*` modules.
## What changed in `ChatFireworks`
- Imports switch from `fireworks.client` to the top-level `fireworks`
package.
- Async path now `await client.chat.completions.create(...)`; the 0.x
`acreate` shim is no longer used.
- Error classes remapped to the 1.x hierarchy. `InvalidRequestError` →
`BadRequestError`. `BadGatewayError` and `ServiceUnavailableError` no
longer exist (1.x maps all `>=500` to `InternalServerError`) and were
dropped from the retryable set with no loss of coverage.
`FireworksContextOverflowError`'s parent class becomes
`BadRequestError`.
- `stream_options` is moved into the SDK's `extra_body` because the
Stainless-generated `create()` signature does not model it as a typed
kwarg. Top-level `stream_options` is preserved as a caller convenience;
if a caller supplies both `extra_body["stream_options"]` and a top-level
value, `extra_body` wins and the discarded value is logged.
- The 0.x `(connect, read)` tuple form of `request_timeout` is
normalized to an `httpx.Timeout` so existing user code keeps working.
- The SDK's built-in retry layer is suppressed via `max_retries=0` on
client construction so retries remain owned by
`create_base_retry_decorator` and surface through the LangChain
`run_manager`.
## Lifecycle methods
Adds `close()` and `aclose()` on `ChatFireworks`. The 1.x
`AsyncFireworks` client defaults to `httpx_aiohttp.HttpxAiohttpClient`,
whose underlying aiohttp `ClientSession` is created lazily on first
request. Sync-only paths therefore never open a session — which fixes
the "Unclosed client session" warnings from #37172 at the source.
Callers using async paths can now release the connector
deterministically rather than relying on GC after the event loop has
stopped. An autouse fixture in the integration `conftest.py` calls
`aclose()` between tests to silence the corresponding `Unclosed
connector` warning that surfaces under `pytest-asyncio`.
## Relation to #37227
Supersedes #37227. That PR monkey-patched
`fireworks._util.is_running_in_async_context` and
`fireworks.client.api_client.is_running_in_async_context` to suppress
the 0.x SDK's eager `aiohttp.ClientSession` creation in async contexts.
Both module paths are removed in 1.x; the SDK's lazy-session behavior
makes the suppression unnecessary, and the explicit `aclose()` provides
the cleaner long-term lifecycle hook. Thanks to @keenborder786 for
surfacing the failure mode.
## Installation note
`fireworks-ai` 1.x is currently published as an alpha (`1.2.0a*`); a
stable 1.x is not yet out. `pip install langchain-fireworks` / `uv pip
install langchain-fireworks` will need `--pre` (or `--prerelease=allow`)
until Fireworks GAs 1.x. The `pyproject.toml` adds `[tool.uv] prerelease
= "allow"` so the in-repo dev environment resolves cleanly. The package
version is bumped to `1.4.0` — the public surface (`ChatFireworks`,
`Fireworks`, `FireworksEmbeddings`) is unchanged; the breakage is
confined to internal error classes and the transitive SDK.
Fireworks's chat completions endpoint rejects unknown fields on tool
message content blocks — specifically the `id` key that LangChain
auto-generates on `TextContentBlock`. Add
`_sanitize_chat_completions_content` to strip those extra keys before
the payload hits the wire, preventing `Extra inputs are not permitted`
errors on tool message round-trips.
Add a `service_tier` init kwarg to `ChatFireworks`, mirroring the field
on `ChatOpenAI`. Forwards to the Fireworks chat completions API when
set, and echoes the response's tier back onto `response_metadata` and
`llm_output` so callbacks and consumers can read what the server
actually applied.
## Summary
`langchain_fireworks._convert_message_to_dict` ships LangChain canonical
v0/v1 multimodal content blocks (e.g. `{"type": "image", "base64": ...,
"mime_type": ...}`) on the wire unchanged. Fireworks' OpenAI-compatible
chat completions API rejects the unknown `base64`/`mime_type` keys and
the list shape on roles that expect a string, returning HTTP 422 — so
any image upload, including via tools that return image content blocks,
fails for Kimi K2.6 and other Fireworks vision models.
This change mirrors
`langchain_openai.chat_models.base._format_message_content`:
- Walk `content` blocks.
- Drop block types the chat-completions wire doesn't carry (`tool_use`,
`thinking`, `reasoning_content`, `function_call`,
`code_interpreter_call`).
- Detect v0/v1 multimodal data blocks via
`langchain_core.messages.is_data_content_block`, and translate them via
`convert_to_openai_data_block(..., api="chat/completions")`.
- Strings and non-list content pass through unchanged.
Applied in the `ChatMessage`, `HumanMessage`, `SystemMessage`, and
`ToolMessage` paths of `_convert_message_to_dict`. `AIMessage` already
routes through `_convert_from_v1_to_chat_completions` for v1 output and
assistant content is text-only on the way out, so it is left untouched.
## Why this approach
Fireworks is OpenAI-compatible. The canonical → OpenAI translator
already exists in `langchain_core.messages.block_translators.openai` and
is the same one `langchain-openai` uses. Reusing it (rather than
inventing a Fireworks-specific translator) gives:
- v0 (`source_type`-based) and v1 (`base64`/`url`-based) data block
coverage for free.
- Consistent behavior with `langchain-openai` for image, file, and any
future canonical data block.
- A small, focused diff (≈30 lines of new code, plus tests).
## Test plan
- [x] `make test` passes (64/64 unit tests, including 9 new ones for the
new helper and translation paths).
- [x] `make lint` passes (ruff check, ruff format, mypy, lint_imports).
- [ ] End-to-end: image upload to a Kimi K2.6 (Fireworks) agent
translates to `{"type": "image_url", "image_url": {"url":
"data:image/png;base64,..."}}` on the wire and the model returns a
coherent description (validated locally against
`langchain-fireworks==1.0.0` site-packages with the same patch).
---------
Co-authored-by: murugand23 <murugand23@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
`ChatFireworks.max_retries` silently did nothing. The old code assigned
the value to a `ChatCompletionV2` sub-object rather than the base
client, and the pinned Fireworks SDK (0.13.0–0.19.20) never honors its
own `_max_retries` attribute on the base client either. Since the
Stainless-generated 1.x SDK that does implement retries is still
pre-release (1.0.1a63 at time of writing), retry responsibility is
ported to the LangChain side until the pin can be bumped.
Populate `usage_metadata` on streaming responses. Newer Fireworks models
(e.g. Kimi K2 slugs) require an explicit
`stream_options.include_usage=True` opt-in and return token counts in a
final empty-`choices` chunk; the chunk was previously `continue`-d past,
so streaming usage silently came back as `None`.
Add a `model` property to `ChatFireworks`, `ChatGroq`, and
`ChatOpenRouter` that returns `model_name`. These partners use
Pydantic's `Field(alias="model")` on `model_name`, which means
`instance.model` doesn't work as a read accessor after construction — it
raises an `AttributeError` or returns the field descriptor. `ChatOpenAI`
already has this property; this brings the remaining in-repo partners to
parity.
Extract strict from kwargs and pass it to convert_to_openai_tool when
converting tools. This ensures that when strict is provided, it's
properly used during tool conversion and removed from kwargs before
calling the parent bind method.
Also extract reasoning_content from API responses and store it in
additional_kwargs for AIMessage objects.
Fixes https://github.com/langchain-ai/langchain/issues/34341 and
https://github.com/langchain-ai/langchain/issues/34342
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
`mixtral-8x-7b-instruct` was recently retired from Fireworks Serverless.
Here we remove the default model altogether, so that the model must be
explicitly specified on init:
```python
ChatFireworks(model="accounts/fireworks/models/llama-v3p1-70b-instruct") # for example
```
We also set a null default for `temperature`, which previously defaulted
to 0.0. This parameter will no longer be included in request payloads
unless it is explicitly provided.
Thank you for contributing to LangChain!
**Description**
The current code snippet for `Fireworks` had incorrect parameters. This
PR fixes those parameters.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Refactor standard test classes to make them easier to configure
- Update openai to support stop_sequences init param
- Update groq to support stop_sequences init param
- Update fireworks to support max_retries init param
- Update ChatModel.bind_tools to type tool_choice
- Update groq to handle tool_choice="any". **this may be controversial**
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>