Commit Graph

1 Commits

Author SHA1 Message Date
Mason Daugherty
d39950cb18 feat(fireworks): migrate to fireworks-ai 1.x SDK (#37581)
Closes #37172

---

Bumps `langchain-fireworks` to the rewritten `fireworks-ai` 1.x SDK
(currently 1.2.0a*; Stainless-generated, pure-httpx, no
`grpcio`/`protobuf`/`googleapis-common-protos`).

The motivating bug is a startup crash in self-hosted LangGraph
environments that also import `langchain-google-vertexai`. Importing
`fireworks-ai` 0.19.x eagerly loads vendored grpcio protobuf modules
under `fireworks.control_plane.generated.protos_grpcio.*`, which
register `google/rpc/status.proto`, `google/api/*.proto`, and
`google/longrunning/*.proto` in the default protobuf descriptor pool.
When `langchain-google-vertexai` later triggers
`google.api_core.exceptions` → `grpc_status.rpc_status` →
`google.rpc.status_pb2`, the pool already holds a byte-different
descriptor for `google/rpc/status.proto` and startup dies with:

```
TypeError: Couldn't build proto file into descriptor pool:
duplicate file name google/rpc/status.proto
```

Fleet has been pinning around this by routing Fireworks through
`ChatOpenAI` against the OpenAI-compat endpoint, which works for
inference but means Fireworks `ModelProfile` data never loads — so Kimi
K2.6's ~262k context window goes unrecognized and summarization triggers
below limit.

The 1.x SDK does not vendor protobuf at all. The control-plane gRPC code
path is gone; chat inference goes over httpx. Verified locally that
`import langchain_fireworks` and `from langchain_fireworks import
ChatFireworks` load zero `_pb2` / `google.*` modules.

## What changed in `ChatFireworks`

- Imports switch from `fireworks.client` to the top-level `fireworks`
package.
- Async path now `await client.chat.completions.create(...)`; the 0.x
`acreate` shim is no longer used.
- Error classes remapped to the 1.x hierarchy. `InvalidRequestError` →
`BadRequestError`. `BadGatewayError` and `ServiceUnavailableError` no
longer exist (1.x maps all `>=500` to `InternalServerError`) and were
dropped from the retryable set with no loss of coverage.
`FireworksContextOverflowError`'s parent class becomes
`BadRequestError`.
- `stream_options` is moved into the SDK's `extra_body` because the
Stainless-generated `create()` signature does not model it as a typed
kwarg. Top-level `stream_options` is preserved as a caller convenience;
if a caller supplies both `extra_body["stream_options"]` and a top-level
value, `extra_body` wins and the discarded value is logged.
- The 0.x `(connect, read)` tuple form of `request_timeout` is
normalized to an `httpx.Timeout` so existing user code keeps working.
- The SDK's built-in retry layer is suppressed via `max_retries=0` on
client construction so retries remain owned by
`create_base_retry_decorator` and surface through the LangChain
`run_manager`.

## Lifecycle methods

Adds `close()` and `aclose()` on `ChatFireworks`. The 1.x
`AsyncFireworks` client defaults to `httpx_aiohttp.HttpxAiohttpClient`,
whose underlying aiohttp `ClientSession` is created lazily on first
request. Sync-only paths therefore never open a session — which fixes
the "Unclosed client session" warnings from #37172 at the source.
Callers using async paths can now release the connector
deterministically rather than relying on GC after the event loop has
stopped. An autouse fixture in the integration `conftest.py` calls
`aclose()` between tests to silence the corresponding `Unclosed
connector` warning that surfaces under `pytest-asyncio`.

## Relation to #37227

Supersedes #37227. That PR monkey-patched
`fireworks._util.is_running_in_async_context` and
`fireworks.client.api_client.is_running_in_async_context` to suppress
the 0.x SDK's eager `aiohttp.ClientSession` creation in async contexts.
Both module paths are removed in 1.x; the SDK's lazy-session behavior
makes the suppression unnecessary, and the explicit `aclose()` provides
the cleaner long-term lifecycle hook. Thanks to @keenborder786 for
surfacing the failure mode.

## Installation note

`fireworks-ai` 1.x is currently published as an alpha (`1.2.0a*`); a
stable 1.x is not yet out. `pip install langchain-fireworks` / `uv pip
install langchain-fireworks` will need `--pre` (or `--prerelease=allow`)
until Fireworks GAs 1.x. The `pyproject.toml` adds `[tool.uv] prerelease
= "allow"` so the in-repo dev environment resolves cleanly. The package
version is bumped to `1.4.0` — the public surface (`ChatFireworks`,
`Fireworks`, `FireworksEmbeddings`) is unchanged; the breakage is
confined to internal error classes and the transitive SDK.
2026-05-20 16:39:01 -05:00