feat(perplexity): use_responses_api flag on ChatPerplexity (#37359)

Closes #37360

Adds a `use_responses_api` flag to `ChatPerplexity` so requests can be
routed through Perplexity's Agent API (the Perplexity-flavored Responses
API) in addition to the existing Chat Completions endpoint. This mirrors
the `use_responses_api` flag on `ChatOpenAI`.

## Motivation

Perplexity exposes two HTTP surfaces from the same SDK client object:
`client.chat.completions.create()` (Chat Completions) and
`client.responses.create()` (Agent API, OpenAI-compatible Responses
shape). The Agent API supports built-in tools (`web_search`,
`fetch_url`, `finance_search`, `people_search`), `instructions`,
`input`, `previous_response_id`, and `include` — none of which exist on
Chat Completions. Today `ChatPerplexity` only calls Chat Completions, so
users who want the Agent API have to drop down to the raw SDK.

## What this changes

- New field `use_responses_api: bool | None = None` on `ChatPerplexity`.
- New module-level helper `_use_responses_api(payload)` that returns
`True` when the payload contains a built-in tool (any `tools[*]` whose
`type` is not `"function"`) or any of the Responses-only fields
`previous_response_id`, `instructions`, `input`, `include`.
- New instance method `ChatPerplexity._use_responses_api(payload)` that
honors `self.use_responses_api` when it is a `bool`, otherwise delegates
to the module helper.
- New converters `_convert_responses_to_chat_result(response)` and
`_convert_responses_stream_event_to_chunk(event)` that translate Agent
API objects/events into `AIMessage` and `AIMessageChunk` (preserving
`usage_metadata`, `response_metadata`, citations, images, related
questions, search results, and `function_call` tool calls).
- A surgical `_to_responses_payload(...)` helper that renames `messages`
→ `input` and `max_tokens` → `max_output_tokens`, passes through
Responses-supported fields, and parks anything Perplexity-specific under
`extra_body`.
- Each of the four API call sites (`_stream`, `_astream`, `_generate`,
`_agenerate`) now branches on `self._use_responses_api(payload)`. The
Chat Completions path is untouched.
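
As an illustration of the payload translation described above, here is a minimal standalone sketch. It is not the code in this commit: the function name `to_responses_payload`, the `PASSTHROUGH` allowlist, and the `some_perplexity_option` key are all hypothetical, and it does not model preset-driven behavior such as dropping `model`.

```python
from typing import Any

# Renames named in the description above.
RENAMES = {"messages": "input", "max_tokens": "max_output_tokens"}

# Hypothetical allowlist of fields passed through unchanged; the real set
# used by the commit is not listed in this message.
PASSTHROUGH = {
    "model",
    "tools",
    "input",
    "instructions",
    "previous_response_id",
    "include",
    "stream",
}


def to_responses_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Translate a Chat Completions style payload into a Responses style one."""
    out: dict[str, Any] = {}
    extra: dict[str, Any] = {}
    for key, value in payload.items():
        if key in RENAMES:
            out[RENAMES[key]] = value
        elif key in PASSTHROUGH:
            out[key] = value
        else:
            # Anything unrecognized is treated as provider-specific.
            extra[key] = value
    if extra:
        out["extra_body"] = extra
    return out
```

Under these assumptions, a payload with `messages`, `max_tokens`, and one unrecognized key comes back with `input`, `max_output_tokens`, and that key nested under `extra_body`.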

## Auto-detection rules

When `use_responses_api` is unset (the default), routing is decided per
call from the outgoing payload:

- Has a built-in tool? → Responses
- Has `previous_response_id`, `instructions`, `input`, or `include`? →
Responses
- Otherwise → Chat Completions

Explicit `use_responses_api=True` or `=False` always overrides
auto-detection.
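
The rules above can be sketched as a pair of small functions. This is an illustrative reading of the description, not the committed implementation: the names `payload_needs_responses` and `route` are hypothetical, and the sketch assumes that a Responses-only field counts as present whenever its key is in the payload and that a tool without a `type` key is not treated as built-in.

```python
from typing import Any, Optional

# Field names taken from the description above.
RESPONSES_ONLY_FIELDS = ("previous_response_id", "instructions", "input", "include")


def payload_needs_responses(payload: dict[str, Any]) -> bool:
    """Return True if the payload uses a Responses-only feature."""
    tools = payload.get("tools") or []
    # A built-in tool is any tool whose `type` is not "function".
    for tool in tools:
        if isinstance(tool, dict) and "type" in tool and tool["type"] != "function":
            return True
    return any(field in payload for field in RESPONSES_ONLY_FIELDS)


def route(payload: dict[str, Any], use_responses_api: Optional[bool] = None) -> str:
    """An explicit bool wins; None falls back to auto-detection."""
    if isinstance(use_responses_api, bool):
        return "responses" if use_responses_api else "chat_completions"
    if payload_needs_responses(payload):
        return "responses"
    return "chat_completions"
```

Keeping the flag tri-state (`True`, `False`, `None`) is what lets an explicit `False` force Chat Completions even when the payload carries a built-in tool.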

## Backwards compatibility

Existing usage is unchanged.
`ChatPerplexity(model="sonar").invoke("hi")` still calls
`client.chat.completions.create()`. No public field was renamed or
removed; the new field is purely additive.

## Tests

Adds `tests/unit_tests/test_chat_models_responses.py` covering the
helper, auto-detect routing, explicit overrides in both directions,
response-to-`AIMessage` conversion (content, `usage_metadata`,
`response_metadata.id`), `function_call` → `tool_calls` conversion, and
sync + async streaming of `response.output_text.delta` and
`response.completed` events. All mocks use `MagicMock`/`AsyncMock`; no
network calls.
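
For readers unfamiliar with this testing style, the following sketch shows how a `MagicMock` client can verify routing without network access. `FakeChat` is a hypothetical stand-in, not `ChatPerplexity`; it models only the explicit-flag branch, not auto-detection, and the test names are illustrative rather than taken from the new test file.

```python
from unittest.mock import MagicMock


class FakeChat:
    """Hypothetical stand-in that models only the routing branch."""

    def __init__(self, client, use_responses_api=None):
        self.client = client
        self.use_responses_api = use_responses_api

    def generate(self, payload):
        if self.use_responses_api:
            return self.client.responses.create(**payload)
        return self.client.chat.completions.create(**payload)


def test_explicit_true_routes_to_responses():
    client = MagicMock()
    FakeChat(client, use_responses_api=True).generate({"input": "hi"})
    client.responses.create.assert_called_once_with(input="hi")
    client.chat.completions.create.assert_not_called()


def test_default_routes_to_chat_completions():
    client = MagicMock()
    FakeChat(client).generate({"messages": []})
    client.chat.completions.create.assert_called_once_with(messages=[])
    client.responses.create.assert_not_called()
```

Because `MagicMock` creates attributes on access, `client.responses.create` and `client.chat.completions.create` need no setup, and each records its own calls independently.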

## Notes for reviewers

This was implemented with help from an AI agent. The shape mirrors
`langchain-openai`'s `use_responses_api` — same field name, same helper
name, same docstring style — so the diff should be familiar.

Closes nothing — net new feature.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
This commit is contained in:
James Liounis
2026-05-26 20:17:37 -04:00
committed by GitHub
parent 7bb4130c7d
commit f9be3cc328
3 changed files with 1757 additions and 0 deletions


@@ -100,6 +100,54 @@ class TestChatPerplexityIntegration:
        if citations := response.additional_kwargs.get("citations"):
            assert any("wikipedia.org" in c for c in citations)

    def test_responses_api_with_web_search(self) -> None:
        """Hit the real Agent (Responses) API with a built-in tool."""
        # The Agent API requires a `preset` or `provider/model` format; bare
        # Chat-Completions names like `sonar-pro` are rejected. Use a preset
        # and let the `model` field get dropped by `_to_responses_payload`.
        # `temperature` is intentionally omitted: the Responses API does not
        # accept it, and supplying it would emit a per-call WARNING log.
        chat = ChatPerplexity(model="sonar-pro", use_responses_api=True)
        response = chat.invoke(
            "What is the capital of France?",
            tools=[{"type": "web_search"}],
            preset="pro-search",
        )
        assert isinstance(response.content, str)
        assert response.content
        if response.usage_metadata is not None:
            assert response.usage_metadata["input_tokens"] >= 0
            assert response.usage_metadata["output_tokens"] >= 0

    async def test_responses_api_async_with_web_search(self) -> None:
        """Hit the real Agent API asynchronously to cover `ainvoke`."""
        chat = ChatPerplexity(model="sonar-pro", use_responses_api=True)
        response = await chat.ainvoke(
            "What is the capital of France?",
            tools=[{"type": "web_search"}],
            preset="pro-search",
        )
        assert isinstance(response.content, str)
        assert response.content

    def test_responses_api_streaming_surfaces_citations(self) -> None:
        """Stream the real Agent API and verify citations surface on chunks."""
        chat = ChatPerplexity(model="sonar-pro", use_responses_api=True)
        chunks = list(
            chat.stream(
                "Who is the CEO of OpenAI?",
                tools=[{"type": "web_search"}],
                preset="pro-search",
            )
        )
        assert chunks
        full_content = "".join(c.content for c in chunks if isinstance(c.content, str))
        assert full_content
        # Citations, when returned, must land on additional_kwargs (not
        # response_metadata) to match the Chat Completions path.
        for chunk in chunks:
            assert "citations" not in chunk.response_metadata

    def test_media_and_metadata(self) -> None:
        """Test related questions and images."""
        chat = ChatPerplexity(

File diff suppressed because it is too large