feat(perplexity): use_responses_api flag on ChatPerplexity (#37359)

Closes #37360

Adds a `use_responses_api` flag to `ChatPerplexity` so requests can be routed through Perplexity's Agent API (the Perplexity-flavored Responses API) in addition to the existing Chat Completions endpoint. This mirrors the `use_responses_api` flag on `ChatOpenAI`.

## Motivation

Perplexity exposes two HTTP surfaces from the same SDK client object: `client.chat.completions.create()` (Chat Completions) and `client.responses.create()` (Agent API, OpenAI-compatible Responses shape). The Agent API supports built-in tools (`web_search`, `fetch_url`, `finance_search`, `people_search`), `instructions`, `input`, `previous_response_id`, and `include`, none of which exist on Chat Completions. Today `ChatPerplexity` only calls Chat Completions, so users who want the Agent API have to drop down to the raw SDK.

## What this changes

- New field `use_responses_api: bool | None = None` on `ChatPerplexity`.
- New module-level helper `_use_responses_api(payload)` that returns `True` when the payload contains a built-in tool (any `tools[*]` whose `type` is not `"function"`) or any of the Responses-only fields `previous_response_id`, `instructions`, `input`, `include`.
- New instance method `ChatPerplexity._use_responses_api(payload)` that honors `self.use_responses_api` when it is a `bool`, otherwise delegates to the module helper.
- New converters `_convert_responses_to_chat_result(response)` and `_convert_responses_stream_event_to_chunk(event)` that translate Agent API objects/events into `AIMessage` and `AIMessageChunk` (preserving `usage_metadata`, `response_metadata`, citations, images, related questions, search results, and `function_call` tool calls).
- A surgical `_to_responses_payload(...)` helper that renames `messages` → `input` and `max_tokens` → `max_output_tokens`, passes through Responses-supported fields, and parks anything Perplexity-specific under `extra_body`.
- Each of the four API call sites (`_stream`, `_astream`, `_generate`, `_agenerate`) now branches on `self._use_responses_api(payload)`. The Chat Completions path is untouched.

## Auto-detection rules

When `use_responses_api` is unset (the default), routing is decided per call from the outgoing payload:

- Has a built-in tool? → Responses
- Has `previous_response_id`, `instructions`, `input`, or `include`? → Responses
- Otherwise → Chat Completions

Explicit `use_responses_api=True` or `=False` always overrides auto-detection.

## Backwards compatibility

Existing usage is unchanged. `ChatPerplexity(model="sonar").invoke("hi")` still calls `client.chat.completions.create()`. No public field was renamed or removed; the new field is purely additive.

## Tests

Adds `tests/unit_tests/test_chat_models_responses.py` covering the helper, auto-detect routing, explicit overrides in both directions, response-to-`AIMessage` conversion (content, `usage_metadata`, `response_metadata.id`), `function_call` → `tool_calls` conversion, and sync + async streaming of `response.output_text.delta` and `response.completed` events. All mocks use `MagicMock`/`AsyncMock`; no network calls.

## Notes for reviewers

This was implemented with help from an AI agent. The shape mirrors `langchain-openai`'s `use_responses_api` (same field name, same helper name, same docstring style), so the diff should be familiar.

Closes nothing; net new feature.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
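
The auto-detection rules described in the commit message can be sketched roughly as follows. This is a standalone illustration, not the code from this commit: the function names `payload_needs_responses_api` and `resolve_route` are hypothetical stand-ins for the module-level helper and the instance method, and treating a tool entry with no `type` key as a built-in tool is an assumption the message does not confirm.

```python
from __future__ import annotations

from typing import Any

# Responses-only field names, as listed in the commit message.
_RESPONSES_ONLY_FIELDS = ("previous_response_id", "instructions", "input", "include")


def payload_needs_responses_api(payload: dict[str, Any]) -> bool:
    """Hypothetical helper: True if the payload uses an Agent-API-only feature."""
    for tool in payload.get("tools") or []:
        # Any tool whose type is not "function" counts as a built-in tool.
        # Assumption: a tool dict without a "type" key is also treated as built-in.
        if isinstance(tool, dict) and tool.get("type") != "function":
            return True
    return any(field in payload for field in _RESPONSES_ONLY_FIELDS)


def resolve_route(explicit: bool | None, payload: dict[str, Any]) -> bool:
    """Hypothetical helper: an explicit bool wins, otherwise auto-detect."""
    if isinstance(explicit, bool):
        return explicit
    return payload_needs_responses_api(payload)
```

Under these assumptions, `resolve_route(None, {"tools": [{"type": "web_search"}]})` selects the Responses path, while `resolve_route(False, ...)` with the same payload stays on Chat Completions because the explicit setting overrides detection.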
2026-06-09 10:17:00 +00:00 · 2026-05-26 20:17:37 -04:00
parent 7bb4130c7d
commit f9be3cc328
3 changed files with 1757 additions and 0 deletions
--- a/libs/partners/perplexity/tests/integration_tests/test_chat_models.py
+++ b/libs/partners/perplexity/tests/integration_tests/test_chat_models.py
@@ -100,6 +100,54 @@ class TestChatPerplexityIntegration:
        if citations := response.additional_kwargs.get("citations"):
            assert any("wikipedia.org" in c for c in citations)

+    def test_responses_api_with_web_search(self) -> None:
+        """Hit the real Agent (Responses) API with a built-in tool."""
+        # The Agent API requires a `preset` or `provider/model` format — bare
+        # Chat-Completions names like `sonar-pro` are rejected. Use a preset
+        # and let the `model` field get dropped by `_to_responses_payload`.
+        # `temperature` is intentionally omitted: the Responses API does not
+        # accept it, and supplying it would emit a per-call WARNING log.
+        chat = ChatPerplexity(model="sonar-pro", use_responses_api=True)
+        response = chat.invoke(
+            "What is the capital of France?",
+            tools=[{"type": "web_search"}],
+            preset="pro-search",
+        )
+        assert isinstance(response.content, str)
+        assert response.content
+        if response.usage_metadata is not None:
+            assert response.usage_metadata["input_tokens"] >= 0
+            assert response.usage_metadata["output_tokens"] >= 0
+
+    async def test_responses_api_async_with_web_search(self) -> None:
+        """Hit the real Agent API asynchronously to cover `ainvoke`."""
+        chat = ChatPerplexity(model="sonar-pro", use_responses_api=True)
+        response = await chat.ainvoke(
+            "What is the capital of France?",
+            tools=[{"type": "web_search"}],
+            preset="pro-search",
+        )
+        assert isinstance(response.content, str)
+        assert response.content
+
+    def test_responses_api_streaming_surfaces_citations(self) -> None:
+        """Stream the real Agent API and verify citations surface on chunks."""
+        chat = ChatPerplexity(model="sonar-pro", use_responses_api=True)
+        chunks = list(
+            chat.stream(
+                "Who is the CEO of OpenAI?",
+                tools=[{"type": "web_search"}],
+                preset="pro-search",
+            )
+        )
+        assert chunks
+        full_content = "".join(c.content for c in chunks if isinstance(c.content, str))
+        assert full_content
+        # Citations, when returned, must land on additional_kwargs (not
+        # response_metadata) to match the Chat Completions path.
+        for chunk in chunks:
+            assert "citations" not in chunk.response_metadata
+
    def test_media_and_metadata(self) -> None:
        """Test related questions and images."""
        chat = ChatPerplexity(
--- a/libs/partners/perplexity/tests/unit_tests/test_chat_models_responses.py
+++ b/libs/partners/perplexity/tests/unit_tests/test_chat_models_responses.py