fix: LLM mimicking Unicode responses due to forced Unicode conversion of non-ASCII characters. (#32222)

fix: Fix LLM mimicking Unicode responses due to forced Unicode
conversion of non-ASCII characters.

- **Description:** This PR fixes an issue where the LLM would mimic
Unicode escape sequences (`\uXXXX`) in its responses because non-ASCII
characters in tool calls were forcibly escaped. The fix passes
`ensure_ascii=False` to `json.dumps()` when converting tool calls to
OpenAI format.
- **Issue:** Fixes ↓↓↓
input:
```python
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "你好啊集团"}'}}]}
```
output:
```python
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "\\u4f60\\u597d\\u554a\\u96c6\\u56e2"}'}}]}
```
then:
The LLM mimics the escaped form and starts emitting `\uXXXX` sequences
itself. Each escaped character takes six ASCII characters instead of one,
which lengthens LLM responses and leads to slower performance.
<img width="686" height="277" alt="image"
src="https://github.com/user-attachments/assets/28f3b007-3964-4455-bee2-68f86ac1906d"
/>
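
The escaping behavior described above can be reproduced with the standard library alone. This is a minimal sketch, not the changed code from this PR: the `arguments` dict is a stand-in that mirrors the example payload, and it only contrasts the default `ensure_ascii=True` with `ensure_ascii=False`.

```python
import json

# Stand-in for the tool-call arguments in the example above (illustrative only).
arguments = {"customer_name": "你好啊集团"}

# Default behavior: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(arguments)

# With ensure_ascii=False the original characters are kept as-is.
readable = json.dumps(arguments, ensure_ascii=False)

print(escaped)   # {"customer_name": "\u4f60\u597d\u554a\u96c6\u56e2"}
print(readable)  # {"customer_name": "你好啊集团"}

# Both strings decode to the same object; only the serialized text differs.
assert json.loads(escaped) == json.loads(readable) == arguments
```

Since both forms are valid JSON for the same data, switching the flag changes only what the model sees in its message history, not what a JSON parser reads back.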

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
This commit is contained in:
niceg
2025-07-25 05:01:31 +08:00
committed by GitHub
parent d53ebf367e
commit 0d6f915442
30 changed files with 1963 additions and 1794 deletions

@@ -98,7 +98,7 @@ print("Similar Results:", similar_results)
All Exa tools support the following common parameters:
- `num_results` (1-100): Number of search results to return
- `type`: Search type - "neural", "keyword", or "auto"
- `type`: Search type - "neural", "keyword", or "auto"
- `livecrawl`: Live crawling mode - "always", "fallback", or "never"
- `summary`: Get AI-generated summaries (True/False or custom prompt dict)
- `text_contents_options`: Dict to limit text length (e.g. `{"max_characters": 2000}`)