fix: LLM mimicking Unicode responses due to forced Unicode conversion of non-ASCII characters. (#32222)

fix: Fix LLM mimicking Unicode responses due to forced Unicode
conversion of non-ASCII characters.

- **Description:** This PR fixes an issue where the LLM would mimic
Unicode responses due to forced Unicode conversion of non-ASCII
characters in tool calls. The fix involves disabling the `ensure_ascii`
flag in `json.dumps()` when converting tool calls to OpenAI format.
- **Issue:** Fixes ↓↓↓
input:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "你好啊集团"}'}}]}
```
output:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "\\u4f60\\u597d\\u554a\\u96c6\\u56e2"}'}}]}
```
then:
llm will mimic outputting unicode. Unicode's vast number of symbols can
lengthen LLM responses, leading to slower performance.
<img width="686" height="277" alt="image"
src="https://github.com/user-attachments/assets/28f3b007-3964-4455-bee2-68f86ac1906d"
/>

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
This commit is contained in:
niceg
2025-07-25 05:01:31 +08:00
committed by GitHub
parent d53ebf367e
commit 0d6f915442
30 changed files with 1963 additions and 1794 deletions

View File

@@ -56,7 +56,6 @@ select = [
"C4", # flake8-comprehensions
"COM", # flake8-commas
"D", # pydocstyle
"DOC", # pydoclint
"E", # pycodestyle error
"EM", # flake8-errmsg
"F", # pyflakes

1418
libs/cli/uv.lock generated

File diff suppressed because it is too large Load Diff