text-splitters[patch]: fix json split of RecursiveJsonSplitter (#19119)

- **Description:** This change fixes the mutable default arguments of
`RecursiveJsonSplitter._json_split`. In the original code, `current_path`
defaults to `[]` and `chunks` defaults to `[{}]`, a list containing an
empty dictionary; both are mutable. Python evaluates default arguments
only once, at function definition time, so any in-place modification of
a default persists across later calls. Changing the defaults to `None`
and initializing them inside the function creates a new list on each
call, which prevents state from leaking between calls.
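
The pitfall described above can be reproduced in isolation. The following is a minimal sketch, not the splitter's actual code: `collect_shared` and `collect_fresh` are hypothetical helper names invented for illustration, and only the default-argument pattern mirrors the change in this commit.

```python
from typing import Any, Dict, List, Optional


def collect_shared(key: str, value: Any, chunks: List[Dict] = [{}]) -> List[Dict]:
    # Hypothetical helper showing the pitfall: the default list is created
    # once, when the function is defined, and reused by every call.
    chunks[-1][key] = value
    return chunks


def collect_fresh(
    key: str, value: Any, chunks: Optional[List[Dict]] = None
) -> List[Dict]:
    # Hypothetical helper using the same pattern as the fix: default to None
    # and build a new list inside the function on each call.
    chunks = chunks or [{}]
    chunks[-1][key] = value
    return chunks


first = collect_shared("a", 1)
second = collect_shared("b", 2)
print(second)           # [{'a': 1, 'b': 2}]  (state from the first call leaked in)
print(first is second)  # True  (both calls returned the same list object)

print(collect_fresh("a", 1))  # [{'a': 1}]
print(collect_fresh("b", 2))  # [{'b': 2}]
```

One consequence of the `chunks or [{}]` idiom is that a caller passing an empty list also gets the fresh default, because an empty list is falsy; an explicit `if chunks is None` check would treat that case differently.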

---------

Co-authored-by: sixiang <sixiang@lixiang.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
six17 2024-03-16 07:46:49 +08:00 committed by GitHub
parent 05008c4f94
commit fd4f536c77

@@ -48,12 +48,14 @@ class RecursiveJsonSplitter:
     def _json_split(
         self,
         data: Dict[str, Any],
-        current_path: List[str] = [],
-        chunks: List[Dict] = [{}],
+        current_path: Optional[List[str]] = None,
+        chunks: Optional[List[Dict]] = None,
     ) -> List[Dict]:
         """
         Split json into maximum size dictionaries while preserving structure.
         """
+        current_path = current_path or []
+        chunks = chunks or [{}]
         if isinstance(data, dict):
             for key, value in data.items():
                 new_path = current_path + [key]