mirror of
https://github.com/hwchase17/langchain.git
synced 2026-05-03 01:46:42 +00:00
fix(text-splitters): remove invalid and duplicate separators in Kotlin, Rust, and Haskell (#37039)
## Summary

Fixes four issues in `get_separators_for_language()` in `character.py`:

- **Kotlin**: removed `"\ncase "`. `case` is not a Kotlin keyword; Kotlin uses `when` expressions (already present in the list). This was copied from Java/Swift.
- **Rust**: removed duplicate `"\nconst "`, which appeared twice: once under function definitions and again under control flow statements.
- **Haskell**: removed duplicate `"\n:: "`, which appeared under function definitions and again under type declarations.
- **Haskell**: removed duplicate `"\ndata "`, which appeared under type declarations and again under record field declarations.

All four are dead separators: they either never match or produce redundant splits.

## Issue

Closes #37038

## Types of changes

- [x] Bug fix

## Checklist

- [x] I have read the CONTRIBUTING doc
- [x] Lint and unit tests pass locally with my changes
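The duplicate entries described in the summary can be caught mechanically. The following is a minimal sketch, not part of this commit: `find_duplicate_separators` is a hypothetical helper, and `rust_like` is a shortened, made-up list modeled on the Rust case rather than the actual list in `character.py`.

```python
from collections import Counter


def find_duplicate_separators(separators):
    """Return separators that occur more than once, in first-seen order."""
    counts = Counter(separators)
    # dict.fromkeys drops repeats while preserving the original order.
    return [sep for sep in dict.fromkeys(separators) if counts[sep] > 1]


# Hypothetical, shortened list modeled on the Rust case above;
# it is NOT the full separator list from character.py.
rust_like = [
    "\nfn ",
    "\nconst ",
    "\nlet ",
    "\nif ",
    "\nmatch ",
    "\nconst ",  # repeated entry, analogous to the one removed here
    "\n\n",
    "\n",
    " ",
    "",
]

duplicates = find_duplicate_separators(rust_like)
print(duplicates)
```

A check like this could run as a unit test over each language's separator list. It would flag the Rust and Haskell duplicates but not the Kotlin `"\ncase "` entry, which is unique in its list and wrong only because it is not a Kotlin keyword.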
@@ -266,7 +266,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
     "\nfor ",
     "\nwhile ",
     "\nwhen ",
-    "\ncase ",
     "\nelse ",
     # Split by the normal type of lines
     "\n\n",
@@ -463,7 +462,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
     "\nfor ",
     "\nloop ",
     "\nmatch ",
-    "\nconst ",
     # Split by the normal type of lines
     "\n\n",
     "\n",
@@ -718,7 +716,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
     "\ndata ",
     "\nnewtype ",
     "\ntype ",
-    "\n:: ",
     # Split along module declarations
     "\nmodule ",
     # Split along import statements
@@ -733,7 +730,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
     # Split along guards in function definitions
     "\n| ",
     # Split along record field declarations
-    "\ndata ",
     "\n= {",
     "\n, ",
     # Split by the normal type of lines