fix(text-splitters): remove invalid and duplicate separators in Kotlin, Rust, and Haskell (#37039)

## Summary

Fixes four issues in `get_separators_for_language()` in `character.py`:

- **Kotlin**: removed `"\ncase "` — `case` is not a Kotlin keyword.
Kotlin uses `when` expressions (already present in the list). This was
copied from Java/Swift.
- **Rust**: removed duplicate `"\nconst "` — appeared twice, once under
function definitions and again under control flow statements.
- **Haskell**: removed duplicate `"\n:: "` — appeared under function
definitions and again under type declarations.
- **Haskell**: removed duplicate `"\ndata "` — appeared under type
declarations and again under record field declarations.

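All four entries were dead weight. `"\ncase "` does not correspond to any
Kotlin construct, and each of the three duplicates repeats a separator that
already appears earlier in the same list, so removing them should not change
how typical code is split.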
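
A check for this kind of duplicate could look like the minimal sketch below. The helper name `find_duplicate_separators` and the abbreviated sample list are hypothetical and for illustration only; the sample is modeled on the pre-fix Rust entries and is not the full list from `character.py`.

```python
from collections import Counter


def find_duplicate_separators(separators: list[str]) -> list[str]:
    """Return separators that occur more than once, in first-seen order."""
    counts = Counter(separators)
    # dict.fromkeys() de-duplicates while preserving the original order.
    return [sep for sep in dict.fromkeys(separators) if counts[sep] > 1]


# Abbreviated, illustrative list (an assumption, not the real list).
rust_like_before = [
    "\nfn ",
    "\nconst ",
    "\nlet ",
    "\nif ",
    "\nwhile ",
    "\nfor ",
    "\nloop ",
    "\nmatch ",
    "\nconst ",  # the later duplicate that this commit removes
    "\n\n",
    "\n",
    " ",
    "",
]

duplicates = find_duplicate_separators(rust_like_before)
print(duplicates)
```

The same helper could be pointed at the list returned by `get_separators_for_language()` for each language, asserting that the result is empty.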

## Issue

Closes #37038

## Types of changes

- [x] Bug fix

## Checklist

- [x] I have read the CONTRIBUTING doc
- [x] Lint and unit tests pass locally with my changes

Author: Deepak Bhagat
Date: 2026-04-28 00:38:12 +05:30
Committed by GitHub
Parent: 3b9750f0a4
Commit: cd80a805b2

@@ -266,7 +266,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
 "\nfor ",
 "\nwhile ",
 "\nwhen ",
-"\ncase ",
 "\nelse ",
 # Split by the normal type of lines
 "\n\n",
@@ -463,7 +462,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
 "\nfor ",
 "\nloop ",
 "\nmatch ",
-"\nconst ",
 # Split by the normal type of lines
 "\n\n",
 "\n",
@@ -718,7 +716,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
 "\ndata ",
 "\nnewtype ",
 "\ntype ",
-"\n:: ",
 # Split along module declarations
 "\nmodule ",
 # Split along import statements
@@ -733,7 +730,6 @@ class RecursiveCharacterTextSplitter(TextSplitter):
 # Split along guards in function definitions
 "\n| ",
 # Split along record field declarations
-"\ndata ",
 "\n= {",
 "\n, ",
 # Split by the normal type of lines