langchain/libs/text-splitters/langchain_text_splitters
Sumin Shin 683da2c9e9
text-splitters: Fix regex separator merge bug in CharacterTextSplitter (#31137)
**Description:**
Fix the merge logic in `CharacterTextSplitter.split_text` so that when
using a regex lookahead separator (`is_separator_regex=True`) with
`keep_separator=False`, the raw pattern is not re-inserted between
chunks.
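
The behavior described above can be reproduced with the standard library alone. The following is a minimal sketch, not the code from `character.py`: the helper names `naive_merge` and `lookaround_aware_merge` and the `LOOKAROUND_PREFIXES` tuple are hypothetical, and it assumes the merge step simply joins the split pieces with a joiner string. It shows why re-inserting the raw pattern is wrong for a zero-width lookahead: the pattern consumes no text during the split, so there is nothing to put back.

```python
import re

# Hypothetical helper names; this is an illustration, not the library's code.
LOOKAROUND_PREFIXES = ("(?=", "(?!", "(?<=", "(?<!")


def naive_merge(pieces: list[str], separator: str) -> str:
    """Join pieces with the raw separator string (the problematic behavior)."""
    return separator.join(pieces)


def lookaround_aware_merge(pieces: list[str], separator: str) -> str:
    """Join pieces with nothing when the separator is a zero-width lookaround."""
    joiner = "" if separator.startswith(LOOKAROUND_PREFIXES) else separator
    return joiner.join(pieces)


text = "alpha1beta2gamma"
separator = r"(?=\d)"  # zero-width lookahead: split before each digit

# Splitting on a zero-width match removes no characters from the text.
pieces = [p for p in re.split(separator, text) if p]

naive = naive_merge(pieces, separator)
fixed = lookaround_aware_merge(pieces, separator)

print(pieces)  # ['alpha', '1beta', '2gamma']
print(naive)   # alpha(?=\d)1beta(?=\d)2gamma
print(fixed)   # alpha1beta2gamma
```

In this sketch the naive merge leaks the literal text `(?=\d)` into the output, while the lookaround-aware merge round-trips the original string.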

**Issue:**
Fixes #31136

**Dependencies:**
None

**Twitter handle:**
None

Since this is my first open-source PR, please feel free to point out any
mistakes, and I'll be eager to make corrections.
2025-05-10 15:42:03 -04:00
xsl text-splitters[patch]: delete unused html_chunks_with_headers.xslt (#29340) 2025-01-23 11:29:08 -05:00
__init__.py text-splitters: Add JSFrameworkTextSplitter for Handling JavaScript Framework Code (#28972) 2025-03-17 23:32:33 +00:00
base.py text-splitters: Set strict mypy rules (#30900) 2025-04-22 20:41:24 -07:00
character.py text-splitters: Fix regex separator merge bug in CharacterTextSplitter (#31137) 2025-05-10 15:42:03 -04:00
html.py text-splitters: Set strict mypy rules (#30900) 2025-04-22 20:41:24 -07:00
json.py text-splitters: Set strict mypy rules (#30900) 2025-04-22 20:41:24 -07:00
jsx.py text-splitters: Add JSFrameworkTextSplitter for Handling JavaScript Framework Code (#28972) 2025-03-17 23:32:33 +00:00
konlpy.py
latex.py
markdown.py text-splitters: Set strict mypy rules (#30900) 2025-04-22 20:41:24 -07:00
nltk.py text-splitters: Inconsistent results with NLTKTextSplitter's add_start_index=True (#27782) 2024-12-16 19:53:15 +00:00
py.typed
python.py
sentence_transformers.py text-splitters: Set strict mypy rules (#30900) 2025-04-22 20:41:24 -07:00
spacy.py text-splitters: add pydocstyle linting (#28127) 2024-12-09 06:01:03 +00:00