langchain/libs/text-splitters/langchain_text_splitters
Ahmed Tammaa d5b8aabb32
text-splitters[patch]: delete unused html_chunks_with_headers.xslt (#29340)
This pull request removes the now-unused html_chunks_with_headers.xslt
file from the codebase. An earlier change ([PR
#27678](https://github.com/langchain-ai/langchain/pull/27678))
refactored the HTMLHeaderTextSplitter class to use BeautifulSoup
instead of lxml and XSLT for HTML processing. As a result, nothing
references html_chunks_with_headers.xslt any longer, and the file can
be safely deleted to keep the codebase clean and avoid confusion.

Issue: N/A

Dependencies: N/A
2025-01-23 11:29:08 -05:00
xsl text-splitters[patch]: delete unused html_chunks_with_headers.xslt (#29340) 2025-01-23 11:29:08 -05:00
__init__.py text_splitters: Add HTMLSemanticPreservingSplitter (#25911) 2024-12-19 12:09:22 -05:00
base.py text-splitters: add pydocstyle linting (#28127) 2024-12-09 06:01:03 +00:00
character.py text-splitters: add pydocstyle linting (#28127) 2024-12-09 06:01:03 +00:00
html.py text-splitters[minor]: Replace lxml and XSLT with BeautifulSoup in HTMLHeaderTextSplitter for Improved Large HTML File Processing (#27678) 2025-01-20 16:10:37 -05:00
json.py text-splitters: add pydocstyle linting (#28127) 2024-12-09 06:01:03 +00:00
konlpy.py
latex.py
markdown.py text-splitters: Bump ruff version to 0.9 (#29231) 2025-01-22 00:27:58 +00:00
nltk.py text-splitters: Inconsistent results with NLTKTextSplitter's add_start_index=True (#27782) 2024-12-16 19:53:15 +00:00
py.typed
python.py
sentence_transformers.py text-splitters: Inconsistent results with NLTKTextSplitter's add_start_index=True (#27782) 2024-12-16 19:53:15 +00:00
spacy.py text-splitters: add pydocstyle linting (#28127) 2024-12-09 06:01:03 +00:00