langchain/libs/text-splitters/langchain_text_splitters
Matthew DeGenaro 66828f4ecc
text-splitters[patch]: Modified SpacyTextSplitter to fully keep whitespace when strip_whitespace is false (#23272)
Previously, the SpacyTextSplitter class always used `sent.text` to get
each sentence, regardless of whether strip_whitespace was set to true
or false. I modified this to use a ternary, so that when
strip_whitespace is false it uses `sent.text_with_ws` instead.
I also modified the pyproject.toml to include the spacy pipeline package
and to lock the numpy version, as higher versions break spacy.
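
A minimal sketch of the ternary described above, not the actual code in spacy.py. `FakeSent` and `sentence_texts` are hypothetical names used only for illustration; `FakeSent` stands in for a spaCy sentence span, which exposes both `text` and `text_with_ws`, so the example runs without spaCy installed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeSent:
    """Hypothetical stand-in for a spaCy sentence span."""

    text: str  # sentence without trailing whitespace
    text_with_ws: str  # sentence plus any trailing whitespace


def sentence_texts(sents: List[FakeSent], strip_whitespace: bool) -> List[str]:
    # Keep trailing whitespace only when strip_whitespace is False.
    return [s.text if strip_whitespace else s.text_with_ws for s in sents]


sents = [FakeSent("Hello there.", "Hello there.  "), FakeSent("Bye.", "Bye.")]
stripped = sentence_texts(sents, strip_whitespace=True)
kept = sentence_texts(sents, strip_whitespace=False)
```

With `strip_whitespace=False`, joining the pieces with an empty string gives back the original spacing between sentences, which is presumably the point of the change.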

- **Issue:** N/A
- **Dependencies:** None
2024-09-02 21:15:56 +00:00
xsl text-splitters[minor]: Adding a new section aware splitter to langchain (#16526) 2024-04-01 20:32:26 +00:00
__init__.py text-splitters[minor]: Adding a new section aware splitter to langchain (#16526) 2024-04-01 20:32:26 +00:00
base.py langchain : text_splitters Added PowerShell (#24582) 2024-07-30 16:13:52 +00:00
character.py text-splitters[patch]: fix typing for keep_separator (#25706) 2024-08-23 17:22:02 +00:00
html.py text_splitters: add request parameters for function HTMLHeaderTextSplitter.split_text… (#24178) 2024-07-15 16:43:56 +00:00
json.py text-splitters: Fix/recursive json splitter data persistence issue (#21529) 2024-06-18 20:21:55 -07:00
konlpy.py text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
latex.py text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
markdown.py text-splitters: Introduce Experimental Markdown Syntax Splitter (#22257) 2024-06-18 19:44:00 -07:00
nltk.py text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
py.typed text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
python.py text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
sentence_transformers.py text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
spacy.py text-splitters[patch]: Modified SpacyTextSplitter to fully keep whitespace when strip_whitespace is false (#23272) 2024-09-02 21:15:56 +00:00