Files
langchain/libs/text-splitters
dependabot[bot] 7ed0eb3a17 chore: bump nltk from 3.9.3 to 3.9.4 in /libs/text-splitters (#36237)
Bumps [nltk](https://github.com/nltk/nltk) from 3.9.3 to 3.9.4.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/nltk/nltk/blob/develop/ChangeLog">nltk's
changelog</a>.</em></p>
<blockquote>
<p>Version 3.9.4 2026-03-24</p>
<ul>
<li>Support Python 3.14</li>
<li>Fix bug in Levenshtein distance when substitution_cost &gt; 2</li>
<li>Fix bug in Treebank detokeniser re quote ordering</li>
<li>Fix bug in Jaro similarity for empty strings</li>
<li>Several security enhancements</li>
<li>Fix GHSA-rf74-v2fm-23pw: unbounded recursion in
JSONTaggedDecoder</li>
<li>Implement TextTiling vocabulary introduction method (Hearst
1997)</li>
<li>Fix ALINE feature matrix errors and add comprehensive tests</li>
<li>Support multiple VerbNet versions, fix longid/shortid regex for
VerbNet ids</li>
<li>Let downloader fallback to md5 when sha256 is unavailable</li>
<li>Several other minor bugfixes and code cleanups</li>
</ul>
<p>Thanks to the following contributors to 3.9.4:
Min-Yen Kan, Eric Kafe, Emily Voss, bowiechen, Hrudhai01,
jancallewaert, Mr-Neutr0n, pollak.peter89, ylwango613,</p>
<p>Version 3.9.3 2026-02-21</p>
<ul>
<li>Fix CVE-2025-14009: secure ZIP extraction in nltk.downloader (<a
href="https://redirect.github.com/nltk/nltk/issues/3468">#3468</a>)</li>
<li>Block path traversal/arbitrary reads in nltk.data for protocol-less
refs (<a
href="https://redirect.github.com/nltk/nltk/issues/3467">#3467</a>)</li>
<li>Block path traversal/abs paths in corpus readers and FS pointers (<a
href="https://redirect.github.com/nltk/nltk/issues/3479">#3479</a>, <a
href="https://redirect.github.com/nltk/nltk/issues/3480">#3480</a>)</li>
<li>Validate external StanfordSegmenter JARs using SHA256 (<a
href="https://redirect.github.com/nltk/nltk/issues/3477">#3477</a>)</li>
<li>Add optional sandbox enforcement for filestring() (<a
href="https://redirect.github.com/nltk/nltk/issues/3485">#3485</a>)</li>
<li>Maintenance: downloader/zipped models, CI/tooling updates</li>
</ul>
<p>Thanks to the following contributors to 3.9.3:
Chris Clauss, Eric Kafe, HyperPS, purificant, Shivansh-Game, Christopher
Smith</p>
<p>Version 3.9.2 2025-10-01</p>
<ul>
<li>Update download checksums to use SHA256 in built index</li>
<li>Fix percentage escape in new-style string formatting</li>
<li>replace shortened URLs using goo.gl</li>
<li>Make Wordnet interoperable with various taggers and tagged
corpora</li>
<li>Fix saving PerceptronTagger</li>
<li>Document how to reproduce old Wordnet studies</li>
<li>properly initialize Portuguese corpus reader</li>
<li>support for mixed rules conversion into Chomsky Normal Form</li>
<li>only import tkinter if a GUI is needed</li>
<li>issue <a
href="https://redirect.github.com/nltk/nltk/issues/2112">#2112</a> with
Corenlp</li>
<li>new environment variable
NLTK_DOWNLOADER_FORCE_INTERACTIVE_SHELL</li>
<li>Lesk defaults to most frequent sense in case of ties</li>
</ul>
<p>Thanks to the following contributors to 3.9.2:
Jose Cols, Peter de Blanc, GeneralPoxter, Eric Kafe, William LaCroix,
Jason Liu,
Samer Masterson, Mike014, purificant, Andrew Ernest Ritz, samertm, Ikram
Ul Haq,
Christopher Smith, Ryan Mannion</p>
<p>Version 3.9.1 2024-08-19</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ad9c96ba00"><code>ad9c96b</code></a>
Update copyright year</li>
<li><a
href="7edcddfda5"><code>7edcddf</code></a>
Updates for 3.9.4 release</li>
<li><a
href="67a2736f89"><code>67a2736</code></a>
Merge pull request <a
href="https://redirect.github.com/nltk/nltk/issues/3180">#3180</a> from
yzhaoinuw/bug-on-edit_distance_align</li>
<li><a
href="2b17ac5358"><code>2b17ac5</code></a>
Fix edit_distance_align backtrace for high substitution costs</li>
<li><a
href="4b72976a6f"><code>4b72976</code></a>
Merge pull request <a
href="https://redirect.github.com/nltk/nltk/issues/3018">#3018</a> from
JuanIMartinezB/bug/shortid-longid</li>
<li><a
href="8a5619f53a"><code>8a5619f</code></a>
Merge pull request <a
href="https://redirect.github.com/nltk/nltk/issues/3222">#3222</a> from
Syzygy2048/feature/texttiling-vocabulary-introd...</li>
<li><a
href="c6574d755e"><code>c6574d7</code></a>
Merge pull request <a
href="https://redirect.github.com/nltk/nltk/issues/3289">#3289</a> from
ihitamandal/codeflash/optimize-windowdiff-2024-...</li>
<li><a
href="98ff5d9eaa"><code>98ff5d9</code></a>
Merge pull request <a
href="https://redirect.github.com/nltk/nltk/issues/3435">#3435</a> from
Hrudhai01/fix-3260-detokenize-quotes</li>
<li><a
href="aec4fce1b8"><code>aec4fce</code></a>
Merge pull request <a
href="https://redirect.github.com/nltk/nltk/issues/3522">#3522</a> from
ekaf/pathsec</li>
<li><a
href="eec4ee3591"><code>eec4ee3</code></a>
Merge pull request <a
href="https://redirect.github.com/nltk/nltk/issues/3526">#3526</a> from
nltk/update-contributing</li>
<li>Additional commits viewable in <a
href="https://github.com/nltk/nltk/compare/3.9.3...3.9.4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=nltk&package-manager=uv&previous-version=3.9.3&new-version=3.9.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/langchain-ai/langchain/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-26 02:41:22 +00:00
..
2026-01-13 01:54:11 -05:00

🦜✂️ LangChain Text Splitters

PyPI - Version PyPI - License PyPI - Downloads Twitter

Looking for the JS/TS version? Check out LangChain.js.

Quick Install

pip install langchain-text-splitters

🤔 What is this?

LangChain Text Splitters contains utilities for splitting into chunks a wide variety of text documents.

📖 Documentation

For full documentation, see the API reference.

📕 Releases & Versioning

See our Releases and Versioning policies.

We encourage pinning your version to a specific version in order to avoid breaking your CI when we publish new tests. We recommend upgrading to the latest version periodically to make sure you have the latest tests.

Not pinning your version will ensure you always have the latest tests, but it may also break your CI if we introduce tests that your integration doesn't pass.

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see the Contributing Guide.