Bumps [nltk](https://github.com/nltk/nltk) from 3.9.3 to 3.9.4. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/nltk/nltk/blob/develop/ChangeLog">nltk's changelog</a>.</em></p> <blockquote> <p>Version 3.9.4 2026-03-24</p> <ul> <li>Support Python 3.14</li> <li>Fix bug in Levenshtein distance when substitution_cost > 2</li> <li>Fix bug in Treebank detokeniser re quote ordering</li> <li>Fix bug in Jaro similarity for empty strings</li> <li>Several security enhancements</li> <li>Fix GHSA-rf74-v2fm-23pw: unbounded recursion in JSONTaggedDecoder</li> <li>Implement TextTiling vocabulary introduction method (Hearst 1997)</li> <li>Fix ALINE feature matrix errors and add comprehensive tests</li> <li>Support multiple VerbNet versions, fix longid/shortid regex for VerbNet ids</li> <li>Let downloader fallback to md5 when sha256 is unavailable</li> <li>Several other minor bugfixes and code cleanups</li> </ul> <p>Thanks to the following contributors to 3.9.4: Min-Yen Kan, Eric Kafe, Emily Voss, bowiechen, Hrudhai01, jancallewaert, Mr-Neutr0n, pollak.peter89, ylwango613,</p> <p>Version 3.9.3 2026-02-21</p> <ul> <li>Fix CVE-2025-14009: secure ZIP extraction in nltk.downloader (<a href="https://redirect.github.com/nltk/nltk/issues/3468">#3468</a>)</li> <li>Block path traversal/arbitrary reads in nltk.data for protocol-less refs (<a href="https://redirect.github.com/nltk/nltk/issues/3467">#3467</a>)</li> <li>Block path traversal/abs paths in corpus readers and FS pointers (<a href="https://redirect.github.com/nltk/nltk/issues/3479">#3479</a>, <a href="https://redirect.github.com/nltk/nltk/issues/3480">#3480</a>)</li> <li>Validate external StanfordSegmenter JARs using SHA256 (<a href="https://redirect.github.com/nltk/nltk/issues/3477">#3477</a>)</li> <li>Add optional sandbox enforcement for filestring() (<a href="https://redirect.github.com/nltk/nltk/issues/3485">#3485</a>)</li> <li>Maintenance: downloader/zipped models, CI/tooling updates</li> 
</ul> <p>Thanks to the following contributors to 3.9.3: Chris Clauss, Eric Kafe, HyperPS, purificant, Shivansh-Game, Christopher Smith</p> <p>Version 3.9.2 2025-10-01</p> <ul> <li>Update download checksums to use SHA256 in built index</li> <li>Fix percentage escape in new-style string formatting</li> <li>replace shortened URLs using goo.gl</li> <li>Make Wordnet interoperable with various taggers and tagged corpora</li> <li>Fix saving PerceptronTagger</li> <li>Document how to reproduce old Wordnet studies</li> <li>properly initialize Portuguese corpus reader</li> <li>support for mixed rules conversion into Chomsky Normal Form</li> <li>only import tkinter if a GUI is needed</li> <li>issue <a href="https://redirect.github.com/nltk/nltk/issues/2112">#2112</a> with Corenlp</li> <li>new environment variable NLTK_DOWNLOADER_FORCE_INTERACTIVE_SHELL</li> <li>Lesk defaults to most frequent sense in case of ties</li> </ul> <p>Thanks to the following contributors to 3.9.2: Jose Cols, Peter de Blanc, GeneralPoxter, Eric Kafe, William LaCroix, Jason Liu, Samer Masterson, Mike014, purificant, Andrew Ernest Ritz, samertm, Ikram Ul Haq, Christopher Smith, Ryan Mannion</p> <p>Version 3.9.1 2024-08-19</p> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="ad9c96ba00"><code>ad9c96b</code></a> Update copyright year</li> <li><a href="7edcddfda5"><code>7edcddf</code></a> Updates for 3.9.4 release</li> <li><a href="67a2736f89"><code>67a2736</code></a> Merge pull request <a href="https://redirect.github.com/nltk/nltk/issues/3180">#3180</a> from yzhaoinuw/bug-on-edit_distance_align</li> <li><a href="2b17ac5358"><code>2b17ac5</code></a> Fix edit_distance_align backtrace for high substitution costs</li> <li><a href="4b72976a6f"><code>4b72976</code></a> Merge pull request <a href="https://redirect.github.com/nltk/nltk/issues/3018">#3018</a> from JuanIMartinezB/bug/shortid-longid</li> <li><a href="8a5619f53a"><code>8a5619f</code></a> Merge pull request <a href="https://redirect.github.com/nltk/nltk/issues/3222">#3222</a> from Syzygy2048/feature/texttiling-vocabulary-introd...</li> <li><a href="c6574d755e"><code>c6574d7</code></a> Merge pull request <a href="https://redirect.github.com/nltk/nltk/issues/3289">#3289</a> from ihitamandal/codeflash/optimize-windowdiff-2024-...</li> <li><a href="98ff5d9eaa"><code>98ff5d9</code></a> Merge pull request <a href="https://redirect.github.com/nltk/nltk/issues/3435">#3435</a> from Hrudhai01/fix-3260-detokenize-quotes</li> <li><a href="aec4fce1b8"><code>aec4fce</code></a> Merge pull request <a href="https://redirect.github.com/nltk/nltk/issues/3522">#3522</a> from ekaf/pathsec</li> <li><a href="eec4ee3591"><code>eec4ee3</code></a> Merge pull request <a href="https://redirect.github.com/nltk/nltk/issues/3526">#3526</a> from nltk/update-contributing</li> <li>Additional commits viewable in <a href="https://github.com/nltk/nltk/compare/3.9.3...3.9.4">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it 
yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
🦜✂️ LangChain Text Splitters
Looking for the JS/TS version? Check out LangChain.js.
Quick Install
pip install langchain-text-splitters
🤔 What is this?
LangChain Text Splitters contains utilities for splitting a wide variety of text documents into chunks.
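To make "splitting into chunks" concrete, here is a minimal pure-Python sketch of fixed-size character chunking with overlap. It is an illustration only and not the package's implementation: the function name `split_with_overlap` and its parameters are hypothetical, chosen for this example.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character chunks of at most chunk_size.

    Consecutive chunks share chunk_overlap characters. Hypothetical helper
    for illustration; not part of langchain-text-splitters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
    # prints ['abcd', 'defg', 'ghij']
```

The overlap means text near a chunk boundary appears in both neighboring chunks, so a sentence cut at the boundary still keeps some surrounding context. For the splitters the package actually provides, see the API reference.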
📖 Documentation
For full documentation, see the API reference.
📕 Releases & Versioning
See our Releases and Versioning policies.
We encourage pinning to a specific version to avoid breaking your CI when we publish new tests. We recommend upgrading to the latest version periodically so that you have the latest tests.
If you do not pin your version, you will always have the latest tests, but your CI may break if we introduce tests that your integration doesn't pass.
💁 Contributing
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see the Contributing Guide.