Mirror of https://github.com/hwchase17/langchain.git (synced 2025-08-04 18:53:02 +00:00)
**Description:** Previously, when transitioning from a deeper Markdown header (e.g., `###`) to a shallower one (e.g., `##`), the `ExperimentalMarkdownSyntaxTextSplitter` retained the deeper header in the metadata. This commit updates the `_resolve_header_stack` method to remove headers at the same or deeper levels before appending the current header. As a result, each chunk now reflects only the active header context, which fixes unexpected metadata leakage across sections in nested Markdown documents.

Additionally, test cases have been updated to:
- Validate correct header resolution and metadata assignment.
- Cover edge cases with nested headers and horizontal rules.

**Issue:** Fixes [#31596](https://github.com/langchain-ai/langchain/issues/31596)

**Dependencies:** None

**Twitter handle:** [_RaghuKapur](https://twitter.com/_RaghuKapur)

**LinkedIn:** [https://www.linkedin.com/in/raghukapur/](https://www.linkedin.com/in/raghukapur/)
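The header-stack behavior described in the commit message above can be illustrated with a short standalone sketch. This is not the library's `_resolve_header_stack` implementation; the names `resolve_header_stack` and `active_headers`, the `(depth, text)` tuple representation, and the simplified header detection are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (depth, text), e.g. (2, "Setup") for "## Setup".
Header = Tuple[int, str]


def resolve_header_stack(stack: List[Header], depth: int, text: str) -> List[Header]:
    """Drop headers at the same or deeper level, then append the current one."""
    kept = [(d, t) for d, t in stack if d < depth]
    kept.append((depth, text))
    return kept


def active_headers(markdown: str) -> List[List[str]]:
    """Return the active header path after each header line.

    Simplified on purpose: only ATX-style headers ("#" to "######" followed
    by a space) are recognized, and fenced code blocks are not skipped.
    """
    stack: List[Header] = []
    paths: List[List[str]] = []
    for line in markdown.splitlines():
        stripped = line.strip()
        hashes = len(stripped) - len(stripped.lstrip("#"))
        if hashes == 0 or hashes > 6 or not stripped[hashes:].startswith(" "):
            continue
        stack = resolve_header_stack(stack, hashes, stripped[hashes:].strip())
        paths.append([title for _, title in stack])
    return paths


doc = "# Guide\n## Install\n### Linux\n## Usage\n"
# After "## Usage", the deeper "### Linux" and the sibling "## Install" are gone.
print(active_headers(doc)[-1])  # ['Guide', 'Usage']
```

Filtering by depth rather than popping from the end gives the same result here because the stack is always kept in strictly increasing depth order.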
🦜✂️ LangChain Text Splitters
Quick Install
pip install langchain-text-splitters
What is it?
LangChain Text Splitters contains utilities for splitting a wide variety of text documents into chunks.
For full documentation see the API reference and the Text Splitters module in the main docs.
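To make "splitting into chunks" concrete, here is a minimal standard-library sketch of fixed-size character chunking with overlap. It is not how this package implements its splitters; the function `chunk_text` and its `chunk_size` and `overlap` parameters are hypothetical names chosen for the example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> List[str]:
    """Split text into character windows of chunk_size that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))  # ['abcd', 'defg', 'ghij']
```

Overlap keeps a little shared context between neighboring chunks, which is a common reason to prefer it over hard cuts. Real splitters typically also try to break on natural boundaries such as paragraphs or sentences.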
📕 Releases & Versioning
`langchain-text-splitters` is currently on version `0.0.x`.
Minor version increases will occur for:
- Breaking changes for any public interfaces NOT marked `beta`
Patch version increases will occur for:
- Bug fixes
- New features
- Any changes to private interfaces
- Any changes to `beta` features
💁 Contributing
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see the Contributing Guide.