langchain/docs/extras/integrations/providers/html2text.mdx
Leonid Ganeline cb84f612c9
docs: document_transformers consistency (#10467)
- Updated `document_transformers` examples: titles, descriptions, links
- Added `integrations/providers` for missed document_transformers
2023-09-30 16:36:23 -07:00

# HTML to text
>[html2text](https://github.com/Alir3z4/html2text/) is a Python package that converts a page of `HTML` into clean, easy-to-read plain `ASCII text`.
The resulting text also happens to be valid `Markdown` (a text-to-HTML format).
## Installation and Setup
```bash
pip install html2text
```
## Document Transformer
See a [usage example](/docs/integrations/document_transformers/html2text).
```python
from langchain.document_transformers import Html2TextTransformer
```