mirror of
https://github.com/hwchase17/langchain.git
synced 2025-06-01 12:38:45 +00:00
- Updated `document_transformers` examples: titles, descriptions, links - Added `integrations/providers` for missed document_transformers
# HTML to text
>[html2text](https://github.com/Alir3z4/html2text/) is a Python package that converts a page of `HTML` into clean, easy-to-read plain `ASCII text`.

The output is also valid `Markdown` (a text-to-HTML format).
## Installation and Setup
```bash
pip install html2text
```
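
Once the package is installed, it can also be called directly, outside of LangChain. The following is a minimal sketch rather than a definitive recipe: the helper name `html_to_text` and its `ignore_links` parameter are hypothetical, chosen for illustration, while `HTML2Text`, `ignore_links`, `body_width`, and `handle` come from the `html2text` package itself.

```python
def html_to_text(html: str, ignore_links: bool = False) -> str:
    """Convert an HTML string to Markdown-flavoured plain text.

    `html_to_text` is a hypothetical helper written for this example.
    """
    # Import lazily so a missing dependency produces a clear install hint.
    try:
        import html2text
    except ImportError as exc:
        raise ImportError(
            "html2text is not installed; run `pip install html2text`"
        ) from exc

    converter = html2text.HTML2Text()
    converter.ignore_links = ignore_links  # drop hyperlink targets when True
    converter.body_width = 0  # 0 disables hard line wrapping
    return converter.handle(html)


if __name__ == "__main__":
    print(html_to_text("<h1>Title</h1><p>Hello, <b>world</b>!</p>"))
```

Setting `body_width = 0` is a design choice made here so that long paragraphs stay on one line; leave it at the package default if wrapped output is preferred.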
## Document Transformer
See a [usage example](/docs/integrations/document_transformers/html2text).
```python
from langchain.document_transformers import Html2TextTransformer
```
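
As a rough illustration of how the transformer might be applied, the sketch below wraps raw HTML strings in `Document` objects and passes them through `transform_documents`. The helper name `html_pages_to_text` is hypothetical, and the import paths are assumptions that depend on the installed LangChain version: newer releases are assumed to expose the class from `langchain_community.document_transformers`, older ones from `langchain.document_transformers`.

```python
def html_pages_to_text(html_pages):
    """Convert a list of raw HTML strings to plain-text strings.

    `html_pages_to_text` is a hypothetical helper written for this example.
    """
    # Import lazily; the module paths below are assumptions that vary
    # between LangChain releases.
    try:
        from langchain_community.document_transformers import Html2TextTransformer
        from langchain_core.documents import Document
    except ImportError:
        from langchain.document_transformers import Html2TextTransformer
        from langchain.schema import Document

    # Each HTML page becomes one Document whose page_content is the markup.
    docs = [Document(page_content=page) for page in html_pages]

    transformer = Html2TextTransformer()
    transformed = transformer.transform_documents(docs)

    # Return only the converted text of each document.
    return [doc.page_content for doc in transformed]


if __name__ == "__main__":
    for text in html_pages_to_text(["<h1>Title</h1><p>Hello, <b>world</b>!</p>"]):
        print(text)
```

The transformer relies on the `html2text` package at call time, so the installation step above is still required.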