feat: document loader for epublications (#2202)

### Summary

Adds a new document loader for e-publications (EPUB files). Requires
`unstructured>=0.5.4` and a working
[`pandoc`](https://pandoc.org/installing.html) installation; the loader
will not work without `pandoc`.

### Testing

```python
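from langchain.document_loaders import UnstructuredEPubLoader

# mode="elements" keeps each extracted element as its own Document
loader = UnstructuredEPubLoader("winter-sports.epub", mode="elements")
data = loader.load()
# Inspect the first Document (displays in a notebook or REPL)
data[0]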
```
Matt Robinson
2023-03-30 23:45:31 -04:00
committed by GitHub
parent a4a1ee6b5d
commit 3dfe1cf60e
5 changed files with 154 additions and 5 deletions

@@ -311,7 +311,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.1"
+"version": "3.8.13"
}
},
"nbformat": 4,