langchain/docs/modules/document_loaders
Matt Robinson 2f15c11b87
feat: document loader for MS Word documents (#1282)
### Summary

Adds a document loader for MS Word documents. Works with both `.docx`
and `.doc` files as long as the user has installed
`unstructured>=0.4.11`.
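
Since the loader depends on the installed `unstructured` version, a quick pre-flight check can be useful. The following is only a minimal sketch using the standard library's `importlib.metadata`; the helper names `parse_version` and `unstructured_is_recent_enough` are hypothetical and not part of langchain or `unstructured`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum version stated in the summary above.
MIN_VERSION = (0, 4, 11)


def parse_version(text):
    """Turn a string such as '0.4.11' into (0, 4, 11).

    Hypothetical helper: only the first three dot-separated fields are
    read, and any non-numeric suffix within a field is ignored.
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '0.5' to three fields.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def unstructured_is_recent_enough():
    """Return True if `unstructured` is installed at MIN_VERSION or newer."""
    try:
        installed = version("unstructured")
    except PackageNotFoundError:
        return False
    return parse_version(installed) >= MIN_VERSION
```

Calling `unstructured_is_recent_enough()` before constructing the loader gives an early, readable signal when the dependency is missing or too old.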

### Testing

The following workflow tests the loader for both `.doc` and `.docx` files
using example docs from the `unstructured` repo.

#### `.docx`

```python
from langchain.document_loaders import UnstructuredWordDocumentLoader

filename = "../unstructured/example-docs/fake.docx"
loader = UnstructuredWordDocumentLoader(filename)
loader.load()
```

#### `.doc`

```python
from langchain.document_loaders import UnstructuredWordDocumentLoader

filename = "../unstructured/example-docs/fake.doc"
loader = UnstructuredWordDocumentLoader(filename)
loader.load()
```
2023-02-24 08:26:19 -08:00
examples feat: document loader for MS Word documents (#1282) 2023-02-24 08:26:19 -08:00
how_to_guides.rst add gitbook document loader (#1180) 2023-02-20 20:05:04 -08:00
key_concepts.md Harrison/unstructured support (#903) 2023-02-05 23:02:07 -08:00