mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-16 12:32:06 +00:00
### Summary Adds a document loader for MS Word Documents. Works with both `.docx` and `.doc` files as longer as the user has installed `unstructured>=0.4.11`. ### Testing The follow workflow test the loader for both `.doc` and `.docx` files using example docs from the `unstructured` repo. #### `.docx` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.docx" loader = UnstructuredWordDocumentLoader(filename) loader.load() ``` #### `.doc` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.doc" loader = UnstructuredWordDocumentLoader(filename) loader.load() ``` |
||
---|---|---|
.. | ||
_static | ||
ecosystem | ||
getting_started | ||
modules | ||
reference | ||
tracing | ||
use_cases | ||
conf.py | ||
deployments.md | ||
ecosystem.rst | ||
gallery.rst | ||
glossary.md | ||
index.rst | ||
make.bat | ||
Makefile | ||
reference.rst | ||
requirements.txt | ||
tracing.md |