langchain/docs/modules
Matt Robinson a97e4252e3
feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617)
# Unstructured Excel Loader

Adds an `UnstructuredExcelLoader` class for `.xlsx` and `.xls` files.
Works with `unstructured>=0.6.7`. Each sheet in the Excel file is loaded
as its own document. A plain text representation of the Excel file is
available under each document's `page_content` attribute. If you use the
loader in `"elements"` mode, an HTML representation is also available
under the `text_as_html` metadata key.

## Testing

```python
from langchain.document_loaders import UnstructuredExcelLoader

loader = UnstructuredExcelLoader(
    "example_data/stanley-cups.xlsx",
    mode="elements"
)
docs = loader.load()
```
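
Once documents are loaded, the plain text and the HTML rendering can be read side by side. The following is a minimal sketch, not part of the PR: `FakeDocument` and `text_and_html` are hypothetical names introduced for illustration, and the sample data is made up so the snippet runs without an Excel file. It assumes only what the description states, namely that each document exposes `page_content` and a `metadata` dict that may contain `text_as_html`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional, Tuple


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a loaded document (not the real langchain class)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def text_and_html(docs: Iterable[Any]) -> List[Tuple[str, Optional[str]]]:
    """Pair each document's plain text with its HTML rendering, or None if absent."""
    return [(d.page_content, d.metadata.get("text_as_html")) for d in docs]


# Made-up sample data standing in for the result of loader.load().
sample_docs = [
    FakeDocument("a b", {"text_as_html": "<table><tr><td>a</td><td>b</td></tr></table>"}),
    FakeDocument("plain text only"),  # no HTML key, e.g. when not in "elements" mode
]

pairs = text_and_html(sample_docs)
for text, html in pairs:
    print(repr(text), "->", html)
```

Using `metadata.get` rather than indexing keeps the helper safe for documents that carry no `text_as_html` key.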

## Who can review?

@hwchase17
@eyurtsev
2023-06-03 12:44:12 -07:00
agents human approval callback (#5581) 2023-06-02 06:59:33 -07:00
callbacks Dev2049/add argilla callback (#5621) 2023-06-02 09:05:06 -07:00
chains Documentation fixes (linting and broken links) (#5563) 2023-06-01 13:06:17 -07:00
indexes feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617) 2023-06-03 12:44:12 -07:00
memory Documentation fixes (linting and broken links) (#5563) 2023-06-01 13:06:17 -07:00
models Bedrock llm and embeddings (#5464) 2023-05-31 07:17:01 -07:00
prompts Fix wrong class instantiation in docs MMR example (#5501) 2023-05-31 17:30:59 -07:00
utils/examples Pass parsed inputs through to tool _run (#4309) 2023-05-08 09:13:05 -07:00
agents.rst DOC: Misspelling in agents.rst documentation (#5038) 2023-05-20 22:24:08 -07:00
chains.rst big docs refactor (#1978) 2023-03-26 19:49:46 -07:00
indexes.rst Correct typo in documentation for word 'therefore' (#2529) 2023-04-06 23:20:30 -07:00
memory.rst big docs refactor (#1978) 2023-03-26 19:49:46 -07:00
models.rst Harrison/standard llm interface (#4615) 2023-05-13 09:05:31 -07:00
paul_graham_essay.txt Fix notebook example (#3142) 2023-04-19 08:55:06 -07:00
prompts.rst Harrison/prompt constructor methods (#4616) 2023-05-13 09:23:51 -07:00
state_of_the_union.txt Docs refactor (#480) 2023-01-02 08:24:09 -08:00