langchain/docs/modules/indexes/document_loaders/examples/example_data
Matt Robinson a97e4252e3
feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617)
# Unstructured Excel Loader

Adds an `UnstructuredExcelLoader` class for `.xlsx` and `.xls` files.
Works with `unstructured>=0.6.7`. Each sheet in the workbook is loaded
as its own document, with a plain-text representation of the sheet in
the document's `page_content` attribute. In `"elements"` mode, an HTML
representation is also available under the `text_as_html` metadata key.

## Testing

```python
from langchain.document_loaders import UnstructuredExcelLoader

# "elements" mode also stores an HTML rendering under metadata["text_as_html"].
loader = UnstructuredExcelLoader(
    "example_data/stanley-cups.xlsx",
    mode="elements",
)
docs = loader.load()
```
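
To inspect the HTML output described above, the `text_as_html` metadata key can be read from each returned document. The following is a minimal sketch, not part of this PR: `StubDocument` and `collect_html` are hypothetical names, and the sample values are made up so the snippet runs without `langchain` or `unstructured` installed. It only assumes that each document exposes `page_content` and a `metadata` dict.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class StubDocument:
    """Hypothetical stand-in for a loaded document: text plus a metadata dict."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def collect_html(docs: Iterable[Any]) -> List[str]:
    """Return the `text_as_html` value of every document that carries one."""
    return [
        doc.metadata["text_as_html"]
        for doc in docs
        if "text_as_html" in doc.metadata
    ]


# Made-up sample values, only to exercise the helper without the real loader.
sample_docs = [
    StubDocument(
        "Team Location Stanley Cups",
        {"text_as_html": "<table><tr><td>Team</td></tr></table>"},
    ),
    StubDocument("no table here"),
]
html_fragments = collect_html(sample_docs)
print(len(html_fragments))  # 1
```

With the real loader, the equivalent call would presumably be `collect_html(loader.load())`, provided the loader was created with `mode="elements"`.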

## Who can review?

@hwchase17
@eyurtsev
2023-06-03 12:44:12 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| fake_discord_data | Harrison/discord loader (#3200) | 2023-04-19 21:04:12 -07:00 |
| test_repo1@7e525a3b91 | Add file filter param to Git loader (#2904) | 2023-04-14 10:45:54 -07:00 |
| conllu.conllu | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| facebook_chat.json | Vwp/docs improved document loaders (#4006) | 2023-05-02 15:24:53 -07:00 |
| fake_conversations.json | Add ChatGPT Data Loader (#3336) | 2023-04-22 09:06:24 -07:00 |
| fake_rule.toml | Harrison/toml loader (#4090) | 2023-05-03 23:14:39 -07:00 |
| fake-content.html | docs retriever improvements (#4430) | 2023-05-17 15:29:22 -07:00 |
| fake-email.eml | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| fake-email.msg | Harrison/msg files (#2375) | 2023-04-04 06:48:34 -07:00 |
| fake-power-point.pptx | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| fake.docx | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| fake.odt | feat: add loader for open office odt files (#4405) | 2023-05-10 01:37:17 -07:00 |
| layout-parser-paper.pdf | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| mlb_teams_2012.csv | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| notebook.ipynb | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| sitemap.xml | Harrison/sitemap local (#4704) | 2023-05-14 22:04:38 -07:00 |
| stanley-cups.xlsx | feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617) | 2023-06-03 12:44:12 -07:00 |
| telegram.json | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| testing.enex | feature/4493 Improve Evernote Document Loader (#4577) | 2023-05-19 14:28:17 -07:00 |
| testmw_pages_current.xml | Harrison/media wiki xml (#4072) | 2023-05-03 20:45:33 -07:00 |
| whatsapp_chat.txt | WhatsApp document loader - update regex (#2776) | 2023-04-13 09:48:32 -07:00 |