Mirror of https://github.com/hwchase17/langchain.git (synced 2025-08-29 06:23:20 +00:00)
docs: [Retrieval > .. > PDF] update package installation instructions for Unstructured and PDFMiner (#20723)
**Description:** Adds the commands to install the packages required before using _Unstructured_ and _PDFMiner_ from `langchain_community`.

**Documentation Page Being Updated:** [LangChain > Retrieval > Document loaders > PDF > Using Unstructured](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/#using-unstructured)

**Issue:** #20719

**Dependencies:** None

**Twitter handle:** SalikaDave

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Parent: a9e2e98708
Commit: 6353991498
@@ -129,6 +129,11 @@ data = loader.load()
 
 ## Using Unstructured
 
+The `unstructured[all-docs]` package currently supports loading of text files, powerpoints, html, pdfs, images, and more.
+
+```bash
+pip install unstructured[pdf]
+```
 
 ```python
 from langchain_community.document_loaders import UnstructuredPDFLoader
@@ -225,6 +230,11 @@ data = loader.load()
 
 ## Using PDFMiner
 
+PDFMiner is a tool that can help with extracting information and analyzing data from PDF documents.
+
+```bash
+pip install pdfminer.six
+```
 
 ```python
 from langchain_community.document_loaders import PDFMinerLoader
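Review note: the two install commands added in this change can be checked before a loader is constructed. The snippet below is a minimal sketch and not part of the commit. The helper name `missing_packages` and the `REQUIRED_MODULES` mapping are hypothetical. The sketch assumes that `unstructured` and `pdfminer.six` expose the top-level modules `unstructured` and `pdfminer`. It only tests whether those modules can be found, so it cannot confirm that the `[pdf]` extras of `unstructured` are present.

```python
from importlib.util import find_spec

# Hypothetical mapping: pip requirement -> top-level module it is assumed to provide.
REQUIRED_MODULES = {
    "unstructured[pdf]": "unstructured",
    "pdfminer.six": "pdfminer",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip requirement names whose top-level module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    # Print an install hint for each requirement that appears to be absent.
    for pip_name in missing_packages():
        print(f'Missing dependency, try: pip install "{pip_name}"')
```

Quoting the requirement in the printed hint keeps the square brackets in `unstructured[pdf]` from being interpreted as a glob pattern in shells such as zsh.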