docs: [Retrieval > .. > PDF] update package installation instructions for Unstructured and PDFMiner (#20723)

**Description:** Adds the commands to install the packages required before using the _Unstructured_ and _PDFMiner_ loaders from `langchain_community`.

**Documentation Page Being Updated:** [LangChain > Retrieval > Document loaders > PDF > Using Unstructured](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/#using-unstructured)

**Issue:** #20719

**Dependencies:** no dependencies

**Twitter handle:** SalikaDave

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2025-08-31 02:11:09 +00:00 · 2024-04-24 18:24:11 -04:00
parent a9e2e98708
commit 6353991498
1 changed file with 10 additions and 0 deletions
--- a/docs/docs/modules/data_connection/document_loaders/pdf.mdx
+++ b/docs/docs/modules/data_connection/document_loaders/pdf.mdx
@@ -129,6 +129,11 @@ data = loader.load()

 ## Using Unstructured

+The `unstructured[all-docs]` package supports loading text files, PowerPoint presentations, HTML, PDFs, images, and more. For PDFs alone, the `pdf` extra is sufficient.
+
+```bash
+pip install "unstructured[pdf]"
+```

 ```python
 from langchain_community.document_loaders import UnstructuredPDFLoader
@@ -225,6 +230,11 @@ data = loader.load()

 ## Using PDFMiner

+PDFMiner is a tool for extracting text and analyzing layout data from PDF documents.
+
+```bash
+pip install pdfminer.six
+```

 ```python
 from langchain_community.document_loaders import PDFMinerLoader