From 635399149814596bdfc96fd89389c5cb441e94b6 Mon Sep 17 00:00:00 2001
From: Salika Dave <salikadave26@gmail.com>
Date: Wed, 24 Apr 2024 18:24:11 -0400
Subject: [PATCH] docs: [Retrieval > .. > PDF] update package installation
 instructions for Unstructured and PDFMiner (#20723)

**Description:** Adds the commands to install the packages required before
using _Unstructured_ and _PDFMiner_ from `langchain_community`
**Documentation Page Being Updated:** [LangChain > Retrieval > Document
loaders > PDF > Using
Unstructured](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/#using-unstructured)
**Issue:** #20719
**Dependencies:** no dependencies
**Twitter handle:** SalikaDave

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
---
 .../modules/data_connection/document_loaders/pdf.mdx   | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/docs/modules/data_connection/document_loaders/pdf.mdx b/docs/docs/modules/data_connection/document_loaders/pdf.mdx
index 936aafdd89c..ec264f61f3f 100644
--- a/docs/docs/modules/data_connection/document_loaders/pdf.mdx
+++ b/docs/docs/modules/data_connection/document_loaders/pdf.mdx
@@ -129,6 +129,11 @@ data = loader.load()
 
 ## Using Unstructured
 
+The `unstructured` package supports loading many file types, including text files, PowerPoint presentations, HTML, PDFs, and images. For PDFs, install it with the `pdf` extra:
+
+```bash
+pip install "unstructured[pdf]"
+```
 
 ```python
 from langchain_community.document_loaders import UnstructuredPDFLoader
@@ -225,6 +230,11 @@ data = loader.load()
 
 ## Using PDFMiner
 
+PDFMiner is a tool for extracting and analyzing text data from PDF documents. `PDFMinerLoader` requires the `pdfminer.six` package:
+
+```bash
+pip install pdfminer.six
+```
 
 ```python
 from langchain_community.document_loaders import PDFMinerLoader