Mirror of https://github.com/hwchase17/langchain.git, synced 2025-05-19 05:51:16 +00:00
Adds a new PDF loader using the existing dependency on PDFMiner. The new loader can help with chunking text semantically into sections: its HTML output can be parsed with `BeautifulSoup` to recover structured information such as font sizes, page numbers, and PDF headers/footers, which other PDF loaders may not expose.
| | | |
|---|---|---|
| .. | | |
| examples | | |
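
The commit message above describes splitting a PDF into sections by inspecting font sizes in the loader's HTML output. The following is a minimal sketch of that idea, not the loader's actual implementation. To stay dependency-free it uses the standard-library `html.parser` in place of `BeautifulSoup`. The `SAMPLE_HTML` string is a hand-written stand-in that assumes text runs appear as `<span style="...font-size:Npx">` elements, and the names `FontSizeCollector`, `split_into_sections`, and `heading_min_size` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption: font sizes appear inline as "font-size:<N>px" in span styles.
FONT_SIZE_RE = re.compile(r"font-size:\s*(\d+)px")


class FontSizeCollector(HTMLParser):
    """Collect (font_size, text) runs from styled <span> elements."""

    def __init__(self):
        super().__init__()
        self.runs = []          # list of (font_size, text) tuples
        self._size_stack = []   # font size of each currently open <span>

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        style = dict(attrs).get("style") or ""
        match = FONT_SIZE_RE.search(style)
        self._size_stack.append(int(match.group(1)) if match else None)

    def handle_endtag(self, tag):
        if tag == "span" and self._size_stack:
            self._size_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text inside a span with a known font size.
        if text and self._size_stack and self._size_stack[-1] is not None:
            self.runs.append((self._size_stack[-1], text))


def split_into_sections(html, heading_min_size):
    """Treat runs at or above heading_min_size as headings; attach the rest as body."""
    collector = FontSizeCollector()
    collector.feed(html)
    collector.close()

    sections = []
    for size, text in collector.runs:
        if size >= heading_min_size:
            sections.append({"heading": text, "body": ""})
        elif sections:
            sections[-1]["body"] = (sections[-1]["body"] + " " + text).strip()
        else:
            # Body text that precedes the first heading.
            sections.append({"heading": "", "body": text})
    return sections


# Hypothetical sample standing in for the loader's HTML output.
SAMPLE_HTML = """
<div><span style="font-family: Times; font-size:18px">Introduction</span></div>
<div><span style="font-family: Times; font-size:10px">First paragraph.</span></div>
<div><span style="font-family: Times; font-size:10px">Second paragraph.</span></div>
<div><span style="font-family: Times; font-size:18px">Methods</span></div>
<div><span style="font-family: Times; font-size:10px">Details here.</span></div>
"""

sections = split_into_sections(SAMPLE_HTML, heading_min_size=16)
for section in sections:
    print(section["heading"], "->", section["body"])
```

A fixed size threshold is the simplest possible heuristic. Real documents may need the threshold derived from the distribution of font sizes, and page numbers or headers/footers would need their own rules.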