docs: fix pdfloaders' descriptions at https://python.langchain.com/docs/integrations/document_loaders/ All document loaders section (#31371)

…cs/integrations/document_loaders/ All document loaders section
2026-01-25 22:49:59 +00:00 · 2025-05-28 05:56:55 +10:00
parent 0478f544d5
commit 9bd956598d
2 changed files with 3 additions and 4 deletions
--- a/docs/docs/integrations/document_loaders/unstructured_pdfloader.ipynb
+++ b/docs/docs/integrations/document_loaders/unstructured_pdfloader.ipynb
@@ -6,8 +6,6 @@
      "source": [
        "# UnstructuredPDFLoader\n",
        "\n",
-        "## Overview\n",
-        "\n",
        "[Unstructured](https://unstructured-io.github.io/unstructured/) supports a common interface for working with unstructured or semi-structured file formats, such as Markdown or PDF. LangChain's [UnstructuredPDFLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.pdf.UnstructuredPDFLoader.html) integrates with Unstructured to parse PDF documents into LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects.\n",
        "\n",
        "Please see [this page](/docs/integrations/providers/unstructured/) for more information on installing system requirements.\n",
@@ -34,7 +32,9 @@
    {
      "cell_type": "markdown",
      "metadata": {},
-      "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:"
+      "source": [
+        "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:"
+      ]
    },
    {
      "cell_type": "code",
--- a/docs/docs/integrations/document_loaders/zeroxpdfloader.ipynb
+++ b/docs/docs/integrations/document_loaders/zeroxpdfloader.ipynb
@@ -6,7 +6,6 @@
   "source": [
    "# ZeroxPDFLoader\n",
    "\n",
-    "## Overview\n",
    "`ZeroxPDFLoader` is a document loader that leverages the [Zerox](https://github.com/getomni-ai/zerox) library. Zerox converts PDF documents into images, processes them using a vision-capable language model, and generates a structured Markdown representation. This loader allows for asynchronous operations and provides page-level document extraction.\n",
    "\n",
    "### Integration details\n",