mirror of
https://github.com/hwchase17/langchain.git
synced 2025-08-30 08:12:53 +00:00
docs: fix pdfloaders' descriptions at https://python.langchain.com/docs/integrations/document_loaders/ All document loaders section (#31371)
…cs/integrations/document_loaders/ All document loaders section Thank you for contributing to LangChain! - [x] **PR title**: "package: description" - Where "package" is whichever of langchain, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes. - Example: "core: add foobar LLM" - [x] **PR message**: ***Delete this entire checklist*** and replace with - **Description:** a description of the change - **Issue:** the issue # it fixes, if applicable - **Dependencies:** any dependencies required for this change - **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] **Add tests and docs**: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. If no one reviews your PR within a few days, please @-mention one of baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
This commit is contained in:
parent
0478f544d5
commit
9bd956598d
@ -6,8 +6,6 @@
|
||||
"source": [
|
||||
"# UnstructuredPDFLoader\n",
|
||||
"\n",
|
||||
"## Overview\n",
|
||||
"\n",
|
||||
"[Unstructured](https://unstructured-io.github.io/unstructured/) supports a common interface for working with unstructured or semi-structured file formats, such as Markdown or PDF. LangChain's [UnstructuredPDFLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.pdf.UnstructuredPDFLoader.html) integrates with Unstructured to parse PDF documents into LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects.\n",
|
||||
"\n",
|
||||
"Please see [this page](/docs/integrations/providers/unstructured/) for more information on installing system requirements.\n",
|
||||
@ -34,7 +32,9 @@
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:"
|
||||
"source": [
|
||||
"To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
|
@ -6,7 +6,6 @@
|
||||
"source": [
|
||||
"# ZeroxPDFLoader\n",
|
||||
"\n",
|
||||
"## Overview\n",
|
||||
"`ZeroxPDFLoader` is a document loader that leverages the [Zerox](https://github.com/getomni-ai/zerox) library. Zerox converts PDF documents into images, processes them using a vision-capable language model, and generates a structured Markdown representation. This loader allows for asynchronous operations and provides page-level document extraction.\n",
|
||||
"\n",
|
||||
"### Integration details\n",
|
||||
|
Loading…
Reference in New Issue
Block a user