Harrison/unstructured page number (#6464)

Co-authored-by: Reza Sanaie <reza@sanaie.ca>
Harrison Chase
2023-06-19 22:31:43 -07:00
committed by GitHub
parent b82ddf9cfb
commit 9eec7c3206
3 changed files with 51 additions and 4 deletions

@@ -226,13 +226,17 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "8de9ef16",
"metadata": {},
"source": [
"## PDF Example\n",
"\n",
"Processing PDF documents works exactly the same way. Unstructured detects the file type and extracts the same types of `elements`. "
"Processing PDF documents works exactly the same way. Unstructured detects the file type and extracts the same types of elements. The available modes of operation are:\n",
"- `single`: the text from all elements is combined into one document (default)\n",
"- `elements`: individual elements are kept separate\n",
"- `paged`: the text from each page is combined, giving one document per page"
]
},
{