From 2fdccd789c5c2f86002b6668592246f9f539ca06 Mon Sep 17 00:00:00 2001
From: Ahmad Elmalah
Date: Mon, 14 Jul 2025 18:36:29 +0300
Subject: [PATCH] docs: update Textract docs (#31992)

This change does two things:

1. Replaces "This sample demonstrates" with "The following samples demonstrate", since the notebook contains at least 4 samples.
2. Moves that sentence after the definition of Textract to keep the document organized (Textract definition, then samples).

---------

Co-authored-by: Mason Daugherty
---
 .../document_loaders/amazon_textract.ipynb | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/docs/docs/integrations/document_loaders/amazon_textract.ipynb b/docs/docs/integrations/document_loaders/amazon_textract.ipynb
index b76b6ddf630..71da1059ea1 100644
--- a/docs/docs/integrations/document_loaders/amazon_textract.ipynb
+++ b/docs/docs/integrations/document_loaders/amazon_textract.ipynb
@@ -11,11 +11,9 @@
     ">\n",
     ">It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables. Today, many companies manually extract data from scanned documents such as PDFs, images, tables, and forms, or through simple OCR software that requires manual configuration (which often must be updated when the form changes). To overcome these manual and expensive processes, `Textract` uses ML to read and process any type of document, accurately extracting text, handwriting, tables, and other data with no manual effort. \n",
     "\n",
-    "This sample demonstrates the use of `Amazon Textract` in combination with LangChain as a DocumentLoader.\n",
+    "`Textract` supports `JPEG`, `PNG`, `PDF`, and `TIFF` file formats; more information is available in [the documentation](https://docs.aws.amazon.com/textract/latest/dg/limits-document.html).\n",
     "\n",
-    "`Textract` supports`PDF`, `TIFF`, `PNG` and `JPEG` format.\n",
-    "\n",
-    "`Textract` supports these [document sizes, languages and characters](https://docs.aws.amazon.com/textract/latest/dg/limits-document.html)."
+    "The following samples demonstrate the use of `Amazon Textract` in combination with LangChain as a DocumentLoader."
    ]
   },
   {
@@ -310,17 +308,6 @@
     "\n",
     "chain.run(input_documents=documents, question=query)"
    ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "bd97f1c90aff6a83",
-   "metadata": {
-    "collapsed": false,
-    "jupyter": {
-     "outputs_hidden": false
-    }
-   },
-   "source": []
   }
  ],
  "metadata": {