From 8961c720b8bc9f3947a552e62e9952dcfc8dd1a8 Mon Sep 17 00:00:00 2001
From: Matt Robinson
Date: Tue, 1 Aug 2023 17:17:49 -0400
Subject: [PATCH] docs: update `unstructured` install instructions (#8596)

### Summary

Updates the `unstructured` install instructions. For
`unstructured>=0.9.0`, dependencies are broken out by document type and
the base `unstructured` package includes fewer dependencies. `pip
install "unstructured[local-inference]"` has been replaced by `pip
install "unstructured[all-docs]"`, though the `local-inference` extra is
still supported for the time being.

### Reviewers
- @rlancemartin
- @eyurtsev
- @hwchase17
---
 .../integrations/document_loaders/unstructured_file.ipynb | 3 +--
 docs/extras/integrations/providers/unstructured.mdx       | 4 +++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/extras/integrations/document_loaders/unstructured_file.ipynb b/docs/extras/integrations/document_loaders/unstructured_file.ipynb
index 566fa027881..4653d1f41aa 100644
--- a/docs/extras/integrations/document_loaders/unstructured_file.ipynb
+++ b/docs/extras/integrations/document_loaders/unstructured_file.ipynb
@@ -18,8 +18,7 @@
    "outputs": [],
    "source": [
     "# # Install package\n",
-    "!pip install \"unstructured[local-inference]\"\n",
-    "!pip install layoutparser[layoutmodels,tesseract]"
+    "!pip install \"unstructured[all-docs]\"\n"
    ]
   },
   {
diff --git a/docs/extras/integrations/providers/unstructured.mdx b/docs/extras/integrations/providers/unstructured.mdx
index 8a6699e2588..b0bccdbc94a 100644
--- a/docs/extras/integrations/providers/unstructured.mdx
+++ b/docs/extras/integrations/providers/unstructured.mdx
@@ -11,7 +11,9 @@ ecosystem within LangChain.
 If you are using a loader that runs locally, use the following steps to get `unstructured` and its
 dependencies running locally.
 
-- Install the Python SDK with `pip install "unstructured[local-inference]"`
+- Install the Python SDK with `pip install unstructured`.
+  - You can install document specific dependencies with extras, i.e. `pip install "unstructured[docx]"`.
+  - To install the dependencies for all document types, use `pip install "unstructured[all-docs]"`.
 - Install the following system dependencies if they are not already available on your system.
   Depending on what document types you're parsing, you may not need all of these.
     - `libmagic-dev` (filetype detection)