mirror of https://github.com/hwchase17/langchain.git (synced 2025-06-30 10:23:30 +00:00)

docs: Microsoft platform page update (#15420)

Added two new document_loader references. Improved the format consistency of the example pages.

parent b8c6ebf647
commit 1e6519edc2
@@ -8,10 +8,10 @@
 "# Azure AI Data\n",
 "\n",
 ">[Azure AI Studio](https://ai.azure.com/) provides the capability to upload data assets to cloud storage and register existing data assets from the following sources:\n",
-"\n",
-"- Microsoft OneLake\n",
-"- Azure Blob Storage\n",
-"- Azure Data Lake gen 2\n",
+">\n",
+">- `Microsoft OneLake`\n",
+">- `Azure Blob Storage`\n",
+">- `Azure Data Lake gen 2`\n",
 "\n",
 "The benefit of this approach over `AzureBlobStorageContainerLoader` and `AzureBlobStorageFileLoader` is that authentication is handled seamlessly to cloud storage. You can use either *identity-based* data access control to the data or *credential-based* (e.g. SAS token, account key). In the case of credential-based data access you do not need to specify secrets in your code or set up key vaults - the system handles that for you.\n",
 "\n",
@@ -166,7 +166,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.11.6"
+"version": "3.10.12"
 }
 },
 "nbformat": 4,
@@ -13,22 +13,31 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Azure AI Document Intelligence (formerly known as Azure Form Recognizer) is machine-learning \n",
-"based service that extracts text (including handwriting), tables or key-value-pairs from\n",
-"scanned documents or images.\n",
+">[Azure AI Document Intelligence](https://aka.ms/doc-intelligence) (formerly known as `Azure Form Recognizer`) is machine-learning \n",
+">based service that extracts text (including handwriting), tables or key-value-pairs from\n",
+">scanned documents or images.\n",
+">\n",
+">Document Intelligence supports `PDF`, `JPEG`, `PNG`, `BMP`, or `TIFF`.\n",
 "\n",
-"This current implementation of a loader using Document Intelligence is able to incorporate content page-wise and turn it into LangChain documents.\n",
-"\n",
-"Document Intelligence supports PDF, JPEG, PNG, BMP, or TIFF.\n",
-"\n",
-"Further documentation is available at https://aka.ms/doc-intelligence.\n"
+"This current implementation of a loader using `Document Intelligence` can incorporate content page-wise and turn it into LangChain documents.\n"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 1,
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"\n",
+"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.3.2\u001b[0m\n",
+"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython3 -m pip install --upgrade pip\u001b[0m\n",
+"Note: you may need to restart the kernel to use updated packages.\n"
+]
+}
+],
 "source": [
 "%pip install langchain langchain-community azure-ai-documentintelligence -q"
 ]
@@ -126,7 +135,7 @@
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
@@ -140,7 +149,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.8.10"
+"version": "3.10.12"
 },
 "vscode": {
 "interpreter": {
@@ -49,6 +49,50 @@ from langchain_community.llms import AzureOpenAI
 
 ## Document loaders
 
+### Azure AI Data
+
+>[Azure AI Studio](https://ai.azure.com/) provides the capability to upload data assets
+> to cloud storage and register existing data assets from the following sources:
+>
+>- `Microsoft OneLake`
+>- `Azure Blob Storage`
+>- `Azure Data Lake gen 2`
+
+First, you need to install several Python packages.
+
+```bash
+pip install azureml-fsspec azure-ai-generative
+```
+
+See a [usage example](/docs/integrations/document_loaders/azure_ai_data).
+
+```python
+from langchain.document_loaders import AzureAIDataLoader
+```
+
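Editor's note on the loader added above: once the packages are installed, wiring it up might look roughly like the sketch below. This is not part of the commit; the `url` and `glob` parameter names and the placeholder asset URI are assumptions, so check the linked usage example before relying on them.

```python
# Hypothetical sketch of AzureAIDataLoader usage; the parameter names
# (`url`, `glob`) and the asset URI below are assumptions, not verified.
loader_kwargs = {
    # URI of a data asset registered in Azure AI Studio (placeholder value)
    "url": "azureml://datastores/workspaceblobstore/paths/reports/",
    # optionally restrict which files inside the asset are loaded
    "glob": "**/*.pdf",
}

# With `azureml-fsspec` and `azure-ai-generative` installed, the real call
# would be along the lines of:
# from langchain.document_loaders import AzureAIDataLoader
# docs = AzureAIDataLoader(**loader_kwargs).load()
```

Identity-based access (as described in the notebook diff above) would need no secrets in `loader_kwargs` at all; credential-based access would rely on a SAS token or account key handled by the service.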
+### Azure AI Document Intelligence
+
+>[Azure AI Document Intelligence](https://aka.ms/doc-intelligence) (formerly known
+> as `Azure Form Recognizer`) is machine-learning
+> based service that extracts text (including handwriting), tables or key-value-pairs
+> from scanned documents or images.
+>
+>Document Intelligence supports `PDF`, `JPEG`, `PNG`, `BMP`, or `TIFF`.
+
+First, you need to install a Python package.
+
+```bash
+pip install azure-ai-documentintelligence
+```
+
+See a [usage example](/docs/integrations/document_loaders/azure_document_intelligence).
+
+```python
+from langchain.document_loaders import AzureAIDocumentIntelligenceLoader
+```
+
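Editor's note: the notebook diff above says this loader incorporates content page-wise as LangChain documents. The self-contained sketch below illustrates that shape with a stand-in dataclass instead of the real `langchain_core.documents.Document`, and with invented page contents; it is illustrative only, not part of the commit.

```python
from dataclasses import dataclass, field

# Stand-in for LangChain's Document (page_content + metadata), so this
# sketch runs without langchain installed.
@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

# A page-wise loader would return roughly one Document per page of the
# scanned file (contents here are invented for illustration):
pages = [
    Document("Invoice total: $120.00", {"page": 1}),
    Document("Payment due within 30 days.", {"page": 2}),
]

# Downstream code can join or filter pages before indexing.
full_text = "\n".join(d.page_content for d in pages)
```

The same post-processing would apply to the real loader's output, since both expose `page_content` and `metadata`.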
 ### Azure Blob Storage
 
 >[Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction) is Microsoft's object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of unstructured data. Unstructured data is data that doesn't adhere to a particular data model or definition, such as text or binary data.