Mirror of https://github.com/hwchase17/langchain.git (synced 2026-01-21 21:56:38 +00:00)
Compare commits
4 Commits
langchain-
...
bagatur/go
| Author | SHA1 | Date | |
|---|---|---|---|
| | 15951239df | | |
| | 6def0a4ed0 | | |
| | 80f5e05181 | | |
| | 7fe77245af | | |
@@ -2,14 +2,11 @@
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b0ed136e-6983-4893-ae1b-b75753af05f8",
|
||||
"id": "0b02f34c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Google Drive\n",
|
||||
"\n",
|
||||
">[Google Drive](https://en.wikipedia.org/wiki/Google_Drive) is a file storage and synchronization service developed by Google.\n",
|
||||
"\n",
|
||||
"This notebook covers how to load documents from `Google Drive`. Currently, only `Google Docs` are supported.\n",
|
||||
"# Google Drive Loader\n",
|
||||
"This notebook covers how to retrieve documents from Google Drive.\n",
|
||||
"\n",
|
||||
"## Prerequisites\n",
|
||||
"\n",
|
||||
@@ -18,12 +15,21 @@
|
||||
"1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n",
|
||||
"1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
|
||||
"\n",
|
||||
"## 🧑 Instructions for ingesting your Google Docs data\n",
|
||||
"By default, the `GoogleDriveLoader` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `credentials_path` keyword argument. Same thing with `token.json` - `token_path`. Note that `token.json` will be created automatically the first time you use the loader.\n",
|
||||
"\n",
|
||||
"`GoogleDriveLoader` can load from a list of Google Docs document ids or a folder id. You can obtain your folder and document id from the URL:\n",
|
||||
"## Instructions for retrieving your Google Docs data\n",
|
||||
"By default, the `GoogleDriveLoader` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable. \n",
|
||||
"By default, `token.json` is stored in the same directory (or at the location given by the `token_path` parameter). Note that `token.json` will be created automatically the first time you use the loader.\n"
|
||||
]
|
||||
},
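The lookup order described in this cell can be sketched in plain Python. This is a minimal illustration, not the loader's actual code: the helper names `resolve_credentials_path` and `default_token_path` are hypothetical, and only the `GOOGLE_ACCOUNT_FILE` variable, the `~/.credentials/credentials.json` default, and the "same directory" rule for `token.json` come from the text.

```python
import os
from pathlib import Path

# Default location named in the notebook text.
DEFAULT_CREDENTIALS = Path.home() / ".credentials" / "credentials.json"


def resolve_credentials_path(env=None):
    """Hypothetical helper: prefer GOOGLE_ACCOUNT_FILE, else the default path."""
    env = os.environ if env is None else env
    override = env.get("GOOGLE_ACCOUNT_FILE")
    return Path(override).expanduser() if override else DEFAULT_CREDENTIALS


def default_token_path(credentials_path):
    """Assumed convention: token.json sits next to credentials.json."""
    return Path(credentials_path).parent / "token.json"
```

Passing a plain dictionary as `env` makes the behaviour easy to check without touching the real environment.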
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a03b9067",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can obtain your folder and document id from the URL:\n",
|
||||
"* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",
|
||||
"* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`"
|
||||
"* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n",
|
||||
"\n",
|
||||
"The special value `root` refers to the root folder of your personal drive."
|
||||
]
|
||||
},
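For readers who prefer to pull the ids out programmatically, here is a small sketch covering only the two URL shapes shown in this cell. The function name `extract_drive_id` and the regular expressions are illustrative assumptions; other Drive URL forms are not handled.

```python
import re
from urllib.parse import urlparse

# Patterns for the two URL shapes shown in the notebook; nothing else is covered.
_FOLDER_RE = re.compile(r"/folders/([A-Za-z0-9_-]+)")
_DOCUMENT_RE = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_drive_id(url):
    """Return ("folder" | "document", id) for a Drive URL, or None if unrecognized."""
    path = urlparse(url).path
    for kind, pattern in (("folder", _FOLDER_RE), ("document", _DOCUMENT_RE)):
        match = pattern.search(path)
        if match:
            return kind, match.group(1)
    return None
```

The returned id can then be passed as `folder_id` or placed in a list of document ids.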
|
||||
{
|
||||
@@ -33,12 +39,23 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
|
||||
"#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"execution_count": null,
|
||||
"id": "9bcb6cb1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"folder_id='root'\n",
|
||||
"#folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "878928a6-a5ae-4f74-b351-64e3b01733fe",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
@@ -50,7 +67,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"execution_count": null,
|
||||
"id": "2216c83f-68e4-4d2f-8ea2-5878fb18bbe7",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
@@ -58,174 +75,215 @@
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = GoogleDriveLoader(\n",
|
||||
" folder_id=\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\",\n",
|
||||
" # Optional: configure whether to recursively fetch files from subfolders. Defaults to False.\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" recursive=False,\n",
|
||||
" num_results=2, # Maximum number of files to load\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "de5be5d4",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"By default, all files with the following MIME types can be converted to `Document`.\n",
|
||||
"- text/text\n",
|
||||
"- text/plain\n",
|
||||
"- text/html\n",
|
||||
"- text/csv\n",
|
||||
"- text/markdown\n",
|
||||
"- image/png\n",
|
||||
"- image/jpeg\n",
|
||||
"- application/epub+zip\n",
|
||||
"- application/pdf\n",
|
||||
"- application/rtf\n",
|
||||
"- application/vnd.google-apps.document (GDoc)\n",
|
||||
"- application/vnd.google-apps.presentation (GSlide)\n",
|
||||
"- application/vnd.google-apps.spreadsheet (GSheet)\n",
|
||||
"- application/vnd.google.colaboratory (Colab notebook)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n",
|
||||
"\n",
|
||||
"This list can be updated or customized; see the documentation of `GoogleDriveLoader`.\n",
|
||||
"\n",
|
||||
"Note that the corresponding packages must be installed."
|
||||
]
|
||||
},
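As a rough local pre-check, the MIME list in this cell can be compared against a type guessed from a file extension. This is only a sketch: `is_supported` is a hypothetical helper, the set is copied from the list in the cell, and Google-native types (Docs, Sheets, Slides) have no file extension, so they cannot be guessed this way and depend on the MIME type Drive reports for the file.

```python
import mimetypes

# MIME types copied from the list in the notebook cell.
SUPPORTED_MIME_TYPES = {
    "text/text",
    "text/plain",
    "text/html",
    "text/csv",
    "text/markdown",
    "image/png",
    "image/jpeg",
    "application/epub+zip",
    "application/pdf",
    "application/rtf",
    "application/vnd.google-apps.document",
    "application/vnd.google-apps.presentation",
    "application/vnd.google-apps.spreadsheet",
    "application/vnd.google.colaboratory",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}


def is_supported(filename):
    """Guess a MIME type from the file extension and check it against the list."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type in SUPPORTED_MIME_TYPES
```

A file with an unknown or missing extension yields `None` from `mimetypes.guess_type` and is therefore reported as unsupported.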
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"execution_count": null,
|
||||
"id": "1bca45c9",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!pip install unstructured"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "8f3b6aa0-b45d-4e37-8c50-5bebe70fdb9d",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"docs = loader.load()"
|
||||
"for doc in loader.load():\n",
|
||||
" print(\"---\")\n",
|
||||
" print(doc.page_content.strip()[:60]+\"...\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2721ba8a",
|
||||
"id": "31170e71",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"When you pass a `folder_id` by default all files of type document, sheet and pdf are loaded. You can modify this behaviour by passing a `file_types` argument "
|
||||
"# Customize the search pattern\n",
|
||||
"\n",
|
||||
"All parameters compatible with the Google [`list()`](https://developers.google.com/drive/api/v3/reference/files/list)\n",
|
||||
"API can be set.\n",
|
||||
"\n",
|
||||
"To specify the new pattern of the Google request, you can use a `PromptTemplate()`.\n",
|
||||
"The variables for the prompt can be set with `kwargs` in the constructor.\n",
|
||||
"Some preformatted requests are provided (use `{query}`, `{folder_id}` and/or `{mime_type}`):\n",
|
||||
"\n",
|
||||
"You can customize the criteria to select the files. A set of predefined filters is provided:\n",
|
||||
"| template | description |\n",
|
||||
"| -------------------------------------- | --------------------------------------------------------------------- |\n",
|
||||
"| gdrive-all-in-folder | Return all compatible files from a `folder_id` |\n",
|
||||
"| gdrive-query | Search `query` in all drives |\n",
|
||||
"| gdrive-by-name | Search file with name `query` |\n",
|
||||
"| gdrive-query-in-folder | Search `query` in `folder_id` (and sub-folders if `_recursive=true`) |\n",
|
||||
"| gdrive-mime-type | Search a specific `mime_type` |\n",
|
||||
"| gdrive-mime-type-in-folder | Search a specific `mime_type` in `folder_id` |\n",
|
||||
"| gdrive-query-with-mime-type | Search `query` with a specific `mime_type` |\n",
|
||||
"| gdrive-query-with-mime-type-and-folder | Search `query` with a specific `mime_type` and in `folder_id` |\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2ff83b4c",
|
||||
"id": "0a47175f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = GoogleDriveLoader(\n",
|
||||
" folder_id=\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\",\n",
|
||||
" file_types=[\"document\", \"sheet\"]\n",
|
||||
" recursive=False\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "d6b80931",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Passing in Optional File Loaders\n",
|
||||
"\n",
|
||||
"When processing files other than Google Docs and Google Sheets, it can be helpful to pass an optional file loader to `GoogleDriveLoader`. If you pass in a file loader, that file loader will be used on documents that do not have a Google Docs or Google Sheets MIME type. Here is an example of how to load an Excel document from Google Drive using a file loader. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "94207e39",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.document_loaders import GoogleDriveLoader\n",
|
||||
"from langchain.document_loaders import UnstructuredFileIOLoader"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "a15fbee0",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"file_id = \"1x9WBtFPWMEAdjcJzPScRsjpjQvpSo_kz\"\n",
|
||||
"loader = GoogleDriveLoader(\n",
|
||||
" file_ids=[file_id],\n",
|
||||
" file_loader_cls=UnstructuredFileIOLoader,\n",
|
||||
" file_loader_kwargs={\"mode\": \"elements\"},\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "98410bda",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"docs = loader.load()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "e3e72221",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Document(page_content='\\n \\n \\n Team\\n Location\\n Stanley Cups\\n \\n \\n Blues\\n STL\\n 1\\n \\n \\n Flyers\\n PHI\\n 2\\n \\n \\n Maple Leafs\\n TOR\\n 13\\n \\n \\n', metadata={'filetype': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', 'page_number': 1, 'page_name': 'Stanley Cups', 'text_as_html': '<table border=\"1\" class=\"dataframe\">\\n <tbody>\\n <tr>\\n <td>Team</td>\\n <td>Location</td>\\n <td>Stanley Cups</td>\\n </tr>\\n <tr>\\n <td>Blues</td>\\n <td>STL</td>\\n <td>1</td>\\n </tr>\\n <tr>\\n <td>Flyers</td>\\n <td>PHI</td>\\n <td>2</td>\\n </tr>\\n <tr>\\n <td>Maple Leafs</td>\\n <td>TOR</td>\\n <td>13</td>\\n </tr>\\n </tbody>\\n</table>', 'category': 'Table', 'source': 'https://drive.google.com/file/d/1aA6L2AR3g0CR-PW03HEZZo4NaVlKpaP7/view'})"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"docs[0]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "238cd06f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can also process a folder with a mix of files and Google Docs/Sheets using the following pattern:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "0e2d093f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"folder_id = \"1asMOHY1BqBS84JcRbOag5LOJac74gpmD\"\n",
|
||||
"loader = GoogleDriveLoader(\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" file_loader_cls=UnstructuredFileIOLoader,\n",
|
||||
" file_loader_kwargs={\"mode\": \"elements\"},\n",
|
||||
" recursive=False,\n",
|
||||
" template=\"gdrive-query\", # Default template to use\n",
|
||||
" query=\"machine learning\",\n",
|
||||
" num_results=2, # Maximum number of files to load\n",
|
||||
" supportsAllDrives=False, # GDrive `list()` parameter\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "b35ddcc6",
|
||||
"execution_count": null,
|
||||
"id": "100cf361",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"docs = loader.load()"
|
||||
"for doc in loader.load():\n",
|
||||
" print(\"---\")\n",
|
||||
" print(doc.page_content.strip()[:60]+\"...\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "3cc141e0",
|
||||
"cell_type": "markdown",
|
||||
"id": "74e6e3aa",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Document(page_content='\\n \\n \\n Team\\n Location\\n Stanley Cups\\n \\n \\n Blues\\n STL\\n 1\\n \\n \\n Flyers\\n PHI\\n 2\\n \\n \\n Maple Leafs\\n TOR\\n 13\\n \\n \\n', metadata={'filetype': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', 'page_number': 1, 'page_name': 'Stanley Cups', 'text_as_html': '<table border=\"1\" class=\"dataframe\">\\n <tbody>\\n <tr>\\n <td>Team</td>\\n <td>Location</td>\\n <td>Stanley Cups</td>\\n </tr>\\n <tr>\\n <td>Blues</td>\\n <td>STL</td>\\n <td>1</td>\\n </tr>\\n <tr>\\n <td>Flyers</td>\\n <td>PHI</td>\\n <td>2</td>\\n </tr>\\n <tr>\\n <td>Maple Leafs</td>\\n <td>TOR</td>\\n <td>13</td>\\n </tr>\\n </tbody>\\n</table>', 'category': 'Table', 'source': 'https://drive.google.com/file/d/1aA6L2AR3g0CR-PW03HEZZo4NaVlKpaP7/view'})"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"docs[0]"
|
||||
"You can customize your pattern."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e312268a",
|
||||
"id": "dcf07ff7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
"source": [
|
||||
"from langchain.prompts.prompt import PromptTemplate\n",
|
||||
"loader = GoogleDriveLoader(\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" recursive=False,\n",
|
||||
" template=PromptTemplate(\n",
|
||||
" input_variables=[\"query\", \"query_name\"],\n",
|
||||
" template=\"fullText contains '{query}' and name contains '{query_name}' and trashed=false\",\n",
|
||||
" ), # Default template to use\n",
|
||||
" query=\"machine learning\",\n",
|
||||
" query_name=\"ML\", \n",
|
||||
" num_results=2, # Maximum number of files to load\n",
|
||||
")\n",
|
||||
"for doc in loader.load():\n",
|
||||
" print(\"---\")\n",
|
||||
" print(doc.page_content.strip()[:60]+\"...\")"
|
||||
]
|
||||
},
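To see what search string a template like the one in this cell would yield, the substitution can be reproduced with plain `str.format`. This is a sketch under the assumption that the keyword arguments are substituted into the template in a comparable way; `build_query` and `escape_value` are hypothetical helpers, not part of `GoogleDriveLoader`. The single-quote escaping follows the Drive search syntax, where a quote inside a value is written as `\'`.

```python
# Template copied from the notebook cell.
TEMPLATE = "fullText contains '{query}' and name contains '{query_name}' and trashed=false"


def escape_value(value):
    """Escape single quotes so the value can sit inside a quoted Drive search term."""
    return value.replace("'", "\\'")


def build_query(template, **variables):
    """Hypothetical helper: fill the template placeholders with escaped values."""
    return template.format(**{k: escape_value(v) for k, v in variables.items()})


print(build_query(TEMPLATE, query="machine learning", query_name="ML"))
# prints: fullText contains 'machine learning' and name contains 'ML' and trashed=false
```

Printing the expanded string before running a search is a cheap way to catch a missing placeholder or an unbalanced quote.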
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "8e404472",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Modes for GSlide and GSheet\n",
|
||||
"\n",
|
||||
"The `mode` parameter accepts the following values:\n",
|
||||
"- `\"document\"`: returns the body of each document\n",
|
||||
"- `\"snippets\"`: returns the `description` of each file\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"The `gslide_mode` parameter accepts the following values:\n",
|
||||
"- `\"single\"`: one document with `<PAGE BREAK>` markers\n",
|
||||
"- `\"slide\"`: one document per slide\n",
|
||||
"- `\"elements\"`: one document per element"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b33d1a53",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = GoogleDriveLoader(\n",
|
||||
" template=\"gdrive-mime-type\",\n",
|
||||
" mime_type=\"application/vnd.google-apps.presentation\", # Only GSlide files\n",
|
||||
" gslide_mode=\"slide\",\n",
|
||||
" num_results=2, # Maximum number of files to load\n",
|
||||
")\n",
|
||||
"for doc in loader.load():\n",
|
||||
" print(\"---\")\n",
|
||||
" print(doc.page_content.strip()[:60]+\"...\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "498f0451",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The `gsheet_mode` parameter accepts the following values:\n",
|
||||
"- `\"single\"`: one document per line\n",
|
||||
"- `\"elements\"`: one document with a Markdown table and `<PAGE BREAK>` tags"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "884c4ca6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = GoogleDriveLoader(\n",
|
||||
" template=\"gdrive-mime-type\",\n",
|
||||
" mime_type=\"application/vnd.google-apps.spreadsheet\", # Only GSheet files\n",
|
||||
" gsheet_mode=\"elements\",\n",
|
||||
" num_results=2, # Maximum number of files to load\n",
|
||||
")\n",
|
||||
"for doc in loader.load():\n",
|
||||
" print(\"---\")\n",
|
||||
" print(doc.page_content.strip()[:60]+\"...\")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
@@ -244,7 +302,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.8.13"
|
||||
"version": "3.10.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
>[Google Drive](https://en.wikipedia.org/wiki/Google_Drive) is a file storage and synchronization service developed by Google.
|
||||
|
||||
Currently, only `Google Docs` are supported.
|
||||
The full Google Drive API is supported, with integrations for Google Docs, Google Sheets, and Google Slides.
|
||||
|
||||
## Installation and Setup
|
||||
|
||||
@@ -20,3 +20,22 @@ See a [usage example and authorizing instructions](/docs/integrations/document_l
|
||||
```python
|
||||
from langchain.document_loaders import GoogleDriveLoader
|
||||
```
|
||||
|
||||
## Retriever
|
||||
|
||||
See a [usage example and authorizing instructions](/docs/modules/data_connection/retrievers/integrations/google_drive.html).
|
||||
|
||||
```python
|
||||
from langchain.retrievers import GoogleDriveRetriever
|
||||
```
|
||||
|
||||
## Tools
|
||||
|
||||
See a [usage example and authorizing instructions](/docs/modules/agents/tools/integrations/google_drive.html).
|
||||
|
||||
```python
|
||||
from langchain.tools import GoogleDriveSearchTool
|
||||
from langchain.utilities import GoogleDriveAPIWrapper
|
||||
```
|
||||
|
||||
|
||||
|
||||
279
docs/extras/integrations/retrievers/google_drive.ipynb
Normal file
@@ -0,0 +1,279 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b0ed136e-6983-4893-ae1b-b75753af05f8",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Google Drive Retriever\n",
|
||||
"This notebook covers how to retrieve documents from Google Drive.\n",
|
||||
"\n",
|
||||
"## Prerequisites\n",
|
||||
"\n",
|
||||
"1. Create a Google Cloud project or use an existing project\n",
|
||||
"1. Enable the [Google Drive API](https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com)\n",
|
||||
"1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n",
|
||||
"1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
|
||||
"\n",
|
||||
"## Instructions for retrieving your Google Docs data\n",
|
||||
"By default, the `GoogleDriveRetriever` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable. \n",
|
||||
"By default, `token.json` is stored in the same directory (or at the location given by the `token_path` parameter). Note that `token.json` will be created automatically the first time you use the retriever.\n",
|
||||
"\n",
|
||||
"`GoogleDriveRetriever` can retrieve a selection of files that match a request. \n",
|
||||
"\n",
|
||||
"By default, if you use a `folder_id`, all the files inside this folder can be retrieved as `Document` objects.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "35b94a93-97de-4af8-9cca-de9ffb7930c3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can obtain your folder and document id from the URL:\n",
|
||||
"* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",
|
||||
"* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n",
|
||||
"\n",
|
||||
"The special value `root` refers to the root folder of your personal drive."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "9c9665c9-a023-4078-9d95-e43021cecb6f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "878928a6-a5ae-4f74-b351-64e3b01733fe",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2023-05-09T10:45:59.438650905Z",
|
||||
"start_time": "2023-05-09T10:45:57.955900302Z"
|
||||
},
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.retrievers import GoogleDriveRetriever"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "755907c2-145d-4f0f-9b15-07a628a2d2d2",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2023-05-09T10:45:59.442890834Z",
|
||||
"start_time": "2023-05-09T10:45:59.440941528Z"
|
||||
},
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"folder_id=\"root\"\n",
|
||||
"#folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2216c83f-68e4-4d2f-8ea2-5878fb18bbe7",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2023-05-09T10:45:59.795842403Z",
|
||||
"start_time": "2023-05-09T10:45:59.445262457Z"
|
||||
},
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"retriever = GoogleDriveRetriever(\n",
|
||||
" num_results=2,\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "fa339ca0-f478-440c-ba80-0e5f41a19ce1",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"By default, all files with the following MIME types can be converted to `Document`.\n",
|
||||
"- text/text\n",
|
||||
"- text/plain\n",
|
||||
"- text/html\n",
|
||||
"- text/csv\n",
|
||||
"- text/markdown\n",
|
||||
"- image/png\n",
|
||||
"- image/jpeg\n",
|
||||
"- application/epub+zip\n",
|
||||
"- application/pdf\n",
|
||||
"- application/rtf\n",
|
||||
"- application/vnd.google-apps.document (GDoc)\n",
|
||||
"- application/vnd.google-apps.presentation (GSlide)\n",
|
||||
"- application/vnd.google-apps.spreadsheet (GSheet)\n",
|
||||
"- application/vnd.google.colaboratory (Colab notebook)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n",
|
||||
"\n",
|
||||
"This list can be updated or customized; see the documentation of `GoogleDriveRetriever`.\n",
|
||||
"\n",
|
||||
"Note that the corresponding packages must be installed."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "9dadec48",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#!pip install unstructured"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "8f3b6aa0-b45d-4e37-8c50-5bebe70fdb9d",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2023-05-09T10:46:00.990310466Z",
|
||||
"start_time": "2023-05-09T10:45:59.798774595Z"
|
||||
},
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"retriever.get_relevant_documents(\"machine learning\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "8ff33817-8619-4897-8742-2216b9934d2a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can customize the criteria to select the files. A set of predefined filters is provided:\n",
|
||||
"| template | description |\n",
|
||||
"| -------------------------------------- | --------------------------------------------------------------------- |\n",
|
||||
"| gdrive-all-in-folder | Return all compatible files from a `folder_id` |\n",
|
||||
"| gdrive-query | Search `query` in all drives |\n",
|
||||
"| gdrive-by-name | Search file with name `query` |\n",
|
||||
"| gdrive-query-in-folder | Search `query` in `folder_id` (and sub-folders if `_recursive=true`) |\n",
|
||||
"| gdrive-mime-type | Search a specific `mime_type` |\n",
|
||||
"| gdrive-mime-type-in-folder | Search a specific `mime_type` in `folder_id` |\n",
|
||||
"| gdrive-query-with-mime-type | Search `query` with a specific `mime_type` |\n",
|
||||
"| gdrive-query-with-mime-type-and-folder | Search `query` with a specific `mime_type` and in `folder_id` |"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "9977c712-9659-4959-b508-f59cc7d49d44",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"retriever = GoogleDriveRetriever(\n",
|
||||
" template=\"gdrive-query\", # Search everywhere\n",
|
||||
" num_results=2, # But take only 2 documents\n",
|
||||
")\n",
|
||||
"for doc in retriever.get_relevant_documents(\"machine learning\"):\n",
|
||||
" print(\"---\")\n",
|
||||
" print(doc.page_content.strip()[:60]+\"...\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a5a0f3ef-26fb-4a5c-85f0-5aba90b682b1",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Alternatively, you can customize the prompt with a specialized `PromptTemplate`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b0bbebde-0487-4d20-9d77-8070e4f0e0d6",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain import PromptTemplate\n",
|
||||
"retriever = GoogleDriveRetriever(\n",
|
||||
" template=PromptTemplate(input_variables=['query'],\n",
|
||||
" # See https://developers.google.com/drive/api/guides/search-files\n",
|
||||
" template=\"(fullText contains '{query}') \"\n",
|
||||
" \"and mimeType='application/vnd.google-apps.document' \"\n",
|
||||
" \"and modifiedTime > '2000-01-01T00:00:00' \"\n",
|
||||
" \"and trashed=false\"),\n",
|
||||
" num_results=2,\n",
|
||||
" # See https://developers.google.com/drive/api/v3/reference/files/list\n",
|
||||
" includeItemsFromAllDrives=False,\n",
|
||||
" supportsAllDrives=False,\n",
|
||||
")\n",
|
||||
"for doc in retriever.get_relevant_documents(\"machine learning\"):\n",
|
||||
" print(f\"{doc.metadata['name']}:\")\n",
|
||||
" print(\"---\")\n",
|
||||
" print(doc.page_content.strip()[:60]+\"...\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9b6fed29-1666-452e-b677-401613270388",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Use GDrive 'description' metadata\n",
|
||||
"Each Google Drive file has a `description` field in its metadata (see the *details of a file*).\n",
|
||||
"Use the `snippets` mode to return the description of selected files.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "342dbe12-ed83-40f4-8957-0cc8c4609542",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"retriever = GoogleDriveRetriever(\n",
|
||||
" template='gdrive-mime-type-in-folder',\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" mime_type='application/vnd.google-apps.document', # Only Google Docs\n",
|
||||
" num_results=2,\n",
|
||||
" mode='snippets',\n",
|
||||
" includeItemsFromAllDrives=False,\n",
|
||||
" supportsAllDrives=False,\n",
|
||||
")\n",
|
||||
"retriever.get_relevant_documents(\"machine learning\")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
215
docs/extras/integrations/toolkits/google_drive.ipynb
Normal file
@@ -0,0 +1,215 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Google Drive tool\n",
|
||||
"\n",
|
||||
"This notebook walks through connecting LangChain to the Google Drive API.\n",
|
||||
"\n",
|
||||
"## Prerequisites\n",
|
||||
"\n",
|
||||
"1. Create a Google Cloud project or use an existing project\n",
|
||||
"1. Enable the [Google Drive API](https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com)\n",
|
||||
"1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n",
|
||||
"1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
|
||||
"\n",
|
||||
"## Instructions for retrieving your Google Docs data\n",
|
||||
"By default, `GoogleDriveSearchTool` and `GoogleDriveAPIWrapper` expect the `credentials.json` file to be at `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable. \n",
|
||||
"By default, `token.json` is stored in the same directory (or at the location given by the `token_path` parameter). Note that `token.json` will be created automatically the first time you use the tool.\n",
|
||||
"\n",
|
||||
"`GoogleDriveSearchTool` can retrieve a selection of files with some requests. \n",
|
||||
"\n",
|
||||
"By default, if you use a `folder_id`, all the files inside this folder whose names match the query can be retrieved as `Document` objects.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can obtain your folder and document id from the URL:\n",
|
||||
"* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",
|
||||
"* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n",
|
||||
"\n",
|
||||
"The special value `root` refers to the root folder of your personal drive."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"folder_id=\"root\"\n",
|
||||
"#folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"By default, all files with the following MIME types can be converted to `Document`.\n",
|
||||
"- text/text\n",
|
||||
"- text/plain\n",
|
||||
"- text/html\n",
|
||||
"- text/csv\n",
|
||||
"- text/markdown\n",
|
||||
"- image/png\n",
|
||||
"- image/jpeg\n",
|
||||
"- application/epub+zip\n",
|
||||
"- application/pdf\n",
|
||||
"- application/rtf\n",
|
||||
"- application/vnd.google-apps.document (GDoc)\n",
|
||||
"- application/vnd.google-apps.presentation (GSlide)\n",
|
||||
"- application/vnd.google-apps.spreadsheet (GSheet)\n",
|
||||
"- application/vnd.google.colaboratory (Colab notebook)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n",
|
||||
"\n",
|
||||
"It's possible to update or customize this. See the documentation of `GoogleDriveAPIWrapper`.\n",
|
||||
"\n",
|
||||
"But, the corresponding packages must installed."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#!pip install unstructured"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.utilities.google_drive import GoogleDriveAPIWrapper\n",
|
||||
"from langchain.tools.google_drive.tool import GoogleDriveSearchTool\n",
|
||||
"\n",
|
||||
"# By default, search only in the filename.\n",
|
||||
"tool = GoogleDriveSearchTool(\n",
|
||||
" api_wrapper=GoogleDriveAPIWrapper(\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" num_results=2,\n",
|
||||
" template=\"gdrive-query-in-folder\", # Search in the body of documents\n",
|
||||
" )\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import logging\n",
|
||||
"logging.basicConfig(level=logging.INFO)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"tool.run(\"machine learning\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"tool.description"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.agents import load_tools\n",
|
||||
"tools = load_tools([\"google-drive-search\"],\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" template=\"gdrive-query-in-folder\",\n",
|
||||
" )"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Use within an Agent"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain import OpenAI\n",
|
||||
"from langchain.agents import initialize_agent, AgentType\n",
|
||||
"llm = OpenAI(temperature=0)\n",
|
||||
"agent = initialize_agent(\n",
|
||||
" tools=tools,\n",
|
||||
" llm=llm,\n",
|
||||
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"agent.run(\n",
|
||||
" \"Search in google drive, who is 'Yann LeCun' ?\"\n",
|
||||
")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 4
|
||||
}
|
||||
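The notebook above reads folder and document ids off the URL by eye. A minimal sketch of doing the same in code, assuming the two URL shapes shown in the notebook (`/folders/<id>` and `/document/d/<id>`); the helper names `extract_folder_id` and `extract_document_id` are hypothetical and not part of LangChain:

```python
import re
from typing import Optional

# Assumption: ids consist of letters, digits, "_" and "-", as in the
# example URLs from the notebook.
_FOLDER_RE = re.compile(r"/folders/([A-Za-z0-9_-]+)")
_DOCUMENT_RE = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_folder_id(url: str) -> Optional[str]:
    """Return the folder id embedded in a Drive folder URL, or None."""
    match = _FOLDER_RE.search(url)
    return match.group(1) if match else None


def extract_document_id(url: str) -> Optional[str]:
    """Return the document id embedded in a Google Docs URL, or None."""
    match = _DOCUMENT_RE.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(
        extract_folder_id(
            "https://drive.google.com/drive/u/0/folders/"
            "1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5"
        )
    )
```

The result can be passed as `folder_id`; the special value `root` needs no parsing.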
@@ -57,6 +57,10 @@ from langchain.utilities.wikipedia import WikipediaAPIWrapper
|
||||
from langchain.utilities.wolfram_alpha import WolframAlphaAPIWrapper
|
||||
from langchain.utilities.openweathermap import OpenWeatherMapAPIWrapper
|
||||
from langchain.utilities.dataforseo_api_search import DataForSeoAPIWrapper
|
||||
from langchain.tools.google_drive.tool import (
|
||||
GoogleDriveSearchTool,
|
||||
GoogleDriveAPIWrapper,
|
||||
)
|
||||
|
||||
|
||||
def _get_python_repl() -> BaseTool:
|
||||
@@ -180,6 +184,10 @@ def _get_wolfram_alpha(**kwargs: Any) -> BaseTool:
|
||||
return WolframAlphaQueryRun(api_wrapper=WolframAlphaAPIWrapper(**kwargs))
|
||||
|
||||
|
||||
def _get_google_drive_search(**kwargs: Any) -> BaseTool:
|
||||
return GoogleDriveSearchTool(api_wrapper=GoogleDriveAPIWrapper(**kwargs))
|
||||
|
||||
|
||||
def _get_google_search(**kwargs: Any) -> BaseTool:
|
||||
return GoogleSearchRun(api_wrapper=GoogleSearchAPIWrapper(**kwargs))
|
||||
|
||||
@@ -287,6 +295,15 @@ _EXTRA_LLM_TOOLS: Dict[
|
||||
|
||||
_EXTRA_OPTIONAL_TOOLS: Dict[str, Tuple[Callable[[KwArg(Any)], BaseTool], List[str]]] = {
|
||||
"wolfram-alpha": (_get_wolfram_alpha, ["wolfram_alpha_appid"]),
|
||||
"google-drive-search": (
|
||||
_get_google_drive_search,
|
||||
[
|
||||
"gdrive_api_file",
|
||||
"folder_id",
|
||||
"mime_type",
|
||||
"template",
|
||||
],
|
||||
),
|
||||
"google-search": (_get_google_search, ["google_api_key", "google_cse_id"]),
|
||||
"google-search-results-json": (
|
||||
_get_google_search_results_json,
|
||||
|
||||
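The `_EXTRA_OPTIONAL_TOOLS` entry above pairs a factory with a list of accepted keyword names. A rough sketch of that registry pattern follows, assuming `load_tools` forwards only the listed keys to the factory (this is an assumption about the surrounding code, which this hunk does not show). `REGISTRY`, `make_drive_search`, and `load_tool` are hypothetical stand-ins, and the factory returns a plain dict rather than a real tool:

```python
from typing import Any, Callable, Dict, List, Tuple


def make_drive_search(**kwargs: Any) -> Dict[str, Any]:
    # Stand-in factory: echoes the kwargs it received instead of
    # building a GoogleDriveSearchTool.
    return {"tool": "google-drive-search", "kwargs": kwargs}


# name -> (factory, keyword names the factory accepts)
REGISTRY: Dict[str, Tuple[Callable[..., Dict[str, Any]], List[str]]] = {
    "google-drive-search": (
        make_drive_search,
        ["gdrive_api_file", "folder_id", "mime_type", "template"],
    ),
}


def load_tool(name: str, **kwargs: Any) -> Dict[str, Any]:
    factory, allowed_keys = REGISTRY[name]
    # Forward only the declared keys; anything else is dropped.
    sub_kwargs = {k: kwargs[k] for k in allowed_keys if k in kwargs}
    return factory(**sub_kwargs)


tool = load_tool(
    "google-drive-search",
    folder_id="root",
    template="gdrive-query-in-folder",
    num_results=2,  # not in the allowed list, so it is not forwarded
)
print(tool["kwargs"])
# -> {'folder_id': 'root', 'template': 'gdrive-query-in-folder'}
```

Under that assumption, an option such as `num_results` would have to be added to the key list before it could be set through `load_tools`.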
@@ -74,7 +74,7 @@ from langchain.document_loaders.geodataframe import GeoDataFrameLoader
|
||||
from langchain.document_loaders.git import GitLoader
|
||||
from langchain.document_loaders.gitbook import GitbookLoader
|
||||
from langchain.document_loaders.github import GitHubIssuesLoader
|
||||
from langchain.document_loaders.googledrive import GoogleDriveLoader
|
||||
from langchain.document_loaders.google_drive import GoogleDriveLoader
|
||||
from langchain.document_loaders.gutenberg import GutenbergLoader
|
||||
from langchain.document_loaders.hn import HNLoader
|
||||
from langchain.document_loaders.html import UnstructuredHTMLLoader
|
||||
|
||||
216
libs/langchain/langchain/document_loaders/google_drive.py
Normal file
@@ -0,0 +1,216 @@
|
||||
"""Loads data from Google Drive.
|
||||
|
||||
Prerequisites:
|
||||
1. Create a Google Cloud project
|
||||
2. Enable the Google Drive API:
|
||||
https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com
|
||||
3. Authorize credentials for desktop app:
|
||||
https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application
|
||||
4. For service accounts visit
|
||||
https://cloud.google.com/iam/docs/service-accounts-create
|
||||
""" # noqa: E501
|
||||
|
||||
import itertools
|
||||
import logging
|
||||
import os
|
||||
import warnings
|
||||
from pathlib import Path
|
||||
from typing import (
|
||||
Any,
|
||||
Dict,
|
||||
Iterator,
|
||||
List,
|
||||
Optional,
|
||||
Sequence,
|
||||
)
|
||||
|
||||
from pydantic.class_validators import root_validator
|
||||
|
||||
from langchain.base_language import BaseLanguageModel
|
||||
from langchain.chains.summarize import load_summarize_chain
|
||||
from langchain.document_loaders.base import BaseLoader
|
||||
from langchain.prompts import PromptTemplate
|
||||
from langchain.schema import Document
|
||||
from langchain.utilities.google_drive import (
|
||||
GoogleDriveUtilities,
|
||||
get_template,
|
||||
)
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class GoogleDriveLoader(BaseLoader, GoogleDriveUtilities):
|
||||
"""Loads data from Google Drive."""
|
||||
|
||||
document_ids: Optional[Sequence[str]] = None
|
||||
""" A list of ids of google drive documents to load."""
|
||||
|
||||
file_ids: Optional[Sequence[str]] = None
|
||||
"""A list of ids of google drive files to load."""
|
||||
|
||||
@root_validator(pre=True)
|
||||
def validate_older_api_and_new_environment_variable(
|
||||
cls, v: Dict[str, Any]
|
||||
) -> Dict[str, Any]:
|
||||
service_account_key = v.get("service_account_key")
|
||||
credentials_path = v.get("credentials_path")
|
||||
api_file = v.get("gdrive_api_file")
|
||||
|
||||
if service_account_key:
|
||||
warnings.warn(
|
||||
"service_account_key was deprecated. Use GOOGLE_ACCOUNT_FILE env "
|
||||
"variable.",
|
||||
DeprecationWarning,
|
||||
)
|
||||
if credentials_path:
|
||||
warnings.warn(
|
||||
"service_account_key was deprecated. Use GOOGLE_ACCOUNT_FILE env "
|
||||
"variable.",
|
||||
DeprecationWarning,
|
||||
)
|
||||
if service_account_key and credentials_path:
|
||||
raise ValueError("Select only service_account_key or credentials_path")
|
||||
|
||||
folder_id = v.get("folder_id")
|
||||
document_ids = v.get("document_ids")
|
||||
file_ids = v.get("file_ids")
|
||||
|
||||
if folder_id and (document_ids or file_ids):
|
||||
raise ValueError(
|
||||
"Cannot specify both folder_id and document_ids nor "
|
||||
"folder_id and file_ids"
|
||||
)
|
||||
|
||||
# To be compatible with the old approach
|
||||
if not api_file:
|
||||
api_file = (
|
||||
Path(os.environ["GOOGLE_ACCOUNT_FILE"])
|
||||
if "GOOGLE_ACCOUNT_FILE" in os.environ
|
||||
else None
|
||||
)
|
||||
# Deprecated: To be compatible with the old approach of authentication
|
||||
if service_account_key:
|
||||
api_file = service_account_key
|
||||
elif credentials_path:
|
||||
api_file = credentials_path
|
||||
elif not api_file:
|
||||
api_file = Path.home() / ".credentials" / "keys.json"
|
||||
v["gdrive_api_file"] = api_file
|
||||
|
||||
if not v.get("template"):
|
||||
if folder_id:
|
||||
template = get_template("gdrive-all-in-folder")
|
||||
elif "document_ids" in v or "file_ids" in v:
|
||||
template = PromptTemplate(input_variables=[], template="")
|
||||
else:
|
||||
raise ValueError("Use a template")
|
||||
v["template"] = template
|
||||
return v
|
||||
|
||||
def lazy_load(self) -> Iterator[Document]:
|
||||
ids = self.document_ids or self.file_ids
|
||||
if ids:
|
||||
yield from (self.load_document_from_id(_id) for _id in ids)
|
||||
else:
|
||||
yield from self.lazy_get_relevant_documents()
|
||||
|
||||
def load(self) -> List[Document]:
|
||||
return list(self.lazy_load())
|
||||
|
||||
|
||||
def lazy_update_description_with_summary(
|
||||
loader: GoogleDriveLoader,
|
||||
llm: BaseLanguageModel,
|
||||
*,
|
||||
force: bool = False,
|
||||
query: str = "",
|
||||
**kwargs: Any,
|
||||
) -> Iterator[Document]:
|
||||
"""Summarize all documents, and update the GDrive metadata `description`.
|
||||
|
||||
Need `write` access: set scope=["https://www.googleapis.com/auth/drive"].
|
||||
|
||||
Note: Updates the description of a shortcut without touching the target
|
||||
file description.
|
||||
|
||||
Args:
|
||||
llm: Language model to use.
|
||||
force: true to update all files. Else, update only if the description
|
||||
is empty.
|
||||
query: If possible, the query request.
|
||||
kwargs: Other parameters for the template (verbose, prompt, etc).
|
||||
"""
|
||||
try:
|
||||
from googleapiclient.errors import HttpError
|
||||
except ImportError as e:
|
||||
raise ImportError(
"Could not import googleapiclient. "
"Install it with `pip install google-api-python-client`."
) from e
|
||||
|
||||
if "https://www.googleapis.com/auth/drive" not in loader._creds.scopes:
|
||||
raise ValueError(
|
||||
f"Remove the file 'token.json' and "
|
||||
f"initialize the {loader.__class__.__name__} with "
|
||||
f"scopes=['https://www.googleapis.com/auth/drive']"
|
||||
)
|
||||
|
||||
chain = load_summarize_chain(llm, chain_type="stuff", **kwargs)
|
||||
updated_files = set() # Never update the same document twice (if it's split)
|
||||
for document in loader.lazy_get_relevant_documents(query, **kwargs):
|
||||
try:
|
||||
file_id = document.metadata["gdriveId"]
|
||||
if file_id not in updated_files:
|
||||
file = loader.files.get(
|
||||
fileId=file_id,
|
||||
fields=loader.fields,
|
||||
supportsAllDrives=True,
|
||||
).execute()
|
||||
if force or not file.get("description", "").strip():
|
||||
summary = chain.run([document]).strip()
|
||||
if summary:
|
||||
loader.files.update(
|
||||
fileId=file_id,
|
||||
supportsAllDrives=True,
|
||||
body={"description": summary},
|
||||
).execute()
|
||||
logger.info(
|
||||
f"For the file '{file['name']}', add description "
|
||||
f"'{summary[:40]}...'"
|
||||
)
|
||||
metadata = loader._extract_meta_data(file)
|
||||
if "summary" in metadata:
|
||||
del metadata["summary"]
|
||||
yield Document(page_content=summary, metadata=metadata)
|
||||
updated_files.add(file_id)
|
||||
except HttpError:
|
||||
logger.warning(
|
||||
f"Impossible to update the description of file "
|
||||
f"'{document.metadata['name']}'"
|
||||
)
|
||||
|
||||
|
||||
def update_description_with_summary(
|
||||
loader: GoogleDriveLoader,
|
||||
llm: BaseLanguageModel,
|
||||
*,
|
||||
force: bool = False,
|
||||
query: str = "",
|
||||
**kwargs: Any,
|
||||
) -> List[Document]:
|
||||
"""Summarize all documents, and update the GDrive metadata `description`.
|
||||
|
||||
Need `write` access: set scope=["https://www.googleapis.com/auth/drive"].
|
||||
|
||||
Note: Updates the description of a shortcut without touching the target
|
||||
file description.
|
||||
|
||||
Args:
|
||||
llm: Language model to use.
|
||||
force: true to update all files. Else, update only if the description
|
||||
is empty.
|
||||
query: If possible, the query request.
|
||||
kwargs: Other parameters for the template (verbose, prompt, etc).
|
||||
"""
|
||||
return list(
|
||||
lazy_update_description_with_summary(
|
||||
loader, llm, force=force, query=query, **kwargs
|
||||
)
|
||||
)
|
||||
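The `root_validator` in the new loader picks the credentials file from several sources. The sketch below restates that precedence as a standalone function, following the order of the branches in the validator: the deprecated `service_account_key`, then the deprecated `credentials_path`, then `gdrive_api_file`, then the `GOOGLE_ACCOUNT_FILE` environment variable, then `~/.credentials/keys.json`. The function name `resolve_api_file` and its `environ` parameter are hypothetical and exist only to make the logic testable:

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Union

PathLike = Union[str, Path]


def resolve_api_file(
    gdrive_api_file: Optional[PathLike] = None,
    service_account_key: Optional[PathLike] = None,
    credentials_path: Optional[PathLike] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Path:
    """Mirror the credential-file precedence of the loader's validator."""
    env = os.environ if environ is None else environ

    # The two deprecated arguments are mutually exclusive.
    if service_account_key and credentials_path:
        raise ValueError("Select only service_account_key or credentials_path")

    api_file = gdrive_api_file
    # The environment variable is only a fallback for gdrive_api_file.
    if not api_file and "GOOGLE_ACCOUNT_FILE" in env:
        api_file = env["GOOGLE_ACCOUNT_FILE"]

    # Deprecated arguments override whatever was found so far.
    if service_account_key:
        api_file = service_account_key
    elif credentials_path:
        api_file = credentials_path
    elif not api_file:
        api_file = Path.home() / ".credentials" / "keys.json"
    return Path(api_file)


if __name__ == "__main__":
    print(resolve_api_file(environ={"GOOGLE_ACCOUNT_FILE": "/tmp/account.json"}))
```

Note that, read this way, a deprecated argument silently wins over an explicit `gdrive_api_file`; whether that is intended is not stated in the diff.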
@@ -1,353 +1,4 @@
|
||||
"""Loads data from Google Drive."""
|
||||
"""DEPRECATED: Kept for backwards compatibility."""
|
||||
from langchain.document_loaders.google_drive import GoogleDriveLoader
|
||||
|
||||
# Prerequisites:
|
||||
# 1. Create a Google Cloud project
|
||||
# 2. Enable the Google Drive API:
|
||||
# https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com
|
||||
# 3. Authorize credentials for desktop app:
|
||||
# https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application # noqa: E501
|
||||
# 4. For service accounts visit
|
||||
# https://cloud.google.com/iam/docs/service-accounts-create
|
||||
|
||||
import os
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Optional, Sequence, Union
|
||||
|
||||
from pydantic import BaseModel, root_validator, validator
|
||||
|
||||
from langchain.docstore.document import Document
|
||||
from langchain.document_loaders.base import BaseLoader
|
||||
|
||||
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
|
||||
|
||||
|
||||
class GoogleDriveLoader(BaseLoader, BaseModel):
|
||||
"""Loads Google Docs from Google Drive."""
|
||||
|
||||
service_account_key: Path = Path.home() / ".credentials" / "keys.json"
|
||||
"""Path to the service account key file."""
|
||||
credentials_path: Path = Path.home() / ".credentials" / "credentials.json"
|
||||
"""Path to the credentials file."""
|
||||
token_path: Path = Path.home() / ".credentials" / "token.json"
|
||||
"""Path to the token file."""
|
||||
folder_id: Optional[str] = None
|
||||
"""The folder id to load from."""
|
||||
document_ids: Optional[List[str]] = None
|
||||
"""The document ids to load from."""
|
||||
file_ids: Optional[List[str]] = None
|
||||
"""The file ids to load from."""
|
||||
recursive: bool = False
|
||||
"""Whether to load recursively. Only applies when folder_id is given."""
|
||||
file_types: Optional[Sequence[str]] = None
|
||||
"""The file types to load. Only applies when folder_id is given."""
|
||||
load_trashed_files: bool = False
|
||||
"""Whether to load trashed files. Only applies when folder_id is given."""
|
||||
# NOTE(MthwRobinson) - changing the file_loader_cls to type here currently
|
||||
# results in pydantic validation errors
|
||||
file_loader_cls: Any = None
|
||||
"""The file loader class to use."""
|
||||
file_loader_kwargs: Dict["str", Any] = {}
|
||||
"""The file loader kwargs to use."""
|
||||
|
||||
@root_validator
|
||||
def validate_inputs(cls, values: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""Validate that either folder_id or document_ids is set, but not both."""
|
||||
if values.get("folder_id") and (
|
||||
values.get("document_ids") or values.get("file_ids")
|
||||
):
|
||||
raise ValueError(
|
||||
"Cannot specify both folder_id and document_ids nor "
|
||||
"folder_id and file_ids"
|
||||
)
|
||||
if (
|
||||
not values.get("folder_id")
|
||||
and not values.get("document_ids")
|
||||
and not values.get("file_ids")
|
||||
):
|
||||
raise ValueError("Must specify either folder_id, document_ids, or file_ids")
|
||||
|
||||
file_types = values.get("file_types")
|
||||
if file_types:
|
||||
if values.get("document_ids") or values.get("file_ids"):
|
||||
raise ValueError(
|
||||
"file_types can only be given when folder_id is given,"
|
||||
" (not when document_ids or file_ids are given)."
|
||||
)
|
||||
type_mapping = {
|
||||
"document": "application/vnd.google-apps.document",
|
||||
"sheet": "application/vnd.google-apps.spreadsheet",
|
||||
"pdf": "application/pdf",
|
||||
}
|
||||
allowed_types = list(type_mapping.keys()) + list(type_mapping.values())
|
||||
short_names = ", ".join([f"'{x}'" for x in type_mapping.keys()])
|
||||
full_names = ", ".join([f"'{x}'" for x in type_mapping.values()])
|
||||
for file_type in file_types:
|
||||
if file_type not in allowed_types:
|
||||
raise ValueError(
|
||||
f"Given file type {file_type} is not supported. "
|
||||
f"Supported values are: {short_names}; and "
|
||||
f"their full-form names: {full_names}"
|
||||
)
|
||||
|
||||
# replace short-form file types by full-form file types
|
||||
def full_form(x: str) -> str:
|
||||
return type_mapping[x] if x in type_mapping else x
|
||||
|
||||
values["file_types"] = [full_form(file_type) for file_type in file_types]
|
||||
return values
|
||||
|
||||
@validator("credentials_path")
|
||||
def validate_credentials_path(cls, v: Any, **kwargs: Any) -> Any:
|
||||
"""Validate that credentials_path exists."""
|
||||
if not v.exists():
|
||||
raise ValueError(f"credentials_path {v} does not exist")
|
||||
return v
|
||||
|
||||
def _load_credentials(self) -> Any:
|
||||
"""Load credentials."""
|
||||
# Adapted from https://developers.google.com/drive/api/v3/quickstart/python
|
||||
try:
|
||||
from google.auth import default
|
||||
from google.auth.transport.requests import Request
|
||||
from google.oauth2 import service_account
|
||||
from google.oauth2.credentials import Credentials
|
||||
from google_auth_oauthlib.flow import InstalledAppFlow
|
||||
except ImportError:
|
||||
raise ImportError(
|
||||
"You must run "
|
||||
"`pip install --upgrade "
|
||||
"google-api-python-client google-auth-httplib2 "
|
||||
"google-auth-oauthlib` "
|
||||
"to use the Google Drive loader."
|
||||
)
|
||||
|
||||
creds = None
|
||||
if self.service_account_key.exists():
|
||||
return service_account.Credentials.from_service_account_file(
|
||||
str(self.service_account_key), scopes=SCOPES
|
||||
)
|
||||
|
||||
if self.token_path.exists():
|
||||
creds = Credentials.from_authorized_user_file(str(self.token_path), SCOPES)
|
||||
|
||||
if not creds or not creds.valid:
|
||||
if creds and creds.expired and creds.refresh_token:
|
||||
creds.refresh(Request())
|
||||
elif "GOOGLE_APPLICATION_CREDENTIALS" not in os.environ:
|
||||
creds, project = default()
|
||||
creds = creds.with_scopes(SCOPES)
|
||||
# no need to write to file
|
||||
if creds:
|
||||
return creds
|
||||
else:
|
||||
flow = InstalledAppFlow.from_client_secrets_file(
|
||||
str(self.credentials_path), SCOPES
|
||||
)
|
||||
creds = flow.run_local_server(port=0)
|
||||
with open(self.token_path, "w") as token:
|
||||
token.write(creds.to_json())
|
||||
|
||||
return creds
|
||||
|
||||
def _load_sheet_from_id(self, id: str) -> List[Document]:
|
||||
"""Load a sheet and all tabs from an ID."""
|
||||
|
||||
from googleapiclient.discovery import build
|
||||
|
||||
creds = self._load_credentials()
|
||||
sheets_service = build("sheets", "v4", credentials=creds)
|
||||
spreadsheet = sheets_service.spreadsheets().get(spreadsheetId=id).execute()
|
||||
sheets = spreadsheet.get("sheets", [])
|
||||
|
||||
documents = []
|
||||
for sheet in sheets:
|
||||
sheet_name = sheet["properties"]["title"]
|
||||
result = (
|
||||
sheets_service.spreadsheets()
|
||||
.values()
|
||||
.get(spreadsheetId=id, range=sheet_name)
|
||||
.execute()
|
||||
)
|
||||
values = result.get("values", [])
|
||||
|
||||
header = values[0]
|
||||
for i, row in enumerate(values[1:], start=1):
|
||||
metadata = {
|
||||
"source": (
|
||||
f"https://docs.google.com/spreadsheets/d/{id}/"
|
||||
f"edit?gid={sheet['properties']['sheetId']}"
|
||||
),
|
||||
"title": f"{spreadsheet['properties']['title']} - {sheet_name}",
|
||||
"row": i,
|
||||
}
|
||||
content = []
|
||||
for j, v in enumerate(row):
|
||||
title = header[j].strip() if len(header) > j else ""
|
||||
content.append(f"{title}: {v.strip()}")
|
||||
|
||||
page_content = "\n".join(content)
|
||||
documents.append(Document(page_content=page_content, metadata=metadata))
|
||||
|
||||
return documents
|
||||
|
||||
def _load_document_from_id(self, id: str) -> Document:
|
||||
"""Load a document from an ID."""
|
||||
from io import BytesIO
|
||||
|
||||
from googleapiclient.discovery import build
|
||||
from googleapiclient.errors import HttpError
|
||||
from googleapiclient.http import MediaIoBaseDownload
|
||||
|
||||
creds = self._load_credentials()
|
||||
service = build("drive", "v3", credentials=creds)
|
||||
|
||||
file = service.files().get(fileId=id, supportsAllDrives=True).execute()
|
||||
request = service.files().export_media(fileId=id, mimeType="text/plain")
|
||||
fh = BytesIO()
|
||||
downloader = MediaIoBaseDownload(fh, request)
|
||||
done = False
|
||||
try:
|
||||
while done is False:
|
||||
status, done = downloader.next_chunk()
|
||||
|
||||
except HttpError as e:
|
||||
if e.resp.status == 404:
|
||||
print("File not found: {}".format(id))
|
||||
else:
|
||||
print("An error occurred: {}".format(e))
|
||||
|
||||
text = fh.getvalue().decode("utf-8")
|
||||
metadata = {
|
||||
"source": f"https://docs.google.com/document/d/{id}/edit",
|
||||
"title": f"{file.get('name')}",
|
||||
}
|
||||
return Document(page_content=text, metadata=metadata)
|
||||
|
||||
def _load_documents_from_folder(
|
||||
self, folder_id: str, *, file_types: Optional[Sequence[str]] = None
|
||||
) -> List[Document]:
|
||||
"""Load documents from a folder."""
|
||||
from googleapiclient.discovery import build
|
||||
|
||||
creds = self._load_credentials()
|
||||
service = build("drive", "v3", credentials=creds)
|
||||
files = self._fetch_files_recursive(service, folder_id)
|
||||
# If file types filter is provided, we'll filter by the file type.
|
||||
if file_types:
|
||||
_files = [f for f in files if f["mimeType"] in file_types] # type: ignore
|
||||
else:
|
||||
_files = files
|
||||
|
||||
returns = []
|
||||
for file in _files:
|
||||
if file["trashed"] and not self.load_trashed_files:
|
||||
continue
|
||||
elif file["mimeType"] == "application/vnd.google-apps.document":
|
||||
returns.append(self._load_document_from_id(file["id"])) # type: ignore
|
||||
elif file["mimeType"] == "application/vnd.google-apps.spreadsheet":
|
||||
returns.extend(self._load_sheet_from_id(file["id"])) # type: ignore
|
||||
elif (
|
||||
file["mimeType"] == "application/pdf"
|
||||
or self.file_loader_cls is not None
|
||||
):
|
||||
returns.extend(self._load_file_from_id(file["id"])) # type: ignore
|
||||
else:
|
||||
pass
|
||||
return returns
|
||||
|
||||
def _fetch_files_recursive(
|
||||
self, service: Any, folder_id: str
|
||||
) -> List[Dict[str, Union[str, List[str]]]]:
|
||||
"""Fetch all files and subfolders recursively."""
|
||||
results = (
|
||||
service.files()
|
||||
.list(
|
||||
q=f"'{folder_id}' in parents",
|
||||
pageSize=1000,
|
||||
includeItemsFromAllDrives=True,
|
||||
supportsAllDrives=True,
|
||||
fields="nextPageToken, files(id, name, mimeType, parents, trashed)",
|
||||
)
|
||||
.execute()
|
||||
)
|
||||
files = results.get("files", [])
|
||||
returns = []
|
||||
for file in files:
|
||||
if file["mimeType"] == "application/vnd.google-apps.folder":
|
||||
if self.recursive:
|
||||
returns.extend(self._fetch_files_recursive(service, file["id"]))
|
||||
else:
|
||||
returns.append(file)
|
||||
|
||||
return returns
|
||||
|
||||
def _load_documents_from_ids(self) -> List[Document]:
|
||||
"""Load documents from a list of IDs."""
|
||||
if not self.document_ids:
|
||||
raise ValueError("document_ids must be set")
|
||||
|
||||
return [self._load_document_from_id(doc_id) for doc_id in self.document_ids]
|
||||
|
||||
def _load_file_from_id(self, id: str) -> List[Document]:
|
||||
"""Load a file from an ID."""
|
||||
from io import BytesIO
|
||||
|
||||
from googleapiclient.discovery import build
|
||||
from googleapiclient.http import MediaIoBaseDownload
|
||||
|
||||
creds = self._load_credentials()
|
||||
service = build("drive", "v3", credentials=creds)
|
||||
|
||||
file = service.files().get(fileId=id, supportsAllDrives=True).execute()
|
||||
request = service.files().get_media(fileId=id)
|
||||
fh = BytesIO()
|
||||
downloader = MediaIoBaseDownload(fh, request)
|
||||
done = False
|
||||
while done is False:
|
||||
status, done = downloader.next_chunk()
|
||||
|
||||
if self.file_loader_cls is not None:
|
||||
fh.seek(0)
|
||||
loader = self.file_loader_cls(file=fh, **self.file_loader_kwargs)
|
||||
docs = loader.load()
|
||||
for doc in docs:
|
||||
doc.metadata["source"] = f"https://drive.google.com/file/d/{id}/view"
|
||||
return docs
|
||||
|
||||
else:
|
||||
from PyPDF2 import PdfReader
|
||||
|
||||
content = fh.getvalue()
|
||||
pdf_reader = PdfReader(BytesIO(content))
|
||||
|
||||
return [
|
||||
Document(
|
||||
page_content=page.extract_text(),
|
||||
metadata={
|
||||
"source": f"https://drive.google.com/file/d/{id}/view",
|
||||
"title": f"{file.get('name')}",
|
||||
"page": i,
|
||||
},
|
||||
)
|
||||
for i, page in enumerate(pdf_reader.pages)
|
||||
]
|
||||
|
||||
def _load_file_from_ids(self) -> List[Document]:
|
||||
"""Load files from a list of IDs."""
|
||||
if not self.file_ids:
|
||||
raise ValueError("file_ids must be set")
|
||||
docs = []
|
||||
for file_id in self.file_ids:
|
||||
docs.extend(self._load_file_from_id(file_id))
|
||||
return docs
|
||||
|
||||
def load(self) -> List[Document]:
|
||||
"""Load documents."""
|
||||
if self.folder_id:
|
||||
return self._load_documents_from_folder(
|
||||
self.folder_id, file_types=self.file_types
|
||||
)
|
||||
elif self.document_ids:
|
||||
return self._load_documents_from_ids()
|
||||
else:
|
||||
return self._load_file_from_ids()
|
||||
__all__ = ["GoogleDriveLoader"]
|
||||
|
||||
@@ -30,6 +30,7 @@ from langchain.retrievers.ensemble import EnsembleRetriever
|
||||
from langchain.retrievers.google_cloud_enterprise_search import (
|
||||
GoogleCloudEnterpriseSearchRetriever,
|
||||
)
|
||||
from langchain.retrievers.google_drive import GoogleDriveRetriever
|
||||
from langchain.retrievers.kendra import AmazonKendraRetriever
|
||||
from langchain.retrievers.knn import KNNRetriever
|
||||
from langchain.retrievers.llama_index import (
|
||||
@@ -65,6 +66,7 @@ __all__ = [
|
||||
"ChaindeskRetriever",
|
||||
"ElasticSearchBM25Retriever",
|
||||
"GoogleCloudEnterpriseSearchRetriever",
|
||||
"GoogleDriveRetriever",
|
||||
"KNNRetriever",
|
||||
"LlamaIndexGraphRetriever",
|
||||
"LlamaIndexRetriever",
|
||||
|
||||
92
libs/langchain/langchain/retrievers/google_drive.py
Normal file
@@ -0,0 +1,92 @@
|
||||
from typing import Any, Dict, List, Literal, Optional
|
||||
|
||||
from pydantic.class_validators import root_validator
|
||||
from pydantic.config import Extra
|
||||
|
||||
from langchain.callbacks.manager import Callbacks
|
||||
from langchain.schema import BaseRetriever, Document
|
||||
|
||||
from ..utilities.google_drive import (
|
||||
GoogleDriveUtilities,
|
||||
get_template,
|
||||
)
|
||||
|
||||
|
||||
class GoogleDriveRetriever(GoogleDriveUtilities, BaseRetriever):
|
||||
"""Wrapper around Google Drive API.
|
||||
|
||||
The application must be authenticated with a json file.
|
||||
The format may be for a user or for an application via a service account.
|
||||
The environment variable `GOOGLE_ACCOUNT_FILE` may be set to reference this file.
|
||||
For more information, see [here]
|
||||
(https://developers.google.com/workspace/guides/auth-overview).
|
||||
"""
|
||||
|
||||
class Config:
|
||||
extra = Extra.allow
|
||||
allow_mutation = False
|
||||
underscore_attrs_are_private = True
|
||||
|
||||
mode: Literal[
|
||||
"snippets", "snippets-markdown", "documents", "documents-markdown"
|
||||
] = "snippets-markdown"
|
||||
|
||||
@root_validator(pre=True)
|
||||
def validate_template(cls, v: Dict[str, Any]) -> Dict[str, Any]:
|
||||
folder_id = v.get("folder_id")
|
||||
|
||||
if not v.get("template"):
|
||||
if folder_id:
|
||||
template = get_template("gdrive-query-in-folder")
|
||||
else:
|
||||
template = get_template("gdrive-query")
|
||||
v["template"] = template
|
||||
return v
|
||||
|
||||
def get_relevant_documents(
|
||||
self,
|
||||
query: str,
|
||||
*,
|
||||
callbacks: Callbacks = None,
|
||||
tags: Optional[List[str]] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
**kwargs: Any,
|
||||
) -> List[Document]:
|
||||
"""Get documents relevant for a query.
|
||||
|
||||
Args:
|
||||
query: string to find relevant documents for
|
||||
|
||||
Returns:
|
||||
List of relevant documents
|
||||
"""
|
||||
return list(
|
||||
self.lazy_get_relevant_documents(
|
||||
query=query,
|
||||
callbacks=callbacks,
|
||||
tags=tags,
|
||||
metadata=metadata,
|
||||
**kwargs,
|
||||
)
|
||||
)
|
||||
|
||||
async def aget_relevant_documents(
|
||||
self,
|
||||
query: str,
|
||||
*,
|
||||
callbacks: Callbacks = None,
|
||||
tags: Optional[List[str]] = None,
|
||||
metadata: Optional[Dict[str, Any]] = None,
|
||||
**kwargs: Any,
|
||||
) -> List[Document]:
|
||||
"""Get documents relevant for a query.
|
||||
|
||||
NOT IMPLEMENTED
|
||||
|
||||
Args:
|
||||
query: string to find relevant documents for
|
||||
|
||||
Returns:
|
||||
List of relevant documents
|
||||
"""
|
||||
raise NotImplementedError("GoogleDriveRetriever does not support async")
|
||||
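Because `aget_relevant_documents` raises `NotImplementedError`, async callers need their own bridge to the synchronous method. One possible workaround, sketched here with a hypothetical stand-in class rather than the real retriever, is to run the blocking call in a worker thread with `asyncio.to_thread` (Python 3.9+). Whether the underlying Google API client is safe to use from a worker thread is not addressed by this sketch:

```python
import asyncio
from typing import List


class SyncOnlyRetriever:
    """Hypothetical stand-in for a retriever with only a sync method."""

    def get_relevant_documents(self, query: str) -> List[str]:
        # A real retriever would call the Drive API here.
        return [f"result for {query}"]


async def aget_documents(retriever: SyncOnlyRetriever, query: str) -> List[str]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(retriever.get_relevant_documents, query)


if __name__ == "__main__":
    print(asyncio.run(aget_documents(SyncOnlyRetriever(), "machine learning")))
```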
@@ -45,6 +45,7 @@ from langchain.tools.gmail import (
|
||||
GmailSearch,
|
||||
GmailSendMessage,
|
||||
)
|
||||
from langchain.tools.google_drive.tool import GoogleDriveSearchTool
|
||||
from langchain.tools.google_places.tool import GooglePlacesTool
|
||||
from langchain.tools.google_search.tool import GoogleSearchResults, GoogleSearchRun
|
||||
from langchain.tools.google_serper.tool import GoogleSerperResults, GoogleSerperRun
|
||||
@@ -148,6 +149,7 @@ __all__ = [
|
||||
"GmailGetThread",
|
||||
"GmailSearch",
|
||||
"GmailSendMessage",
|
||||
"GoogleDriveSearchTool",
|
||||
"GooglePlacesTool",
|
||||
"GoogleSearchResults",
|
||||
"GoogleSearchRun",
|
||||
|
||||
41
libs/langchain/langchain/tools/google_drive/tool.py
Normal file
@@ -0,0 +1,41 @@
import logging
from typing import Optional

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools import BaseTool

from ...utilities.google_drive import FORMAT_INSTRUCTION, GoogleDriveAPIWrapper

logger = logging.getLogger(__name__)


class GoogleDriveSearchTool(BaseTool):
    """Tool that adds the capability to query the Google Drive search API."""

    name = "Google Drive Search"
    description = (
        "A wrapper around Google Drive Search. "
        "Useful for when you need to find a document in google drive. "
        f"{FORMAT_INSTRUCTION}"
    )
    api_wrapper: GoogleDriveAPIWrapper

    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
        logger.info(f"{query=}")
        return self.api_wrapper.run(query)

    async def _arun(
        self,
        query: str,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("GoogleDriveSearchTool does not support async")
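The `_arun` method above only raises `NotImplementedError`. As a minimal sketch of one common alternative, a blocking `run` can be delegated to a worker thread so that async callers still get a result. `SyncOnlySearchTool` and `fake_search` are hypothetical names used for illustration only; they are not part of this change or of LangChain.

```python
import asyncio
from typing import Callable


class SyncOnlySearchTool:
    """Hypothetical stand-in for a tool that only has a blocking run()."""

    def __init__(self, search: Callable[[str], str]) -> None:
        # `search` plays the role that `api_wrapper.run` plays in the diff.
        self._search = search

    def run(self, query: str) -> str:
        return self._search(query)

    async def arun(self, query: str) -> str:
        # Off-load the blocking call to the default thread pool so the
        # event loop stays responsive while the search is in flight.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.run, query)


def fake_search(query: str) -> str:
    # Assumed placeholder; a real wrapper would call the Drive API here.
    return f"results for {query!r}"


tool = SyncOnlySearchTool(fake_search)
result = asyncio.run(tool.arun("machine learning"))
```

Whether this is appropriate depends on the thread safety of the underlying client, which the diff does not show.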
@@ -11,6 +11,7 @@ from langchain.utilities.bing_search import BingSearchAPIWrapper
from langchain.utilities.brave_search import BraveSearchWrapper
from langchain.utilities.duckduckgo_search import DuckDuckGoSearchAPIWrapper
from langchain.utilities.golden_query import GoldenQueryAPIWrapper
from langchain.utilities.google_drive import GoogleDriveAPIWrapper
from langchain.utilities.google_places_api import GooglePlacesAPIWrapper
from langchain.utilities.google_search import GoogleSearchAPIWrapper
from langchain.utilities.google_serper import GoogleSerperAPIWrapper
@@ -42,6 +43,7 @@ __all__ = [
    "BraveSearchWrapper",
    "DuckDuckGoSearchAPIWrapper",
    "GoldenQueryAPIWrapper",
    "GoogleDriveAPIWrapper",
    "GooglePlacesAPIWrapper",
    "GoogleSearchAPIWrapper",
    "GoogleSerperAPIWrapper",
1545
libs/langchain/langchain/utilities/google_drive.py
Normal file
File diff suppressed because it is too large
@@ -0,0 +1,161 @@
import unittest
from pathlib import Path
from unittest.mock import MagicMock

import pytest
from pytest_mock import MockerFixture

from langchain.document_loaders.google_drive import GoogleDriveLoader
from tests.unit_tests.llms.fake_llm import FakeLLM
from tests.unit_tests.utilities.test_google_drive import (
    gdrive_docs,
    google_workspace_installed,
    patch_google_workspace,
)


@pytest.fixture
def google_workspace(mocker: MockerFixture) -> MagicMock:
    return patch_google_workspace(
        mocker, [{"nextPageToken": None, "files": gdrive_docs}]
    )


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_load_returns_list_of_google_documents_single(
    google_workspace: MagicMock,
) -> None:
    loader = GoogleDriveLoader(
        api_file=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
        folder_id="999",
    )
    assert loader.mode == "documents"  # Check default value
    assert loader.gsheet_mode == "single"  # Check default value
    assert loader.gslide_mode == "single"  # Check default value


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_service_account_key(google_workspace: MagicMock) -> None:
    loader = GoogleDriveLoader(
        service_account_key=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_service.json",
        template="gdrive-all-in-folder",
    )
    assert (
        loader.gdrive_api_file
        == Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_service.json"
    )


# @unittest.skipIf(not google_workspace_installed, "Google api not installed")
# def test_no_path(mocker,google_workspace) -> None:
#     import os
#     mocker.patch.dict(os.environ,{},clear=True)
#     loader = GoogleDriveLoader(
#         template="gdrive-all-in-folder",
#     )
#     assert loader.gdrive_api_file == Path.home() / ".credentials" / "keys.json"


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_credentials_path(mocker: MockerFixture, google_workspace: MagicMock) -> None:
    loader = GoogleDriveLoader(
        credentials_path=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
        template="gdrive-all-in-folder",
    )
    assert (
        loader.gdrive_api_file
        == Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json"
    )


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_folder_id(google_workspace: MagicMock) -> None:
    loader = GoogleDriveLoader(
        api_file=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
        folder_id="999",
    )
    docs = loader.load()
    assert len(docs) == 3


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_query(google_workspace: MagicMock) -> None:
    loader = GoogleDriveLoader(
        api_file=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
        query="",
        template="gdrive-query",
    )
    docs = loader.load()
    assert len(docs) == 3


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_document_ids(google_workspace: MagicMock) -> None:
    loader = GoogleDriveLoader(
        api_file=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
        document_ids=["1", "1"],
    )
    docs = loader.load()
    assert len(docs) == 2


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_files_ids(google_workspace: MagicMock) -> None:
    loader = GoogleDriveLoader(
        api_file=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
        file_ids=["1", "2"],
    )
    docs = loader.load()
    assert len(docs) == 2


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_update_description_with_summary(google_workspace: MagicMock) -> None:
    loader = GoogleDriveLoader(
        api_file=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
        file_ids=["1", "2"],
        scopes=["https://www.googleapis.com/auth/drive"],
    )
    result = list(
        loader.lazy_update_description_with_summary(
            llm=FakeLLM(), force=True, prompt=None, verbose=True, query=""
        )
    )
    assert len(result) == 2

    result = list(
        loader.lazy_update_description_with_summary(
            llm=FakeLLM(), force=False, prompt=None, query=""
        )
    )
    assert len(result) == 0
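The fixture in these tests passes a list of canned responses of the form `{"nextPageToken": ..., "files": [...]}` to `patch_google_workspace`, whose implementation is not shown here (its diff is suppressed). The sketch below is only an assumption about how such paged responses are typically consumed: a `MagicMock` with `side_effect` hands out one page per call, and a loop stops once a page has no `nextPageToken`. The names `list_files` and `collect_files` are hypothetical.

```python
from typing import Any, Dict, List
from unittest.mock import MagicMock

# Two canned pages; the last one signals the end with nextPageToken=None.
pages: List[Dict[str, Any]] = [
    {"nextPageToken": "page-2", "files": [{"id": "1"}, {"id": "2"}]},
    {"nextPageToken": None, "files": [{"id": "3"}]},
]

# Hypothetical stand-in for a paged "list files" call: each invocation
# returns the next canned page.
list_files = MagicMock(side_effect=pages)


def collect_files(fetch_page: Any) -> List[Dict[str, Any]]:
    """Call fetch_page until a page arrives without a nextPageToken."""
    files: List[Dict[str, Any]] = []
    token = None
    while True:
        page = fetch_page(pageToken=token)
        files.extend(page["files"])
        token = page.get("nextPageToken")
        if not token:
            return files


all_files = collect_files(list_files)
```

With a single page whose `nextPageToken` is `None`, as in the fixture above, the loop would make exactly one call.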
@@ -0,0 +1,53 @@
import unittest
from pathlib import Path
from unittest.mock import MagicMock

import pytest
from pytest_mock import MockerFixture

from langchain.retrievers.google_drive import GoogleDriveRetriever
from tests.unit_tests.utilities.test_google_drive import (
    _text_text,
    gdrive_docs,
    google_workspace_installed,
    patch_google_workspace,
)


@pytest.fixture
def google_workspace(mocker: MockerFixture) -> MagicMock:
    return patch_google_workspace(
        mocker, [{"nextPageToken": None, "files": gdrive_docs}]
    )


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_get_relevant_documents(
    mocker: MockerFixture,
) -> None:
    patch_google_workspace(mocker, [{"nextPageToken": None, "files": [_text_text]}])
    retriever = GoogleDriveRetriever(
        api_file=Path(__file__).parent.parent
        / "utilities"
        / "examples"
        / "gdrive_credentials.json",
    )
    docs = retriever.get_relevant_documents("machine learning")
    assert len(docs) == 1


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_extra_parameters(
    mocker: MockerFixture,
) -> None:
    patch_google_workspace(mocker, [{"nextPageToken": None, "files": [_text_text]}])
    retriever = GoogleDriveRetriever(
        template="gdrive-mime-type-in-folders",
        folder_id="root",
        mime_type="application/vnd.google-apps.document",  # Only Google Docs
        num_results=2,
        mode="snippets",
        includeItemsFromAllDrives=False,
        supportsAllDrives=False,
    )
    retriever.get_relevant_documents("machine learning")
40
libs/langchain/tests/unit_tests/tools/test_google_drive.py
Normal file
@@ -0,0 +1,40 @@
import unittest
from pathlib import Path
from unittest.mock import MagicMock

import pytest
from pytest_mock import MockerFixture

from langchain.tools.google_drive.tool import GoogleDriveSearchTool
from langchain.utilities import GoogleDriveAPIWrapper
from tests.unit_tests.utilities.test_google_drive import (
    gdrive_docs,
    google_workspace_installed,
    patch_google_workspace,
)


@pytest.fixture
def google_workspace(mocker: MockerFixture) -> MagicMock:
    return patch_google_workspace(
        mocker, [{"nextPageToken": None, "files": gdrive_docs}]
    )


@unittest.skipIf(not google_workspace_installed, "Google api not installed")
def test_run(google_workspace: MagicMock) -> None:
    tool = GoogleDriveSearchTool(
        api_wrapper=GoogleDriveAPIWrapper(
            api_file=(
                Path(__file__).parent.parent
                / "utilities"
                / "examples"
                / "gdrive_credentials.json"
            )
        )
    )
    result = tool._run("machine learning")
    assert result.startswith(
        "[vnd.google-apps.document](https://docs.google.com/document/d/1/edit?usp=drivesdk)<br/>\n"
        "It is a doc summary\n\n"
    )
@@ -32,6 +32,7 @@ _EXPECTED = [
    "GmailGetThread",
    "GmailSearch",
    "GmailSendMessage",
    "GoogleDriveSearchTool",
    "GooglePlacesTool",
    "GoogleSearchResults",
    "GoogleSearchRun",
1641
libs/langchain/tests/unit_tests/utilities/examples/gdrive.gdoc
Normal file
File diff suppressed because it is too large
161
libs/langchain/tests/unit_tests/utilities/examples/gdrive.gsheet
Normal file
@@ -0,0 +1,161 @@
{
  "spreadsheetId": "1iuGLyUDgw6mCjyXnaqNtpXS2-ALJbZ4wq1cWBuCfTRg",
  "properties": {
    "title": "vnd.google-apps.spreadsheet",
    "locale": "fr_FR",
    "autoRecalc": "ON_CHANGE",
    "timeZone": "Europe/Paris",
    "defaultFormat": {
      "backgroundColor": {
        "red": 1,
        "green": 1,
        "blue": 1
      },
      "padding": {
        "top": 2,
        "right": 3,
        "bottom": 2,
        "left": 3
      },
      "verticalAlignment": "BOTTOM",
      "wrapStrategy": "OVERFLOW_CELL",
      "textFormat": {
        "foregroundColor": {},
        "fontFamily": "arial,sans,sans-serif",
        "fontSize": 10,
        "bold": false,
        "italic": false,
        "strikethrough": false,
        "underline": false,
        "foregroundColorStyle": {
          "rgbColor": {}
        }
      },
      "backgroundColorStyle": {
        "rgbColor": {
          "red": 1,
          "green": 1,
          "blue": 1
        }
      }
    },
    "spreadsheetTheme": {
      "primaryFontFamily": "Arial",
      "themeColors": [
        {
          "colorType": "TEXT",
          "color": {
            "rgbColor": {}
          }
        },
        {
          "colorType": "BACKGROUND",
          "color": {
            "rgbColor": {
              "red": 1,
              "green": 1,
              "blue": 1
            }
          }
        },
        {
          "colorType": "ACCENT1",
          "color": {
            "rgbColor": {
              "red": 0.25882354,
              "green": 0.52156866,
              "blue": 0.95686275
            }
          }
        },
        {
          "colorType": "ACCENT2",
          "color": {
            "rgbColor": {
              "red": 0.91764706,
              "green": 0.2627451,
              "blue": 0.20784314
            }
          }
        },
        {
          "colorType": "ACCENT3",
          "color": {
            "rgbColor": {
              "red": 0.9843137,
              "green": 0.7372549,
              "blue": 0.015686275
            }
          }
        },
        {
          "colorType": "ACCENT4",
          "color": {
            "rgbColor": {
              "red": 0.20392157,
              "green": 0.65882355,
              "blue": 0.3254902
            }
          }
        },
        {
          "colorType": "ACCENT5",
          "color": {
            "rgbColor": {
              "red": 1,
              "green": 0.42745098,
              "blue": 0.003921569
            }
          }
        },
        {
          "colorType": "ACCENT6",
          "color": {
            "rgbColor": {
              "red": 0.27450982,
              "green": 0.7411765,
              "blue": 0.7764706
            }
          }
        },
        {
          "colorType": "LINK",
          "color": {
            "rgbColor": {
              "red": 0.06666667,
              "green": 0.33333334,
              "blue": 0.8
            }
          }
        }
      ]
    }
  },
  "sheets": [
    {
      "properties": {
        "sheetId": 0,
        "title": "Feuille 1",
        "index": 0,
        "sheetType": "GRID",
        "gridProperties": {
          "rowCount": 1000,
          "columnCount": 26
        }
      }
    },
    {
      "properties": {
        "sheetId": 831511404,
        "title": "Feuille 2",
        "index": 1,
        "sheetType": "GRID",
        "gridProperties": {
          "rowCount": 1000,
          "columnCount": 26
        }
      }
    }
  ],
  "spreadsheetUrl": "https://docs.google.com/spreadsheets/d/1iuGLyUDgw6mCjyXnaqNtpXS2-ALJbZ4wq1cWBuCfTRg/edit?ouid=109055472267306456451"
}
10711
libs/langchain/tests/unit_tests/utilities/examples/gdrive.gslide
Normal file
File diff suppressed because it is too large
13
libs/langchain/tests/unit_tests/utilities/examples/gdrive_credentials.json
Executable file
@@ -0,0 +1,13 @@
{
  "installed": {
    "client_id": "",
    "project_id": "",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_secret": "",
    "redirect_uris": [
      "http://localhost"
    ]
  }
}
@@ -0,0 +1,12 @@
{
  "type": "service_account",
  "project_id": "lanchain",
  "private_key_id": "",
  "private_key": "",
  "client_email": "a@a.com",
  "client_id": "",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": ""
}
@@ -0,0 +1 @@
The body of a text file
@@ -0,0 +1,12 @@
{
  "token": "MockToken",
  "refresh_token": "",
  "token_uri": "https://oauth2.googleapis.com/token",
  "client_id": "",
  "client_secret": "",
  "scopes": [
    "https://www.googleapis.com/auth/drive.readonly",
    "https://www.googleapis.com/auth/drive"
  ],
  "expiry": "9999-01-01T00:00:00.0Z"
}
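Hand-edited JSON fixtures such as the credential and token files in this change are easy to break with a stray or doubled comma. As a minimal sketch, and not something included in this change, a small standard-library check can report where parsing fails. The helper name `json_error` and the sample strings are illustrative assumptions.

```python
import json
from typing import Optional


def json_error(text: str) -> Optional[str]:
    """Return a short error description, or None if `text` is valid JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Illustrative inputs only; they are not the fixture files themselves.
valid = '{"scopes": ["a", "b"]}'
doubled_comma = '{"scopes": ["a",, "b"]}'

valid_error = json_error(valid)
doubled_error = json_error(doubled_comma)
```

Running such a check over every file in the examples directory, for instance from a parametrized test, would surface malformed fixtures before the mocks try to read them.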
1098
libs/langchain/tests/unit_tests/utilities/test_google_drive.py
Normal file
File diff suppressed because it is too large