wip

Merge branch 'master' into pprados/google_drive
Resync in 3 august
2026-01-21 21:56:38 +00:00 · 2023-08-03 13:26:56 -07:00 · 2023-08-03 10:43:43 -07:00 · 2023-08-03 17:07:47 +02:00 · 2023-08-03 12:48:54 +02:00
27 changed files with 16532 additions and 489 deletions
--- a/docs/extras/integrations/document_loaders/google_drive.ipynb
+++ b/docs/extras/integrations/document_loaders/google_drive.ipynb
@@ -2,14 +2,11 @@
 "cells": [
  {
   "cell_type": "markdown",
-   "id": "b0ed136e-6983-4893-ae1b-b75753af05f8",
+   "id": "0b02f34c",
   "metadata": {},
   "source": [
-    "# Google Drive\n",
-    "\n",
-    ">[Google Drive](https://en.wikipedia.org/wiki/Google_Drive) is a file storage and synchronization service developed by Google.\n",
-    "\n",
-    "This notebook covers how to load documents from `Google Drive`. Currently, only `Google Docs` are supported.\n",
+    "# Google Drive Loader\n",
+    "This notebook covers how to retrieve documents from Google Drive.\n",
    "\n",
    "## Prerequisites\n",
    "\n",
@@ -18,12 +15,21 @@
    "1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n",
    "1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
    "\n",
-    "## 🧑 Instructions for ingesting your Google Docs data\n",
-    "By default, the `GoogleDriveLoader` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `credentials_path` keyword argument. Same thing with `token.json` - `token_path`. Note that `token.json` will be created automatically the first time you use the loader.\n",
-    "\n",
-    "`GoogleDriveLoader` can load from a list of Google Docs document ids or a folder id. You can obtain your folder and document id from the URL:\n",
+    "## Instructions for retrieving your Google Docs data\n",
+    "By default, the `GoogleDriveLoader` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable. \n",
+    "The `token.json` file is stored in the same directory by default (or at the location given by the `token_path` parameter). Note that `token.json` will be created automatically the first time you use the loader.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a03b9067",
+   "metadata": {},
+   "source": [
+    "You can obtain your folder and document id from the URL:\n",
    "* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",
-    "* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`"
+    "* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n",
+    "\n",
+    "The special value `root` refers to the root folder of your personal drive."
   ]
  },
  {
@@ -33,12 +39,23 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
+    "#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": null,
+   "id": "9bcb6cb1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "folder_id='root'\n",
+    "#folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
   "id": "878928a6-a5ae-4f74-b351-64e3b01733fe",
   "metadata": {
    "tags": []
@@ -50,7 +67,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": null,
   "id": "2216c83f-68e4-4d2f-8ea2-5878fb18bbe7",
   "metadata": {
    "tags": []
@@ -58,174 +75,215 @@
   "outputs": [],
   "source": [
    "loader = GoogleDriveLoader(\n",
-    "    folder_id=\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\",\n",
-    "    # Optional: configure whether to recursively fetch files from subfolders. Defaults to False.\n",
+    "    folder_id=folder_id,\n",
    "    recursive=False,\n",
+    "    num_results=2,  # Maximum number of files to load\n",
    ")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "de5be5d4",
+   "metadata": {},
+   "source": [
+    "By default, files with the following MIME types can be converted to `Document`.\n",
+    "- text/text\n",
+    "- text/plain\n",
+    "- text/html\n",
+    "- text/csv\n",
+    "- text/markdown\n",
+    "- image/png\n",
+    "- image/jpeg\n",
+    "- application/epub+zip\n",
+    "- application/pdf\n",
+    "- application/rtf\n",
+    "- application/vnd.google-apps.document (GDoc)\n",
+    "- application/vnd.google-apps.presentation (GSlide)\n",
+    "- application/vnd.google-apps.spreadsheet (GSheet)\n",
+    "- application/vnd.google.colaboratory (Colab notebook)\n",
+    "- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n",
+    "- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n",
+    "\n",
+    "It's possible to update or customize this list. See the documentation of `GoogleDriveLoader`.\n",
+    "\n",
+    "Note that the corresponding packages must be installed."
+   ]
+  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": null,
+   "id": "1bca45c9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install unstructured"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
   "id": "8f3b6aa0-b45d-4e37-8c50-5bebe70fdb9d",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
-    "docs = loader.load()"
+    "for doc in loader.load():\n",
+    "    print(\"---\")\n",
+    "    print(doc.page_content.strip()[:60]+\"...\")"
   ]
  },
  {
   "cell_type": "markdown",
-   "id": "2721ba8a",
+   "id": "31170e71",
   "metadata": {},
   "source": [
-    "When you pass a `folder_id` by default all files of type document, sheet and pdf are loaded. You can modify this behaviour by passing a `file_types` argument "
+    "# Customize the search pattern\n",
+    "\n",
+    "Any parameter accepted by the Google Drive [`list()`](https://developers.google.com/drive/api/v3/reference/files/list)\n",
+    "API can be set.\n",
+    "\n",
+    "To change the pattern of the Google Drive request, you can use a `PromptTemplate()`.\n",
+    "The variables for the prompt can be set with `kwargs` in the constructor.\n",
+    "Some preformatted requests are provided (they use `{query}`, `{folder_id}` and/or `{mime_type}`).\n",
+    "\n",
+    "You can customize the criteria used to select the files. A set of predefined filters is provided:\n",
+    "| template                               | description                                                           |\n",
+    "| -------------------------------------- | --------------------------------------------------------------------- |\n",
+    "| gdrive-all-in-folder                   | Return all compatible files from a `folder_id`                        |\n",
+    "| gdrive-query                           | Search `query` in all drives                                          |\n",
+    "| gdrive-by-name                         | Search files with name `query`                                        |\n",
+    "| gdrive-query-in-folder                 | Search `query` in `folder_id` (and sub-folders if `recursive=True`)   |\n",
+    "| gdrive-mime-type                       | Search a specific `mime_type`                                         |\n",
+    "| gdrive-mime-type-in-folder             | Search a specific `mime_type` in `folder_id`                          |\n",
+    "| gdrive-query-with-mime-type            | Search `query` with a specific `mime_type`                            |\n",
+    "| gdrive-query-with-mime-type-and-folder | Search `query` with a specific `mime_type` and in `folder_id`         |\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
-   "id": "2ff83b4c",
+   "id": "0a47175f",
   "metadata": {},
   "outputs": [],
   "source": [
-    "loader = GoogleDriveLoader(\n",
-    "    folder_id=\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\",\n",
-    "    file_types=[\"document\", \"sheet\"]\n",
-    "    recursive=False\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "d6b80931",
-   "metadata": {},
-   "source": [
-    "## Passing in Optional File Loaders\n",
-    "\n",
-    "When processing files other than Google Docs and Google Sheets, it can be helpful to pass an optional file loader to `GoogleDriveLoader`. If you pass in a file loader, that file loader will be used on documents that do not have a Google Docs or Google Sheets MIME type. Here is an example of how to load an Excel document from Google Drive using a file loader. "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "94207e39",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.document_loaders import GoogleDriveLoader\n",
-    "from langchain.document_loaders import UnstructuredFileIOLoader"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "a15fbee0",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "file_id = \"1x9WBtFPWMEAdjcJzPScRsjpjQvpSo_kz\"\n",
-    "loader = GoogleDriveLoader(\n",
-    "    file_ids=[file_id],\n",
-    "    file_loader_cls=UnstructuredFileIOLoader,\n",
-    "    file_loader_kwargs={\"mode\": \"elements\"},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "98410bda",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "docs = loader.load()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "id": "e3e72221",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "Document(page_content='\\n  \\n    \\n      Team\\n      Location\\n      Stanley Cups\\n    \\n    \\n      Blues\\n      STL\\n      1\\n    \\n    \\n      Flyers\\n      PHI\\n      2\\n    \\n    \\n      Maple Leafs\\n      TOR\\n      13\\n    \\n  \\n', metadata={'filetype': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', 'page_number': 1, 'page_name': 'Stanley Cups', 'text_as_html': '<table border=\"1\" class=\"dataframe\">\\n  <tbody>\\n    <tr>\\n      <td>Team</td>\\n      <td>Location</td>\\n      <td>Stanley Cups</td>\\n    </tr>\\n    <tr>\\n      <td>Blues</td>\\n      <td>STL</td>\\n      <td>1</td>\\n    </tr>\\n    <tr>\\n      <td>Flyers</td>\\n      <td>PHI</td>\\n      <td>2</td>\\n    </tr>\\n    <tr>\\n      <td>Maple Leafs</td>\\n      <td>TOR</td>\\n      <td>13</td>\\n    </tr>\\n  </tbody>\\n</table>', 'category': 'Table', 'source': 'https://drive.google.com/file/d/1aA6L2AR3g0CR-PW03HEZZo4NaVlKpaP7/view'})"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "docs[0]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "238cd06f",
-   "metadata": {},
-   "source": [
-    "You can also process a folder with a mix of files and Google Docs/Sheets using the following pattern:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "id": "0e2d093f",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "folder_id = \"1asMOHY1BqBS84JcRbOag5LOJac74gpmD\"\n",
    "loader = GoogleDriveLoader(\n",
    "    folder_id=folder_id,\n",
-    "    file_loader_cls=UnstructuredFileIOLoader,\n",
-    "    file_loader_kwargs={\"mode\": \"elements\"},\n",
+    "    recursive=False,\n",
+    "    template=\"gdrive-query\",  # Default template to use\n",
+    "    query=\"machine learning\",\n",
+    "    num_results=2,            # Maximum number of files to load\n",
+    "    supportsAllDrives=False,  # GDrive `list()` parameter\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
-   "id": "b35ddcc6",
+   "execution_count": null,
+   "id": "100cf361",
   "metadata": {},
   "outputs": [],
   "source": [
-    "docs = loader.load()"
+    "for doc in loader.load():\n",
+    "    print(\"---\")\n",
+    "    print(doc.page_content.strip()[:60]+\"...\")"
   ]
  },
  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "id": "3cc141e0",
+   "cell_type": "markdown",
+   "id": "74e6e3aa",
   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "Document(page_content='\\n  \\n    \\n      Team\\n      Location\\n      Stanley Cups\\n    \\n    \\n      Blues\\n      STL\\n      1\\n    \\n    \\n      Flyers\\n      PHI\\n      2\\n    \\n    \\n      Maple Leafs\\n      TOR\\n      13\\n    \\n  \\n', metadata={'filetype': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', 'page_number': 1, 'page_name': 'Stanley Cups', 'text_as_html': '<table border=\"1\" class=\"dataframe\">\\n  <tbody>\\n    <tr>\\n      <td>Team</td>\\n      <td>Location</td>\\n      <td>Stanley Cups</td>\\n    </tr>\\n    <tr>\\n      <td>Blues</td>\\n      <td>STL</td>\\n      <td>1</td>\\n    </tr>\\n    <tr>\\n      <td>Flyers</td>\\n      <td>PHI</td>\\n      <td>2</td>\\n    </tr>\\n    <tr>\\n      <td>Maple Leafs</td>\\n      <td>TOR</td>\\n      <td>13</td>\\n    </tr>\\n  </tbody>\\n</table>', 'category': 'Table', 'source': 'https://drive.google.com/file/d/1aA6L2AR3g0CR-PW03HEZZo4NaVlKpaP7/view'})"
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
   "source": [
-    "docs[0]"
+    "You can also define your own pattern with a `PromptTemplate`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
-   "id": "e312268a",
+   "id": "dcf07ff7",
   "metadata": {},
   "outputs": [],
-   "source": []
+   "source": [
+    "from langchain.prompts.prompt import PromptTemplate\n",
+    "loader = GoogleDriveLoader(\n",
+    "    folder_id=folder_id,\n",
+    "    recursive=False,\n",
+    "    template=PromptTemplate(\n",
+    "        input_variables=[\"query\", \"query_name\"],\n",
+    "        template=\"fullText contains '{query}' and name contains '{query_name}' and trashed=false\",\n",
+    "        ),  # Custom template to use\n",
+    "    query=\"machine learning\",\n",
+    "    query_name=\"ML\",\n",
+    "    num_results=2,  # Maximum number of files to load\n",
+    ")\n",
+    "for doc in loader.load():\n",
+    "    print(\"---\")\n",
+    "    print(doc.page_content.strip()[:60]+\"...\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8e404472",
+   "metadata": {},
+   "source": [
+    "# Modes for GSlide and GSheet\n",
+    "\n",
+    "The `mode` parameter accepts the following values:\n",
+    "- `\"document\"`: returns the body of each document\n",
+    "- `\"snippets\"`: returns the `description` of each file.\n",
+    "\n",
+    "\n",
+    "The `gslide_mode` parameter accepts the following values:\n",
+    "- `\"single\"`   : one document with `<PAGE BREAK>` tags\n",
+    "- `\"slide\"`    : one document per slide\n",
+    "- `\"elements\"` : one document for each element."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b33d1a53",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "loader = GoogleDriveLoader(\n",
+    "    template=\"gdrive-mime-type\",\n",
+    "    mime_type=\"application/vnd.google-apps.presentation\", # Only GSlide files\n",
+    "    gslide_mode=\"slide\",\n",
+    "    num_results=2,  # Maximum number of files to load\n",
+    ")\n",
+    "for doc in loader.load():\n",
+    "    print(\"---\")\n",
+    "    print(doc.page_content.strip()[:60]+\"...\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "498f0451",
+   "metadata": {},
+   "source": [
+    "The `gsheet_mode` parameter accepts the following values:\n",
+    "- `\"single\"`: generates one document per line\n",
+    "- `\"elements\"` : one document with a Markdown table and `<PAGE BREAK>` tags."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "884c4ca6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "loader = GoogleDriveLoader(\n",
+    "    template=\"gdrive-mime-type\",\n",
+    "    mime_type=\"application/vnd.google-apps.spreadsheet\", # Only GSheet files\n",
+    "    gsheet_mode=\"elements\",\n",
+    "    num_results=2,  # Maximum number of files to load\n",
+    ")\n",
+    "for doc in loader.load():\n",
+    "    print(\"---\")\n",
+    "    print(doc.page_content.strip()[:60]+\"...\")"
+   ]
  }
 ],
 "metadata": {
@@ -244,7 +302,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.8.13"
+   "version": "3.10.9"
  }
 },
 "nbformat": 4,
--- a/docs/extras/integrations/providers/google_drive.mdx
+++ b/docs/extras/integrations/providers/google_drive.mdx
@@ -2,7 +2,7 @@

 >[Google Drive](https://en.wikipedia.org/wiki/Google_Drive) is a file storage and synchronization service developed by Google.

-Currently, only `Google Docs` are supported.
+The full Google Drive API is supported, including integration with Google Docs, Google Sheets and Google Slides.

 ## Installation and Setup

@@ -20,3 +20,22 @@ See a [usage example and authorizing instructions](/docs/integrations/document_l
 ```python
 from langchain.document_loaders import GoogleDriveLoader
 ```
+
+## Retriever
+
+See a [usage example and authorizing instructions](/docs/modules/data_connection/retrievers/integrations/google_drive.html).
+
+```python
+from langchain.retrievers import GoogleDriveRetriever
+```
+
+## Tools
+
+See a [usage example and authorizing instructions](/docs/modules/agents/tools/integrations/google_drive.html).
+
+```python
+from langchain.tools import GoogleDriveSearchTool
+from langchain.utilities import GoogleDriveAPIWrapper
+```
+
+
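Note on the custom search pattern in the loader notebook above: the `PromptTemplate` there is eventually rendered into a Google Drive search query string. The sketch below illustrates only that rendering step. It uses plain `str.format` as a stand-in for `PromptTemplate` so that it runs without LangChain; this is an assumption for illustration, not the loader's actual implementation. The helper names `escape_drive_value` and `render_drive_query` are hypothetical. The Drive search guide asks for single quotes inside query values to be escaped as `\'`; escaping backslashes as well is a precaution assumed here.

```python
def escape_drive_value(value: str) -> str:
    """Escape a value so it can sit inside single quotes in a Drive query."""
    # Backslashes first (a precaution assumed here), then single quotes.
    return value.replace("\\", "\\\\").replace("'", "\\'")


def render_drive_query(template: str, **variables: str) -> str:
    """Fill a query template, escaping every variable value beforehand."""
    escaped = {name: escape_drive_value(value) for name, value in variables.items()}
    return template.format(**escaped)


# Same pattern as the custom template in the loader notebook.
TEMPLATE = (
    "fullText contains '{query}' and name contains '{query_name}' "
    "and trashed=false"
)

if __name__ == "__main__":
    print(render_drive_query(TEMPLATE, query="machine learning", query_name="ML"))
```

Escaping before substitution keeps a value such as `Valentine's Day` from closing the quoted string early and changing the meaning of the query.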
--- a/docs/extras/integrations/retrievers/google_drive.ipynb
+++ b/docs/extras/integrations/retrievers/google_drive.ipynb
@@ -0,0 +1,279 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "b0ed136e-6983-4893-ae1b-b75753af05f8",
+   "metadata": {},
+   "source": [
+    "# Google Drive Retriever\n",
+    "This notebook covers how to retrieve documents from Google Drive.\n",
+    "\n",
+    "## Prerequisites\n",
+    "\n",
+    "1. Create a Google Cloud project or use an existing project\n",
+    "1. Enable the [Google Drive API](https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com)\n",
+    "1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n",
+    "1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
+    "\n",
+    "## Instructions for retrieving your Google Docs data\n",
+    "By default, the `GoogleDriveRetriever` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable. \n",
+    "The `token.json` file is stored in the same directory by default (or at the location given by the `token_path` parameter). Note that `token.json` will be created automatically the first time you use the retriever.\n",
+    "\n",
+    "`GoogleDriveRetriever` can retrieve a selection of files matching a request.\n",
+    "\n",
+    "By default, if you use a `folder_id`, all the files inside this folder can be retrieved as `Document` objects.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "35b94a93-97de-4af8-9cca-de9ffb7930c3",
+   "metadata": {},
+   "source": [
+    "You can obtain your folder and document id from the URL:\n",
+    "* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",
+    "* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n",
+    "\n",
+    "The special value `root` refers to the root folder of your personal drive."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9c9665c9-a023-4078-9d95-e43021cecb6f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "878928a6-a5ae-4f74-b351-64e3b01733fe",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-05-09T10:45:59.438650905Z",
+     "start_time": "2023-05-09T10:45:57.955900302Z"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from langchain.retrievers import GoogleDriveRetriever"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "755907c2-145d-4f0f-9b15-07a628a2d2d2",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-05-09T10:45:59.442890834Z",
+     "start_time": "2023-05-09T10:45:59.440941528Z"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "folder_id=\"root\"\n",
+    "#folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2216c83f-68e4-4d2f-8ea2-5878fb18bbe7",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-05-09T10:45:59.795842403Z",
+     "start_time": "2023-05-09T10:45:59.445262457Z"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "retriever = GoogleDriveRetriever(\n",
+    "    num_results=2,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fa339ca0-f478-440c-ba80-0e5f41a19ce1",
+   "metadata": {},
+   "source": [
+    "By default, files with the following MIME types can be converted to `Document`.\n",
+    "- text/text\n",
+    "- text/plain\n",
+    "- text/html\n",
+    "- text/csv\n",
+    "- text/markdown\n",
+    "- image/png\n",
+    "- image/jpeg\n",
+    "- application/epub+zip\n",
+    "- application/pdf\n",
+    "- application/rtf\n",
+    "- application/vnd.google-apps.document (GDoc)\n",
+    "- application/vnd.google-apps.presentation (GSlide)\n",
+    "- application/vnd.google-apps.spreadsheet (GSheet)\n",
+    "- application/vnd.google.colaboratory (Colab notebook)\n",
+    "- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n",
+    "- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n",
+    "\n",
+    "It's possible to update or customize this list. See the documentation of `GoogleDriveRetriever`.\n",
+    "\n",
+    "Note that the corresponding packages must be installed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9dadec48",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip install unstructured"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8f3b6aa0-b45d-4e37-8c50-5bebe70fdb9d",
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2023-05-09T10:46:00.990310466Z",
+     "start_time": "2023-05-09T10:45:59.798774595Z"
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "retriever.get_relevant_documents(\"machine learning\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8ff33817-8619-4897-8742-2216b9934d2a",
+   "metadata": {},
+   "source": [
+    "You can customize the criteria used to select the files. A set of predefined filters is provided:\n",
+    "| template                               | description                                                           |\n",
+    "| -------------------------------------- | --------------------------------------------------------------------- |\n",
+    "| gdrive-all-in-folder                   | Return all compatible files from a `folder_id`                        |\n",
+    "| gdrive-query                           | Search `query` in all drives                                          |\n",
+    "| gdrive-by-name                         | Search files with name `query`                                        |\n",
+    "| gdrive-query-in-folder                 | Search `query` in `folder_id` (and sub-folders if `recursive=True`)   |\n",
+    "| gdrive-mime-type                       | Search a specific `mime_type`                                         |\n",
+    "| gdrive-mime-type-in-folder             | Search a specific `mime_type` in `folder_id`                          |\n",
+    "| gdrive-query-with-mime-type            | Search `query` with a specific `mime_type`                            |\n",
+    "| gdrive-query-with-mime-type-and-folder | Search `query` with a specific `mime_type` and in `folder_id`         |"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9977c712-9659-4959-b508-f59cc7d49d44",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "retriever = GoogleDriveRetriever(\n",
+    "    template=\"gdrive-query\", # Search everywhere\n",
+    "    num_results=2,  # But take only 2 documents\n",
+    ")\n",
+    "for doc in retriever.get_relevant_documents(\"machine learning\"):\n",
+    "    print(\"---\")\n",
+    "    print(doc.page_content.strip()[:60]+\"...\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a5a0f3ef-26fb-4a5c-85f0-5aba90b682b1",
+   "metadata": {},
+   "source": [
+    "Alternatively, you can customize the prompt with a specialized `PromptTemplate`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b0bbebde-0487-4d20-9d77-8070e4f0e0d6",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from langchain import PromptTemplate\n",
+    "retriever = GoogleDriveRetriever(\n",
+    "    template=PromptTemplate(input_variables=['query'],\n",
+    "                            # See https://developers.google.com/drive/api/guides/search-files\n",
+    "                            template=\"(fullText contains '{query}') \"\n",
+    "                              \"and mimeType='application/vnd.google-apps.document' \"\n",
+    "                              \"and modifiedTime > '2000-01-01T00:00:00' \"\n",
+    "                              \"and trashed=false\"),\n",
+    "    num_results=2,\n",
+    "    # See https://developers.google.com/drive/api/v3/reference/files/list\n",
+    "    includeItemsFromAllDrives=False,\n",
+    "    supportsAllDrives=False,\n",
+    ")\n",
+    "for doc in retriever.get_relevant_documents(\"machine learning\"):\n",
+    "    print(f\"{doc.metadata['name']}:\")\n",
+    "    print(\"---\")\n",
+    "    print(doc.page_content.strip()[:60]+\"...\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9b6fed29-1666-452e-b677-401613270388",
+   "metadata": {},
+   "source": [
+    "# Use GDrive 'description' metadata\n",
+    "Each Google Drive file has a `description` field in its metadata (see the *details of a file*).\n",
+    "Use the `snippets` mode to return the description of selected files.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "342dbe12-ed83-40f4-8957-0cc8c4609542",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "retriever = GoogleDriveRetriever(\n",
+    "    template='gdrive-mime-type-in-folder',\n",
+    "    folder_id=folder_id,\n",
+    "    mime_type='application/vnd.google-apps.document',  # Only Google Docs\n",
+    "    num_results=2,\n",
+    "    mode='snippets',\n",
+    "    includeItemsFromAllDrives=False,\n",
+    "    supportsAllDrives=False,\n",
+    ")\n",
+    "retriever.get_relevant_documents(\"machine learning\")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
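Note on the folder and document ids used in the retriever notebook above: the notebook reads the id off the URL by hand. A small helper can do the same thing. This is a minimal sketch that covers only the two URL shapes shown in the notebook; the function name `drive_id_from_url` and the character class used for ids (letters, digits, `_`, `-`) are assumptions for illustration, not part of the Google Drive API or of LangChain.

```python
import re
from typing import Optional

# One pattern per URL shape shown in the notebook (folder URL, document URL).
# The allowed id characters are an assumption based on the example ids.
_ID_PATTERNS = (
    re.compile(r"/folders/([A-Za-z0-9_-]+)"),
    re.compile(r"/d/([A-Za-z0-9_-]+)"),
)


def drive_id_from_url(url: str) -> Optional[str]:
    """Return the folder or document id found in a Drive URL, or None."""
    for pattern in _ID_PATTERNS:
        match = pattern.search(url)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    folder_url = "https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5"
    print(drive_id_from_url(folder_url))
```

The returned string can then be passed as `folder_id`, as in the notebook cells; URLs in other shapes simply yield `None`.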
--- a/docs/extras/integrations/toolkits/google_drive.ipynb
+++ b/docs/extras/integrations/toolkits/google_drive.ipynb
@@ -0,0 +1,215 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Google Drive tool\n",
+    "\n",
+    "This notebook walks through connecting LangChain to the Google Drive API.\n",
+    "\n",
+    "## Prerequisites\n",
+    "\n",
+    "1. Create a Google Cloud project or use an existing project\n",
+    "1. Enable the [Google Drive API](https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com)\n",
+    "1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n",
+    "1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
+    "\n",
+    "## Instructions for retrieving your Google Docs data\n",
+    "By default, `GoogleDriveSearchTool` and `GoogleDriveAPIWrapper` expect the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable.\n",
+    "The `token.json` file is stored in the same directory by default (or at the location given by the `token_path` parameter). Note that `token.json` will be created automatically the first time you use the tool.\n",
+    "\n",
+    "`GoogleDriveSearchTool` can retrieve a selection of files matching a request.\n",
+    "\n",
+    "By default, if you use a `folder_id`, all the files inside this folder whose name matches the query can be retrieved as `Document` objects.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can obtain your folder and document id from the URL:\n",
+    "* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",
+    "* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n",
+    "\n",
+    "The special value `root` refers to the root folder of your personal drive."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "folder_id=\"root\"\n",
+    "#folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "By default, files with the following MIME types can be converted to `Document`.\n",
+    "- text/text\n",
+    "- text/plain\n",
+    "- text/html\n",
+    "- text/csv\n",
+    "- text/markdown\n",
+    "- image/png\n",
+    "- image/jpeg\n",
+    "- application/epub+zip\n",
+    "- application/pdf\n",
+    "- application/rtf\n",
+    "- application/vnd.google-apps.document (GDoc)\n",
+    "- application/vnd.google-apps.presentation (GSlide)\n",
+    "- application/vnd.google-apps.spreadsheet (GSheet)\n",
+    "- application/vnd.google.colaboratory (Colab notebook)\n",
+    "- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n",
+    "- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n",
+    "\n",
+    "It's possible to update or customize this list. See the documentation of `GoogleDriveAPIWrapper`.\n",
+    "\n",
+    "Note that the corresponding packages must be installed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip install unstructured"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from langchain.utilities.google_drive import GoogleDriveAPIWrapper\n",
+    "from langchain.tools.google_drive.tool import GoogleDriveSearchTool\n",
+    "\n",
+    "# By default, search only in the filename.\n",
+    "tool = GoogleDriveSearchTool(\n",
+    "    api_wrapper=GoogleDriveAPIWrapper(\n",
+    "        folder_id=folder_id,\n",
+    "        num_results=2,\n",
+    "        template=\"gdrive-query-in-folder\", # Search in the body of documents\n",
+    "    )\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import logging\n",
+    "logging.basicConfig(level=logging.INFO)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tool.run(\"machine learning\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tool.description"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.agents import load_tools\n",
+    "tools = load_tools([\"google-drive-search\"],\n",
+    "                   folder_id=folder_id,\n",
+    "                   template=\"gdrive-query-in-folder\",\n",
+    "                  )"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Use within an Agent"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from langchain import OpenAI\n",
+    "from langchain.agents import initialize_agent, AgentType\n",
+    "llm = OpenAI(temperature=0)\n",
+    "agent = initialize_agent(\n",
+    "    tools=tools,\n",
+    "    llm=llm,\n",
+    "    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "agent.run(\n",
+    "    \"Search in google drive, who is 'Yann LeCun' ?\"\n",
+    ")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
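The notebook above reads the folder and document ids off the URL by hand. The same step can be sketched in code. This is a minimal illustration and not part of this change: the helper name `extract_drive_id` and both regular expressions are assumptions, derived only from the two URL shapes shown in the notebook.

```python
import re
from typing import Optional

# Hypothetical patterns: they cover only the two URL shapes shown in the notebook.
_FOLDER_RE = re.compile(r"/folders/([A-Za-z0-9_-]+)")
_DOCUMENT_RE = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_drive_id(url: str) -> Optional[str]:
    """Return the folder or document id embedded in a Google Drive URL, if any."""
    for pattern in (_FOLDER_RE, _DOCUMENT_RE):
        match = pattern.search(url)
        if match:
            return match.group(1)
    return None


folder_id = extract_drive_id(
    "https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5"
)
document_id = extract_drive_id(
    "https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit"
)
```

Other URL shapes (for example spreadsheets or shared-drive links) are not handled here and would return `None`.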
--- a/libs/langchain/langchain/agents/load_tools.py
+++ b/libs/langchain/langchain/agents/load_tools.py
@@ -57,6 +57,10 @@ from langchain.utilities.wikipedia import WikipediaAPIWrapper
 from langchain.utilities.wolfram_alpha import WolframAlphaAPIWrapper
 from langchain.utilities.openweathermap import OpenWeatherMapAPIWrapper
 from langchain.utilities.dataforseo_api_search import DataForSeoAPIWrapper
+from langchain.tools.google_drive.tool import (
+    GoogleDriveSearchTool,
+    GoogleDriveAPIWrapper,
+)


 def _get_python_repl() -> BaseTool:
@@ -180,6 +184,10 @@ def _get_wolfram_alpha(**kwargs: Any) -> BaseTool:
    return WolframAlphaQueryRun(api_wrapper=WolframAlphaAPIWrapper(**kwargs))


+def _get_google_drive_search(**kwargs: Any) -> BaseTool:
+    return GoogleDriveSearchTool(api_wrapper=GoogleDriveAPIWrapper(**kwargs))
+
+
 def _get_google_search(**kwargs: Any) -> BaseTool:
    return GoogleSearchRun(api_wrapper=GoogleSearchAPIWrapper(**kwargs))

@@ -287,6 +295,15 @@ _EXTRA_LLM_TOOLS: Dict[

 _EXTRA_OPTIONAL_TOOLS: Dict[str, Tuple[Callable[[KwArg(Any)], BaseTool], List[str]]] = {
    "wolfram-alpha": (_get_wolfram_alpha, ["wolfram_alpha_appid"]),
+    "google-drive-search": (
+        _get_google_drive_search,
+        [
+            "gdrive_api_file",
+            "folder_id",
+            "mime_type",
+            "template",
+        ],
+    ),
    "google-search": (_get_google_search, ["google_api_key", "google_cse_id"]),
    "google-search-results-json": (
        _get_google_search_results_json,
--- a/libs/langchain/langchain/document_loaders/__init__.py
+++ b/libs/langchain/langchain/document_loaders/__init__.py
@@ -74,7 +74,7 @@ from langchain.document_loaders.geodataframe import GeoDataFrameLoader
 from langchain.document_loaders.git import GitLoader
 from langchain.document_loaders.gitbook import GitbookLoader
 from langchain.document_loaders.github import GitHubIssuesLoader
-from langchain.document_loaders.googledrive import GoogleDriveLoader
+from langchain.document_loaders.google_drive import GoogleDriveLoader
 from langchain.document_loaders.gutenberg import GutenbergLoader
 from langchain.document_loaders.hn import HNLoader
 from langchain.document_loaders.html import UnstructuredHTMLLoader
--- a/libs/langchain/langchain/document_loaders/google_drive.py
+++ b/libs/langchain/langchain/document_loaders/google_drive.py
@@ -0,0 +1,216 @@
+"""Loads data from Google Drive.
+
+Prerequisites:
+    1. Create a Google Cloud project
+    2. Enable the Google Drive API:
+        https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com
+    3. Authorize credentials for desktop app:
+        https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application
+    4. For service accounts visit
+        https://cloud.google.com/iam/docs/service-accounts-create
+"""  # noqa: E501
+
+import itertools
+import logging
+import os
+import warnings
+from pathlib import Path
+from typing import (
+    Any,
+    Dict,
+    Iterator,
+    List,
+    Optional,
+    Sequence,
+)
+
+from pydantic.class_validators import root_validator
+
+from langchain.base_language import BaseLanguageModel
+from langchain.chains.summarize import load_summarize_chain
+from langchain.document_loaders.base import BaseLoader
+from langchain.prompts import PromptTemplate
+from langchain.schema import Document
+from langchain.utilities.google_drive import (
+    GoogleDriveUtilities,
+    get_template,
+)
+
+logger = logging.getLogger(__name__)
+
+
+class GoogleDriveLoader(BaseLoader, GoogleDriveUtilities):
+    """Loads data from Google Drive."""
+
+    document_ids: Optional[Sequence[str]] = None
+    """A list of ids of google drive documents to load."""
+
+    file_ids: Optional[Sequence[str]] = None
+    """A list of ids of google drive files to load."""
+
+    @root_validator(pre=True)
+    def validate_older_api_and_new_environment_variable(
+        cls, v: Dict[str, Any]
+    ) -> Dict[str, Any]:
+        service_account_key = v.get("service_account_key")
+        credentials_path = v.get("credentials_path")
+        api_file = v.get("gdrive_api_file")
+
+        if service_account_key:
+            warnings.warn(
+                "service_account_key was deprecated. Use GOOGLE_ACCOUNT_FILE env "
+                "variable.",
+                DeprecationWarning,
+            )
+        if credentials_path:
+            warnings.warn(
+                "credentials_path was deprecated. Use GOOGLE_ACCOUNT_FILE env "
+                "variable.",
+                DeprecationWarning,
+            )
+        if service_account_key and credentials_path:
+            raise ValueError("Select only service_account_key or credentials_path")
+
+        folder_id = v.get("folder_id")
+        document_ids = v.get("document_ids")
+        file_ids = v.get("file_ids")
+
+        if folder_id and (document_ids or file_ids):
+            raise ValueError(
+                "Cannot specify both folder_id and document_ids nor "
+                "folder_id and file_ids"
+            )
+
+        # To be compatible with the old approach
+        if not api_file:
+            api_file = (
+                Path(os.environ["GOOGLE_ACCOUNT_FILE"])
+                if "GOOGLE_ACCOUNT_FILE" in os.environ
+                else None
+            )
+            # Deprecated: To be compatible with the old approach of authentication
+            if service_account_key:
+                api_file = service_account_key
+            elif credentials_path:
+                api_file = credentials_path
+            elif not api_file:
+                api_file = Path.home() / ".credentials" / "keys.json"
+            v["gdrive_api_file"] = api_file
+
+        if not v.get("template"):
+            if folder_id:
+                template = get_template("gdrive-all-in-folder")
+            elif "document_ids" in v or "file_ids" in v:
+                template = PromptTemplate(input_variables=[], template="")
+            else:
+                raise ValueError("Use a template")
+            v["template"] = template
+        return v
+
+    def lazy_load(self) -> Iterator[Document]:
+        ids = self.document_ids or self.file_ids
+        if ids:
+            yield from (self.load_document_from_id(_id) for _id in ids)
+        else:
+            yield from self.lazy_get_relevant_documents()
+
+    def load(self) -> List[Document]:
+        return list(self.lazy_load())
+
+
+def lazy_update_description_with_summary(
+    loader: GoogleDriveLoader,
+    llm: BaseLanguageModel,
+    *,
+    force: bool = False,
+    query: str = "",
+    **kwargs: Any,
+) -> Iterator[Document]:
+    """Summarize all documents, and update the GDrive metadata `description`.
+
+    Need `write` access: set scope=["https://www.googleapis.com/auth/drive"].
+
+    Note: For a shortcut, only the shortcut's description is updated; the
+    target file's description is left untouched.
+
+    Args:
+        llm: Language model to use.
+        force: If True, update all files. Otherwise, update only files whose
+            description is empty.
+        query: If possible, the query request.
+        kwargs: Other parameters for the template (verbose, prompt, etc.).
+    """
+    try:
+        from googleapiclient.errors import HttpError
+    except ImportError as e:
+        raise ImportError("Could not import googleapiclient") from e
+
+    if "https://www.googleapis.com/auth/drive" not in loader._creds.scopes:
+        raise ValueError(
+            f"Remove the file 'token.json' and "
+            f"initialize the {loader.__class__.__name__} with "
+            f"scopes=['https://www.googleapis.com/auth/drive']"
+        )
+
+    chain = load_summarize_chain(llm, chain_type="stuff", **kwargs)
+    updated_files = set()  # Never update the same document twice (if it's split)
+    for document in loader.lazy_get_relevant_documents(query, **kwargs):
+        try:
+            file_id = document.metadata["gdriveId"]
+            if file_id not in updated_files:
+                file = loader.files.get(
+                    fileId=file_id,
+                    fields=loader.fields,
+                    supportsAllDrives=True,
+                ).execute()
+                if force or not file.get("description", "").strip():
+                    summary = chain.run([document]).strip()
+                    if summary:
+                        loader.files.update(
+                            fileId=file_id,
+                            supportsAllDrives=True,
+                            body={"description": summary},
+                        ).execute()
+                        logger.info(
+                            f"For the file '{file['name']}', add description "
+                            f"'{summary[:40]}...'"
+                        )
+                        metadata = loader._extract_meta_data(file)
+                        if "summary" in metadata:
+                            del metadata["summary"]
+                        yield Document(page_content=summary, metadata=metadata)
+                updated_files.add(file_id)
+        except HttpError:
+            logger.warning(
+                f"Impossible to update the description of file "
+                f"'{document.metadata['name']}'"
+            )
+
+
+def update_description_with_summary(
+    loader: GoogleDriveLoader,
+    llm: BaseLanguageModel,
+    *,
+    force: bool = False,
+    query: str = "",
+    **kwargs: Any,
+) -> List[Document]:
+    """Summarize all documents, and update the GDrive metadata `description`.
+
+    Need `write` access: set scope=["https://www.googleapis.com/auth/drive"].
+
+    Note: For a shortcut, only the shortcut's description is updated; the
+    target file's description is left untouched.
+
+    Args:
+        llm: Language model to use.
+        force: If True, update all files. Otherwise, update only files whose
+            description is empty.
+        query: If possible, the query request.
+        kwargs: Other parameters for the template (verbose, prompt, etc.).
+    """
+    return list(
+        lazy_update_description_with_summary(
+            loader, llm, force=force, query=query, **kwargs
+        )
+    )
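The order in which the validator in `google_drive.py` picks the credentials file is easy to miss when reading the nested branches. The sketch below restates that precedence as a standalone function. It is an illustration only: the name `resolve_api_file` and the `environ` parameter are hypothetical, it always returns a `Path` (the validator stores the value as given), and it omits the validator's deprecation warnings and the error raised when both deprecated arguments are set.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Union

PathLike = Union[str, Path]


def resolve_api_file(
    gdrive_api_file: Optional[PathLike] = None,
    service_account_key: Optional[PathLike] = None,
    credentials_path: Optional[PathLike] = None,
    environ: Mapping[str, str] = os.environ,
) -> Path:
    """Restate the validator's precedence for the credentials file (illustrative)."""
    # 1. An explicit gdrive_api_file always wins.
    if gdrive_api_file:
        return Path(gdrive_api_file)
    # 2-3. Deprecated arguments take precedence over the environment variable.
    if service_account_key:
        return Path(service_account_key)
    if credentials_path:
        return Path(credentials_path)
    # 4. Otherwise fall back to GOOGLE_ACCOUNT_FILE if it is set.
    if "GOOGLE_ACCOUNT_FILE" in environ:
        return Path(environ["GOOGLE_ACCOUNT_FILE"])
    # 5. Last resort: the historical default location.
    return Path.home() / ".credentials" / "keys.json"
```

Note that, as in the validator, a deprecated argument overrides `GOOGLE_ACCOUNT_FILE` rather than the other way round.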
--- a/libs/langchain/langchain/document_loaders/googledrive.py
+++ b/libs/langchain/langchain/document_loaders/googledrive.py
@@ -1,353 +1,4 @@
-"""Loads data from Google Drive."""
+"""DEPRECATED: Kept for backwards compatibility."""
+from langchain.document_loaders.google_drive import GoogleDriveLoader

-# Prerequisites:
-# 1. Create a Google Cloud project
-# 2. Enable the Google Drive API:
-#   https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com
-# 3. Authorize credentials for desktop app:
-#   https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application # noqa: E501
-# 4. For service accounts visit
-#   https://cloud.google.com/iam/docs/service-accounts-create
-
-import os
-from pathlib import Path
-from typing import Any, Dict, List, Optional, Sequence, Union
-
-from pydantic import BaseModel, root_validator, validator
-
-from langchain.docstore.document import Document
-from langchain.document_loaders.base import BaseLoader
-
-SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
-
-
-class GoogleDriveLoader(BaseLoader, BaseModel):
-    """Loads Google Docs from Google Drive."""
-
-    service_account_key: Path = Path.home() / ".credentials" / "keys.json"
-    """Path to the service account key file."""
-    credentials_path: Path = Path.home() / ".credentials" / "credentials.json"
-    """Path to the credentials file."""
-    token_path: Path = Path.home() / ".credentials" / "token.json"
-    """Path to the token file."""
-    folder_id: Optional[str] = None
-    """The folder id to load from."""
-    document_ids: Optional[List[str]] = None
-    """The document ids to load from."""
-    file_ids: Optional[List[str]] = None
-    """The file ids to load from."""
-    recursive: bool = False
-    """Whether to load recursively. Only applies when folder_id is given."""
-    file_types: Optional[Sequence[str]] = None
-    """The file types to load. Only applies when folder_id is given."""
-    load_trashed_files: bool = False
-    """Whether to load trashed files. Only applies when folder_id is given."""
-    # NOTE(MthwRobinson) - changing the file_loader_cls to type here currently
-    # results in pydantic validation errors
-    file_loader_cls: Any = None
-    """The file loader class to use."""
-    file_loader_kwargs: Dict["str", Any] = {}
-    """The file loader kwargs to use."""
-
-    @root_validator
-    def validate_inputs(cls, values: Dict[str, Any]) -> Dict[str, Any]:
-        """Validate that either folder_id or document_ids is set, but not both."""
-        if values.get("folder_id") and (
-            values.get("document_ids") or values.get("file_ids")
-        ):
-            raise ValueError(
-                "Cannot specify both folder_id and document_ids nor "
-                "folder_id and file_ids"
-            )
-        if (
-            not values.get("folder_id")
-            and not values.get("document_ids")
-            and not values.get("file_ids")
-        ):
-            raise ValueError("Must specify either folder_id, document_ids, or file_ids")
-
-        file_types = values.get("file_types")
-        if file_types:
-            if values.get("document_ids") or values.get("file_ids"):
-                raise ValueError(
-                    "file_types can only be given when folder_id is given,"
-                    " (not when document_ids or file_ids are given)."
-                )
-            type_mapping = {
-                "document": "application/vnd.google-apps.document",
-                "sheet": "application/vnd.google-apps.spreadsheet",
-                "pdf": "application/pdf",
-            }
-            allowed_types = list(type_mapping.keys()) + list(type_mapping.values())
-            short_names = ", ".join([f"'{x}'" for x in type_mapping.keys()])
-            full_names = ", ".join([f"'{x}'" for x in type_mapping.values()])
-            for file_type in file_types:
-                if file_type not in allowed_types:
-                    raise ValueError(
-                        f"Given file type {file_type} is not supported. "
-                        f"Supported values are: {short_names}; and "
-                        f"their full-form names: {full_names}"
-                    )
-
-            # replace short-form file types by full-form file types
-            def full_form(x: str) -> str:
-                return type_mapping[x] if x in type_mapping else x
-
-            values["file_types"] = [full_form(file_type) for file_type in file_types]
-        return values
-
-    @validator("credentials_path")
-    def validate_credentials_path(cls, v: Any, **kwargs: Any) -> Any:
-        """Validate that credentials_path exists."""
-        if not v.exists():
-            raise ValueError(f"credentials_path {v} does not exist")
-        return v
-
-    def _load_credentials(self) -> Any:
-        """Load credentials."""
-        # Adapted from https://developers.google.com/drive/api/v3/quickstart/python
-        try:
-            from google.auth import default
-            from google.auth.transport.requests import Request
-            from google.oauth2 import service_account
-            from google.oauth2.credentials import Credentials
-            from google_auth_oauthlib.flow import InstalledAppFlow
-        except ImportError:
-            raise ImportError(
-                "You must run "
-                "`pip install --upgrade "
-                "google-api-python-client google-auth-httplib2 "
-                "google-auth-oauthlib` "
-                "to use the Google Drive loader."
-            )
-
-        creds = None
-        if self.service_account_key.exists():
-            return service_account.Credentials.from_service_account_file(
-                str(self.service_account_key), scopes=SCOPES
-            )
-
-        if self.token_path.exists():
-            creds = Credentials.from_authorized_user_file(str(self.token_path), SCOPES)
-
-        if not creds or not creds.valid:
-            if creds and creds.expired and creds.refresh_token:
-                creds.refresh(Request())
-            elif "GOOGLE_APPLICATION_CREDENTIALS" not in os.environ:
-                creds, project = default()
-                creds = creds.with_scopes(SCOPES)
-                # no need to write to file
-                if creds:
-                    return creds
-            else:
-                flow = InstalledAppFlow.from_client_secrets_file(
-                    str(self.credentials_path), SCOPES
-                )
-                creds = flow.run_local_server(port=0)
-            with open(self.token_path, "w") as token:
-                token.write(creds.to_json())
-
-        return creds
-
-    def _load_sheet_from_id(self, id: str) -> List[Document]:
-        """Load a sheet and all tabs from an ID."""
-
-        from googleapiclient.discovery import build
-
-        creds = self._load_credentials()
-        sheets_service = build("sheets", "v4", credentials=creds)
-        spreadsheet = sheets_service.spreadsheets().get(spreadsheetId=id).execute()
-        sheets = spreadsheet.get("sheets", [])
-
-        documents = []
-        for sheet in sheets:
-            sheet_name = sheet["properties"]["title"]
-            result = (
-                sheets_service.spreadsheets()
-                .values()
-                .get(spreadsheetId=id, range=sheet_name)
-                .execute()
-            )
-            values = result.get("values", [])
-
-            header = values[0]
-            for i, row in enumerate(values[1:], start=1):
-                metadata = {
-                    "source": (
-                        f"https://docs.google.com/spreadsheets/d/{id}/"
-                        f"edit?gid={sheet['properties']['sheetId']}"
-                    ),
-                    "title": f"{spreadsheet['properties']['title']} - {sheet_name}",
-                    "row": i,
-                }
-                content = []
-                for j, v in enumerate(row):
-                    title = header[j].strip() if len(header) > j else ""
-                    content.append(f"{title}: {v.strip()}")
-
-                page_content = "\n".join(content)
-                documents.append(Document(page_content=page_content, metadata=metadata))
-
-        return documents
-
-    def _load_document_from_id(self, id: str) -> Document:
-        """Load a document from an ID."""
-        from io import BytesIO
-
-        from googleapiclient.discovery import build
-        from googleapiclient.errors import HttpError
-        from googleapiclient.http import MediaIoBaseDownload
-
-        creds = self._load_credentials()
-        service = build("drive", "v3", credentials=creds)
-
-        file = service.files().get(fileId=id, supportsAllDrives=True).execute()
-        request = service.files().export_media(fileId=id, mimeType="text/plain")
-        fh = BytesIO()
-        downloader = MediaIoBaseDownload(fh, request)
-        done = False
-        try:
-            while done is False:
-                status, done = downloader.next_chunk()
-
-        except HttpError as e:
-            if e.resp.status == 404:
-                print("File not found: {}".format(id))
-            else:
-                print("An error occurred: {}".format(e))
-
-        text = fh.getvalue().decode("utf-8")
-        metadata = {
-            "source": f"https://docs.google.com/document/d/{id}/edit",
-            "title": f"{file.get('name')}",
-        }
-        return Document(page_content=text, metadata=metadata)
-
-    def _load_documents_from_folder(
-        self, folder_id: str, *, file_types: Optional[Sequence[str]] = None
-    ) -> List[Document]:
-        """Load documents from a folder."""
-        from googleapiclient.discovery import build
-
-        creds = self._load_credentials()
-        service = build("drive", "v3", credentials=creds)
-        files = self._fetch_files_recursive(service, folder_id)
-        # If file types filter is provided, we'll filter by the file type.
-        if file_types:
-            _files = [f for f in files if f["mimeType"] in file_types]  # type: ignore
-        else:
-            _files = files
-
-        returns = []
-        for file in _files:
-            if file["trashed"] and not self.load_trashed_files:
-                continue
-            elif file["mimeType"] == "application/vnd.google-apps.document":
-                returns.append(self._load_document_from_id(file["id"]))  # type: ignore
-            elif file["mimeType"] == "application/vnd.google-apps.spreadsheet":
-                returns.extend(self._load_sheet_from_id(file["id"]))  # type: ignore
-            elif (
-                file["mimeType"] == "application/pdf"
-                or self.file_loader_cls is not None
-            ):
-                returns.extend(self._load_file_from_id(file["id"]))  # type: ignore
-            else:
-                pass
-        return returns
-
-    def _fetch_files_recursive(
-        self, service: Any, folder_id: str
-    ) -> List[Dict[str, Union[str, List[str]]]]:
-        """Fetch all files and subfolders recursively."""
-        results = (
-            service.files()
-            .list(
-                q=f"'{folder_id}' in parents",
-                pageSize=1000,
-                includeItemsFromAllDrives=True,
-                supportsAllDrives=True,
-                fields="nextPageToken, files(id, name, mimeType, parents, trashed)",
-            )
-            .execute()
-        )
-        files = results.get("files", [])
-        returns = []
-        for file in files:
-            if file["mimeType"] == "application/vnd.google-apps.folder":
-                if self.recursive:
-                    returns.extend(self._fetch_files_recursive(service, file["id"]))
-            else:
-                returns.append(file)
-
-        return returns
-
-    def _load_documents_from_ids(self) -> List[Document]:
-        """Load documents from a list of IDs."""
-        if not self.document_ids:
-            raise ValueError("document_ids must be set")
-
-        return [self._load_document_from_id(doc_id) for doc_id in self.document_ids]
-
-    def _load_file_from_id(self, id: str) -> List[Document]:
-        """Load a file from an ID."""
-        from io import BytesIO
-
-        from googleapiclient.discovery import build
-        from googleapiclient.http import MediaIoBaseDownload
-
-        creds = self._load_credentials()
-        service = build("drive", "v3", credentials=creds)
-
-        file = service.files().get(fileId=id, supportsAllDrives=True).execute()
-        request = service.files().get_media(fileId=id)
-        fh = BytesIO()
-        downloader = MediaIoBaseDownload(fh, request)
-        done = False
-        while done is False:
-            status, done = downloader.next_chunk()
-
-        if self.file_loader_cls is not None:
-            fh.seek(0)
-            loader = self.file_loader_cls(file=fh, **self.file_loader_kwargs)
-            docs = loader.load()
-            for doc in docs:
-                doc.metadata["source"] = f"https://drive.google.com/file/d/{id}/view"
-            return docs
-
-        else:
-            from PyPDF2 import PdfReader
-
-            content = fh.getvalue()
-            pdf_reader = PdfReader(BytesIO(content))
-
-            return [
-                Document(
-                    page_content=page.extract_text(),
-                    metadata={
-                        "source": f"https://drive.google.com/file/d/{id}/view",
-                        "title": f"{file.get('name')}",
-                        "page": i,
-                    },
-                )
-                for i, page in enumerate(pdf_reader.pages)
-            ]
-
-    def _load_file_from_ids(self) -> List[Document]:
-        """Load files from a list of IDs."""
-        if not self.file_ids:
-            raise ValueError("file_ids must be set")
-        docs = []
-        for file_id in self.file_ids:
-            docs.extend(self._load_file_from_id(file_id))
-        return docs
-
-    def load(self) -> List[Document]:
-        """Load documents."""
-        if self.folder_id:
-            return self._load_documents_from_folder(
-                self.folder_id, file_types=self.file_types
-            )
-        elif self.document_ids:
-            return self._load_documents_from_ids()
-        else:
-            return self._load_file_from_ids()
+__all__ = ["GoogleDriveLoader"]
--- a/libs/langchain/langchain/retrievers/__init__.py
+++ b/libs/langchain/langchain/retrievers/__init__.py
@@ -30,6 +30,7 @@ from langchain.retrievers.ensemble import EnsembleRetriever
 from langchain.retrievers.google_cloud_enterprise_search import (
    GoogleCloudEnterpriseSearchRetriever,
 )
+from langchain.retrievers.google_drive import GoogleDriveRetriever
 from langchain.retrievers.kendra import AmazonKendraRetriever
 from langchain.retrievers.knn import KNNRetriever
 from langchain.retrievers.llama_index import (
@@ -65,6 +66,7 @@ __all__ = [
    "ChaindeskRetriever",
    "ElasticSearchBM25Retriever",
    "GoogleCloudEnterpriseSearchRetriever",
+    "GoogleDriveRetriever",
    "KNNRetriever",
    "LlamaIndexGraphRetriever",
    "LlamaIndexRetriever",
--- a/libs/langchain/langchain/retrievers/google_drive.py
+++ b/libs/langchain/langchain/retrievers/google_drive.py
@@ -0,0 +1,92 @@
+from typing import Any, Dict, List, Literal, Optional
+
+from pydantic.class_validators import root_validator
+from pydantic.config import Extra
+
+from langchain.callbacks.manager import Callbacks
+from langchain.schema import BaseRetriever, Document
+
+from ..utilities.google_drive import (
+    GoogleDriveUtilities,
+    get_template,
+)
+
+
+class GoogleDriveRetriever(GoogleDriveUtilities, BaseRetriever):
+    """Wrapper around Google Drive API.
+
+    The application must be authenticated with a json file.
+    The format may be for a user or for an application via a service account.
+    The environment variable `GOOGLE_ACCOUNT_FILE` may be set to reference this file.
+    For more information, see [here]
+    (https://developers.google.com/workspace/guides/auth-overview).
+    """
+
+    class Config:
+        extra = Extra.allow
+        allow_mutation = False
+        underscore_attrs_are_private = True
+
+    mode: Literal[
+        "snippets", "snippets-markdown", "documents", "documents-markdown"
+    ] = "snippets-markdown"
+
+    @root_validator(pre=True)
+    def validate_template(cls, v: Dict[str, Any]) -> Dict[str, Any]:
+        folder_id = v.get("folder_id")
+
+        if not v.get("template"):
+            if folder_id:
+                template = get_template("gdrive-query-in-folder")
+            else:
+                template = get_template("gdrive-query")
+            v["template"] = template
+        return v
+
+    def get_relevant_documents(
+        self,
+        query: str,
+        *,
+        callbacks: Callbacks = None,
+        tags: Optional[List[str]] = None,
+        metadata: Optional[Dict[str, Any]] = None,
+        **kwargs: Any,
+    ) -> List[Document]:
+        """Get documents relevant for a query.
+
+        Args:
+            query: string to find relevant documents for
+
+        Returns:
+            List of relevant documents
+        """
+        return list(
+            self.lazy_get_relevant_documents(
+                query=query,
+                callbacks=callbacks,
+                tags=tags,
+                metadata=metadata,
+                **kwargs,
+            )
+        )
+
+    async def aget_relevant_documents(
+        self,
+        query: str,
+        *,
+        callbacks: Callbacks = None,
+        tags: Optional[List[str]] = None,
+        metadata: Optional[Dict[str, Any]] = None,
+        **kwargs: Any,
+    ) -> List[Document]:
+        """Get documents relevant for a query.
+
+        NOT IMPLEMENTED
+
+        Args:
+            query: string to find relevant documents for
+
+        Returns:
+            List of relevant documents
+        """
+        raise NotImplementedError("GoogleDriveRetriever does not support async")
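Since `aget_relevant_documents` is not implemented, callers that need to stay inside an event loop could run the blocking call in a worker thread. The following is a generic sketch under stated assumptions, not something this change provides: `search_sync` is a hypothetical stand-in for any blocking call (such as a synchronous retriever), and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
from typing import List


def search_sync(query: str) -> List[str]:
    """Hypothetical stand-in for a blocking call such as a synchronous retriever."""
    return [f"result for {query}"]


async def search_async(query: str) -> List[str]:
    """Run the blocking search in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(search_sync, query)


results = asyncio.run(search_async("machine learning"))
```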
--- a/libs/langchain/langchain/tools/__init__.py
+++ b/libs/langchain/langchain/tools/__init__.py
@@ -45,6 +45,7 @@ from langchain.tools.gmail import (
    GmailSearch,
    GmailSendMessage,
 )
+from langchain.tools.google_drive.tool import GoogleDriveSearchTool
 from langchain.tools.google_places.tool import GooglePlacesTool
 from langchain.tools.google_search.tool import GoogleSearchResults, GoogleSearchRun
 from langchain.tools.google_serper.tool import GoogleSerperResults, GoogleSerperRun
@@ -148,6 +149,7 @@ __all__ = [
    "GmailGetThread",
    "GmailSearch",
    "GmailSendMessage",
+    "GoogleDriveSearchTool",
    "GooglePlacesTool",
    "GoogleSearchResults",
    "GoogleSearchRun",
--- a/libs/langchain/langchain/tools/google_drive/__init__.py
+++ b/libs/langchain/langchain/tools/google_drive/__init__.py
--- a/libs/langchain/langchain/tools/google_drive/tool.py
+++ b/libs/langchain/langchain/tools/google_drive/tool.py
@@ -0,0 +1,41 @@
+import logging
+from typing import Optional
+
+from langchain.callbacks.manager import (
+    AsyncCallbackManagerForToolRun,
+    CallbackManagerForToolRun,
+)
+from langchain.tools import BaseTool
+
+from ...utilities.google_drive import FORMAT_INSTRUCTION, GoogleDriveAPIWrapper
+
+logger = logging.getLogger(__name__)
+
+
+class GoogleDriveSearchTool(BaseTool):
+    """Tool that adds the capability to query the Google Drive search API."""
+
+    name = "Google Drive Search"
+    description = (
+        "A wrapper around Google Drive Search. "
+        "Useful for when you need to find a document in Google Drive. "
+        f"{FORMAT_INSTRUCTION}"
+    )
+    api_wrapper: GoogleDriveAPIWrapper
+
+    def _run(
+        self,
+        query: str,
+        run_manager: Optional[CallbackManagerForToolRun] = None,
+    ) -> str:
+        """Use the tool."""
+        logger.info(f"{query=}")
+        return self.api_wrapper.run(query)
+
+    async def _arun(
+        self,
+        query: str,
+        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
+    ) -> str:
+        """Use the tool asynchronously."""
+        raise NotImplementedError("GoogleDriveSearchTool does not support async")
--- a/libs/langchain/langchain/utilities/__init__.py
+++ b/libs/langchain/langchain/utilities/__init__.py
@@ -11,6 +11,7 @@ from langchain.utilities.bing_search import BingSearchAPIWrapper
 from langchain.utilities.brave_search import BraveSearchWrapper
 from langchain.utilities.duckduckgo_search import DuckDuckGoSearchAPIWrapper
 from langchain.utilities.golden_query import GoldenQueryAPIWrapper
+from langchain.utilities.google_drive import GoogleDriveAPIWrapper
 from langchain.utilities.google_places_api import GooglePlacesAPIWrapper
 from langchain.utilities.google_search import GoogleSearchAPIWrapper
 from langchain.utilities.google_serper import GoogleSerperAPIWrapper
@@ -42,6 +43,7 @@ __all__ = [
    "BraveSearchWrapper",
    "DuckDuckGoSearchAPIWrapper",
    "GoldenQueryAPIWrapper",
+    "GoogleDriveAPIWrapper",
    "GooglePlacesAPIWrapper",
    "GoogleSearchAPIWrapper",
    "GoogleSerperAPIWrapper",
--- a/libs/langchain/langchain/utilities/google_drive.py
+++ b/libs/langchain/langchain/utilities/google_drive.py
--- a/libs/langchain/tests/unit_tests/document_loaders/test_google_drive.py
+++ b/libs/langchain/tests/unit_tests/document_loaders/test_google_drive.py
@@ -0,0 +1,161 @@
+import unittest
+from pathlib import Path
+from unittest.mock import MagicMock
+
+import pytest
+from pytest_mock import MockerFixture
+
+from langchain.document_loaders.google_drive import GoogleDriveLoader
+from tests.unit_tests.llms.fake_llm import FakeLLM
+from tests.unit_tests.utilities.test_google_drive import (
+    gdrive_docs,
+    google_workspace_installed,
+    patch_google_workspace,
+)
+
+
+@pytest.fixture
+def google_workspace(mocker: MockerFixture) -> MagicMock:
+    return patch_google_workspace(
+        mocker, [{"nextPageToken": None, "files": gdrive_docs}]
+    )
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_load_returns_list_of_google_documents_single(
+    google_workspace: MagicMock,
+) -> None:
+    loader = GoogleDriveLoader(
+        api_file=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+        folder_id="999",
+    )
+    assert loader.mode == "documents"  # Check default value
+    assert loader.gsheet_mode == "single"  # Check default value
+    assert loader.gslide_mode == "single"  # Check default value
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_service_account_key(google_workspace: MagicMock) -> None:
+    loader = GoogleDriveLoader(
+        service_account_key=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_service.json",
+        template="gdrive-all-in-folder",
+    )
+    assert (
+        loader.gdrive_api_file
+        == Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_service.json"
+    )
+
+
+# @unittest.skipIf(not google_workspace_installed, "Google api not installed")
+# def test_no_path(mocker,google_workspace) -> None:
+#     import os
+#     mocker.patch.dict(os.environ,{},clear=True)
+#     loader = GoogleDriveLoader(
+#         template="gdrive-all-in-folder",
+#     )
+#     assert loader.gdrive_api_file == Path.home() / ".credentials" / "keys.json"
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_credentials_path(mocker: MockerFixture, google_workspace: MagicMock) -> None:
+    loader = GoogleDriveLoader(
+        credentials_path=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+        template="gdrive-all-in-folder",
+    )
+    assert (
+        loader.gdrive_api_file
+        == Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json"
+    )
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_folder_id(google_workspace: MagicMock) -> None:
+    loader = GoogleDriveLoader(
+        api_file=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+        folder_id="999",
+    )
+    docs = loader.load()
+    assert len(docs) == 3
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_query(google_workspace: MagicMock) -> None:
+    loader = GoogleDriveLoader(
+        api_file=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+        query="",
+        template="gdrive-query",
+    )
+    docs = loader.load()
+    assert len(docs) == 3
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_document_ids(google_workspace: MagicMock) -> None:
+    loader = GoogleDriveLoader(
+        api_file=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+        document_ids=["1", "1"],
+    )
+    docs = loader.load()
+    assert len(docs) == 2
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_files_ids(google_workspace: MagicMock) -> None:
+    loader = GoogleDriveLoader(
+        api_file=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+        file_ids=["1", "2"],
+    )
+    docs = loader.load()
+    assert len(docs) == 2
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_update_description_with_summary(google_workspace: MagicMock) -> None:
+    loader = GoogleDriveLoader(
+        api_file=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+        file_ids=["1", "2"],
+        scopes=["https://www.googleapis.com/auth/drive"],
+    )
+    result = list(
+        loader.lazy_update_description_with_summary(
+            llm=FakeLLM(), force=True, prompt=None, verbose=True, query=""
+        )
+    )
+    assert len(result) == 2
+
+    result = list(
+        loader.lazy_update_description_with_summary(
+            llm=FakeLLM(), force=False, prompt=None, query=""
+        )
+    )
+    assert len(result) == 0
--- a/libs/langchain/tests/unit_tests/retrievers/test_google_drive.py
+++ b/libs/langchain/tests/unit_tests/retrievers/test_google_drive.py
@@ -0,0 +1,53 @@
+import unittest
+from pathlib import Path
+from unittest.mock import MagicMock
+
+import pytest
+from pytest_mock import MockerFixture
+
+from langchain.retrievers.google_drive import GoogleDriveRetriever
+from tests.unit_tests.utilities.test_google_drive import (
+    _text_text,
+    gdrive_docs,
+    google_workspace_installed,
+    patch_google_workspace,
+)
+
+
+@pytest.fixture
+def google_workspace(mocker: MockerFixture) -> MagicMock:
+    return patch_google_workspace(
+        mocker, [{"nextPageToken": None, "files": gdrive_docs}]
+    )
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_get_relevant_documents(
+    mocker: MockerFixture,
+) -> None:
+    patch_google_workspace(mocker, [{"nextPageToken": None, "files": [_text_text]}])
+    retriever = GoogleDriveRetriever(
+        api_file=Path(__file__).parent.parent
+        / "utilities"
+        / "examples"
+        / "gdrive_credentials.json",
+    )
+    docs = retriever.get_relevant_documents("machine learning")
+    assert len(docs) == 1
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_extra_parameters(
+    mocker: MockerFixture,
+) -> None:
+    patch_google_workspace(mocker, [{"nextPageToken": None, "files": [_text_text]}])
+    retriever = GoogleDriveRetriever(
+        template="gdrive-mime-type-in-folders",
+        folder_id="root",
+        mime_type="application/vnd.google-apps.document",  # Only Google Docs
+        num_results=2,
+        mode="snippets",
+        includeItemsFromAllDrives=False,
+        supportsAllDrives=False,
+    )
+    retriever.get_relevant_documents("machine learning")
--- a/libs/langchain/tests/unit_tests/tools/test_google_drive.py
+++ b/libs/langchain/tests/unit_tests/tools/test_google_drive.py
@@ -0,0 +1,40 @@
+import unittest
+from pathlib import Path
+from unittest.mock import MagicMock
+
+import pytest
+from pytest_mock import MockerFixture
+
+from langchain.tools.google_drive.tool import GoogleDriveSearchTool
+from langchain.utilities import GoogleDriveAPIWrapper
+from tests.unit_tests.utilities.test_google_drive import (
+    gdrive_docs,
+    google_workspace_installed,
+    patch_google_workspace,
+)
+
+
+@pytest.fixture
+def google_workspace(mocker: MockerFixture) -> MagicMock:
+    return patch_google_workspace(
+        mocker, [{"nextPageToken": None, "files": gdrive_docs}]
+    )
+
+
+@unittest.skipIf(not google_workspace_installed, "Google api not installed")
+def test_run(google_workspace: MagicMock) -> None:
+    tool = GoogleDriveSearchTool(
+        api_wrapper=GoogleDriveAPIWrapper(
+            api_file=(
+                Path(__file__).parent.parent
+                / "utilities"
+                / "examples"
+                / "gdrive_credentials.json"
+            )
+        )
+    )
+    result = tool._run("machine learning")
+    assert result.startswith(
+        "[vnd.google-apps.document](https://docs.google.com/document/d/1/edit?usp=drivesdk)<br/>\n"
+        "It is a doc summary\n\n"
+    )
--- a/libs/langchain/tests/unit_tests/tools/test_public_api.py
+++ b/libs/langchain/tests/unit_tests/tools/test_public_api.py
@@ -32,6 +32,7 @@ _EXPECTED = [
    "GmailGetThread",
    "GmailSearch",
    "GmailSendMessage",
+    "GoogleDriveSearchTool",
    "GooglePlacesTool",
    "GoogleSearchResults",
    "GoogleSearchRun",
--- a/libs/langchain/tests/unit_tests/utilities/examples/gdrive.gdoc
+++ b/libs/langchain/tests/unit_tests/utilities/examples/gdrive.gdoc
--- a/libs/langchain/tests/unit_tests/utilities/examples/gdrive.gsheet
+++ b/libs/langchain/tests/unit_tests/utilities/examples/gdrive.gsheet
@@ -0,0 +1,161 @@
+{
+  "spreadsheetId": "1iuGLyUDgw6mCjyXnaqNtpXS2-ALJbZ4wq1cWBuCfTRg",
+  "properties": {
+    "title": "vnd.google-apps.spreadsheet",
+    "locale": "fr_FR",
+    "autoRecalc": "ON_CHANGE",
+    "timeZone": "Europe/Paris",
+    "defaultFormat": {
+      "backgroundColor": {
+        "red": 1,
+        "green": 1,
+        "blue": 1
+      },
+      "padding": {
+        "top": 2,
+        "right": 3,
+        "bottom": 2,
+        "left": 3
+      },
+      "verticalAlignment": "BOTTOM",
+      "wrapStrategy": "OVERFLOW_CELL",
+      "textFormat": {
+        "foregroundColor": {},
+        "fontFamily": "arial,sans,sans-serif",
+        "fontSize": 10,
+        "bold": false,
+        "italic": false,
+        "strikethrough": false,
+        "underline": false,
+        "foregroundColorStyle": {
+          "rgbColor": {}
+        }
+      },
+      "backgroundColorStyle": {
+        "rgbColor": {
+          "red": 1,
+          "green": 1,
+          "blue": 1
+        }
+      }
+    },
+    "spreadsheetTheme": {
+      "primaryFontFamily": "Arial",
+      "themeColors": [
+        {
+          "colorType": "TEXT",
+          "color": {
+            "rgbColor": {}
+          }
+        },
+        {
+          "colorType": "BACKGROUND",
+          "color": {
+            "rgbColor": {
+              "red": 1,
+              "green": 1,
+              "blue": 1
+            }
+          }
+        },
+        {
+          "colorType": "ACCENT1",
+          "color": {
+            "rgbColor": {
+              "red": 0.25882354,
+              "green": 0.52156866,
+              "blue": 0.95686275
+            }
+          }
+        },
+        {
+          "colorType": "ACCENT2",
+          "color": {
+            "rgbColor": {
+              "red": 0.91764706,
+              "green": 0.2627451,
+              "blue": 0.20784314
+            }
+          }
+        },
+        {
+          "colorType": "ACCENT3",
+          "color": {
+            "rgbColor": {
+              "red": 0.9843137,
+              "green": 0.7372549,
+              "blue": 0.015686275
+            }
+          }
+        },
+        {
+          "colorType": "ACCENT4",
+          "color": {
+            "rgbColor": {
+              "red": 0.20392157,
+              "green": 0.65882355,
+              "blue": 0.3254902
+            }
+          }
+        },
+        {
+          "colorType": "ACCENT5",
+          "color": {
+            "rgbColor": {
+              "red": 1,
+              "green": 0.42745098,
+              "blue": 0.003921569
+            }
+          }
+        },
+        {
+          "colorType": "ACCENT6",
+          "color": {
+            "rgbColor": {
+              "red": 0.27450982,
+              "green": 0.7411765,
+              "blue": 0.7764706
+            }
+          }
+        },
+        {
+          "colorType": "LINK",
+          "color": {
+            "rgbColor": {
+              "red": 0.06666667,
+              "green": 0.33333334,
+              "blue": 0.8
+            }
+          }
+        }
+      ]
+    }
+  },
+  "sheets": [
+    {
+      "properties": {
+        "sheetId": 0,
+        "title": "Feuille 1",
+        "index": 0,
+        "sheetType": "GRID",
+        "gridProperties": {
+          "rowCount": 1000,
+          "columnCount": 26
+        }
+      }
+    },
+    {
+      "properties": {
+        "sheetId": 831511404,
+        "title": "Feuille 2",
+        "index": 1,
+        "sheetType": "GRID",
+        "gridProperties": {
+          "rowCount": 1000,
+          "columnCount": 26
+        }
+      }
+    }
+  ],
+  "spreadsheetUrl": "https://docs.google.com/spreadsheets/d/1iuGLyUDgw6mCjyXnaqNtpXS2-ALJbZ4wq1cWBuCfTRg/edit?ouid=109055472267306456451"
+}
--- a/libs/langchain/tests/unit_tests/utilities/examples/gdrive.gslide
+++ b/libs/langchain/tests/unit_tests/utilities/examples/gdrive.gslide
--- a/libs/langchain/tests/unit_tests/utilities/examples/gdrive_credentials.json
+++ b/libs/langchain/tests/unit_tests/utilities/examples/gdrive_credentials.json
@@ -0,0 +1,13 @@
+{
+  "installed": {
+    "client_id": "",
+    "project_id": "",
+    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
+    "token_uri": "https://oauth2.googleapis.com/token",
+    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
+    "client_secret": "",
+    "redirect_uris": [
+      "http://localhost"
+    ]
+  }
+}
--- a/libs/langchain/tests/unit_tests/utilities/examples/gdrive_service.json
+++ b/libs/langchain/tests/unit_tests/utilities/examples/gdrive_service.json
@@ -0,0 +1,12 @@
+{
+  "type": "service_account",
+  "project_id": "langchain",
+  "private_key_id": "",
+  "private_key": "",
+  "client_email": "a@a.com",
+  "client_id": "",
+  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
+  "token_uri": "https://oauth2.googleapis.com/token",
+  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
+  "client_x509_cert_url": ""
+}
--- a/libs/langchain/tests/unit_tests/utilities/examples/text.txt
+++ b/libs/langchain/tests/unit_tests/utilities/examples/text.txt
@@ -0,0 +1 @@
+The body of a text file
--- a/libs/langchain/tests/unit_tests/utilities/examples/token.json
+++ b/libs/langchain/tests/unit_tests/utilities/examples/token.json
@@ -0,0 +1,12 @@
+{
+  "token": "MockToken",
+  "refresh_token": "",
+  "token_uri": "https://oauth2.googleapis.com/token",
+  "client_id": "",
+  "client_secret": "",
+  "scopes": [
+    "https://www.googleapis.com/auth/drive.readonly",
+    "https://www.googleapis.com/auth/drive"
+  ],
+  "expiry": "9999-01-01T00:00:00.0Z"
+}
--- a/libs/langchain/tests/unit_tests/utilities/test_google_drive.py
+++ b/libs/langchain/tests/unit_tests/utilities/test_google_drive.py
Author	SHA1	Message	Date
Bagatur	15951239df	wip	2023-08-03 13:26:56 -07:00
Bagatur	6def0a4ed0	Merge branch 'master' into pprados/google_drive	2023-08-03 10:43:43 -07:00
Philippe Prados	80f5e05181	Resync in 3 august	2023-08-03 17:07:47 +02:00
Philippe Prados	7fe77245af	Resynch in 3 august	2023-08-03 12:48:54 +02:00
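
The fixtures under `tests/unit_tests/utilities/examples/` (`gdrive_credentials.json`, `gdrive_service.json`, `token.json`) are plain JSON, so a small check can confirm that each one parses before the mocked tests rely on it. The sketch below is only an illustration: `find_invalid_json` is a hypothetical helper that is not part of this change, and it uses nothing beyond the standard library.

```python
import json
from pathlib import Path
from typing import List, Tuple


def find_invalid_json(paths: List[Path]) -> List[Tuple[Path, str]]:
    """Return (path, error message) for every file that fails to parse as JSON.

    Hypothetical helper for illustration; not part of the langchain code base.
    """
    failures: List[Tuple[Path, str]] = []
    for path in paths:
        try:
            # json.loads rejects syntax slips such as a stray or doubled comma.
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            failures.append((path, f"line {exc.lineno}: {exc.msg}"))
    return failures
```

One possible use is a unit test that globs `*.json` in the examples directory and asserts that the returned list is empty, so a malformed fixture fails fast with the offending line number.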