Vwp/docs improved document loaders (#4006)

Huge thanks to @leo-gan for improving the document loaders notebooks

---------

Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
This commit is contained in:
Zander Chase
2023-05-02 15:24:53 -07:00
committed by GitHub
parent 1c68cbdb28
commit aa38355999
57 changed files with 1227 additions and 779 deletions

@@ -5,7 +5,22 @@
"id": "1f3a5ebf",
"metadata": {},
"source": [
"# Airbyte JSON\n",
"# Airbyte JSON"
]
},
{
"cell_type": "markdown",
"id": "35ac77b1-449b-44f7-b8f3-3494d55c286e",
"metadata": {},
"source": [
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases."
]
},
{
"cell_type": "markdown",
"id": "1fe72234-3110-4c07-a766-3dc505dd25cc",
"metadata": {},
"source": [
"This covers how to load any source from Airbyte into a local JSON file that can be read in as a document\n",
"\n",
"Prereqs:\n",
@@ -25,7 +40,7 @@
"\n",
"6) Set destination as Local JSON, with specified destination path - lets say `/json_data`. Set up manual sync.\n",
"\n",
"7) Run the connection!\n",
"7) Run the connection.\n",
"\n",
"7) To see what files are create, you can navigate to: `file:///tmp/airbyte_local`\n",
"\n",
@@ -52,7 +67,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"_airbyte_raw_pokemon.jsonl\r\n"
"_airbyte_raw_pokemon.jsonl\n"
]
}
],