mirror of
https://github.com/hwchase17/langchain.git
synced 2026-04-04 19:35:08 +00:00
Use docusaurus versioning with a callout, merged master as well @hwchase17 @baskaryan --------- Signed-off-by: Weichen Xu <weichen.xu@databricks.com> Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com> Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru> Co-authored-by: Averi Kitsch <akitsch@google.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Martín Gotelli Ferenaz <martingotelliferenaz@gmail.com> Co-authored-by: Fayfox <admin@fayfox.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Dawson Bauer <105886620+djbauer2@users.noreply.github.com> Co-authored-by: Ravindu Somawansa <ravindu.somawansa@gmail.com> Co-authored-by: Dhruv Chawla <43818888+Dominastorm@users.noreply.github.com> Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: WeichenXu <weichen.xu@databricks.com> Co-authored-by: Benito Geordie <89472452+benitoThree@users.noreply.github.com> Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com> Co-authored-by: Kartik Sarangmath <kartik@thirdai.com> Co-authored-by: Sevin F. Varoglu <sfvaroglu@octoml.ai> Co-authored-by: MacanPN <martin.triska@gmail.com> Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com> Co-authored-by: Hyeongchan Kim <kozistr@gmail.com> Co-authored-by: sdan <git@sdan.io> Co-authored-by: Guangdong Liu <liugddx@gmail.com> Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: pjb157 <84070455+pjb157@users.noreply.github.com> Co-authored-by: Eun Hye Kim <ehkim1440@gmail.com> Co-authored-by: kaijietti <43436010+kaijietti@users.noreply.github.com> Co-authored-by: Pengcheng Liu <pcliu.fd@gmail.com> Co-authored-by: Tomer Cagan <tomer@tomercagan.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
216 lines
6.0 KiB
Plaintext
216 lines
6.0 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "1f3a5ebf",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Airbyte Gong (Deprecated)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "35ac77b1-449b-44f7-b8f3-3494d55c286e",
|
|
"metadata": {},
|
|
"source": [
|
|
"Note: This connector-specific loader is deprecated. Please use [`AirbyteLoader`](/docs/integrations/document_loaders/airbyte) instead.\n",
|
|
"\n",
|
|
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
|
|
"\n",
|
|
"This loader exposes the Gong connector as a document loader, allowing you to load various Gong objects as documents."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "6847a40c",
|
|
"metadata": {},
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "3b06fbde",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Installation"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "e3e9dc79",
|
|
"metadata": {},
|
|
"source": [
|
|
"First, you need to install the `airbyte-source-gong` python package."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "4d35e4e0",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%pip install --upgrade --quiet airbyte-source-gong"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ae855210",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Example"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "02208f52",
|
|
"metadata": {},
|
|
"source": [
|
|
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/gong/) for details about how to configure the reader.\n",
|
|
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-gong/source_gong/spec.yaml](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-gong/source_gong/spec.yaml).\n",
|
|
"\n",
|
|
"The general shape looks like this:\n",
|
|
"```python\n",
|
|
"{\n",
|
|
" \"access_key\": \"<access key name>\",\n",
|
|
" \"access_key_secret\": \"<access key secret>\",\n",
|
|
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
|
|
"}\n",
|
|
"```\n",
|
|
"\n",
|
|
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "89a99e58",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from langchain_community.document_loaders.airbyte import AirbyteGongLoader\n",
|
|
"\n",
|
|
"config = {\n",
|
|
" # your gong configuration\n",
|
|
"}\n",
|
|
"\n",
|
|
"loader = AirbyteGongLoader(\n",
|
|
" config=config, stream_name=\"calls\"\n",
|
|
") # check the documentation linked above for a list of all streams"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "2cea23fc",
|
|
"metadata": {},
|
|
"source": [
|
|
"Now you can load documents the usual way"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "dae75cdb",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"docs = loader.load()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "4a93dc2a",
|
|
"metadata": {},
|
|
"source": [
|
|
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "1782db09",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"docs_iterator = loader.lazy_load()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "3a124086",
|
|
"metadata": {},
|
|
"source": [
|
|
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To process documents, create a class inheriting from the base loader and implement the `_handle_records` method yourself:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "5671395d",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from langchain.docstore.document import Document\n",
|
|
"\n",
|
|
"\n",
|
|
"def handle_record(record, id):\n",
|
|
" return Document(page_content=record.data[\"title\"], metadata=record.data)\n",
|
|
"\n",
|
|
"\n",
|
|
"loader = AirbyteGongLoader(\n",
|
|
" config=config, record_handler=handle_record, stream_name=\"calls\"\n",
|
|
")\n",
|
|
"docs = loader.load()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "223eb8bc",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Incremental loads\n",
|
|
"\n",
|
|
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
|
|
"\n",
|
|
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "7061e735",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"last_state = loader.last_state # store safely\n",
|
|
"\n",
|
|
"incremental_loader = AirbyteGongLoader(\n",
|
|
" config=config, stream_name=\"calls\", state=last_state\n",
|
|
")\n",
|
|
"\n",
|
|
"new_docs = incremental_loader.load()"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.10.6"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|