Files
langchain/docs/versioned_docs/version-0.2.x/integrations/toolkits/airbyte_structured_qa.ipynb
Jacob Lee aff771923a Jacob/new docs (#20570)
Use docusaurus versioning with a callout, merged master as well

@hwchase17 @baskaryan

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Averi Kitsch <akitsch@google.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Martín Gotelli Ferenaz <martingotelliferenaz@gmail.com>
Co-authored-by: Fayfox <admin@fayfox.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Dawson Bauer <105886620+djbauer2@users.noreply.github.com>
Co-authored-by: Ravindu Somawansa <ravindu.somawansa@gmail.com>
Co-authored-by: Dhruv Chawla <43818888+Dominastorm@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: WeichenXu <weichen.xu@databricks.com>
Co-authored-by: Benito Geordie <89472452+benitoThree@users.noreply.github.com>
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
Co-authored-by: Sevin F. Varoglu <sfvaroglu@octoml.ai>
Co-authored-by: MacanPN <martin.triska@gmail.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
Co-authored-by: Hyeongchan Kim <kozistr@gmail.com>
Co-authored-by: sdan <git@sdan.io>
Co-authored-by: Guangdong Liu <liugddx@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: pjb157 <84070455+pjb157@users.noreply.github.com>
Co-authored-by: Eun Hye Kim <ehkim1440@gmail.com>
Co-authored-by: kaijietti <43436010+kaijietti@users.noreply.github.com>
Co-authored-by: Pengcheng Liu <pcliu.fd@gmail.com>
Co-authored-by: Tomer Cagan <tomer@tomercagan.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
2024-04-18 11:10:55 -07:00

122 lines
2.9 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Airbyte Question Answering\n",
"This notebook shows how to do question answering over structured data, in this case using the `AirbyteStripeLoader`.\n",
"\n",
"Vectorstores often have a hard time answering questions that requires computing, grouping and filtering structured data so the high level idea is to use a `pandas` dataframe to help with these types of questions. "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Load data from Stripe using Airbyte. user the `record_handler` paramater to return a JSON from the data loader."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from langchain.agents import AgentType\n",
"from langchain_community.document_loaders.airbyte import AirbyteStripeLoader\n",
"from langchain_experimental.agents import create_pandas_dataframe_agent\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"stream_name = \"customers\"\n",
"config = {\n",
" \"client_secret\": os.getenv(\"STRIPE_CLIENT_SECRET\"),\n",
" \"account_id\": os.getenv(\"STRIPE_ACCOUNT_D\"),\n",
" \"start_date\": \"2023-01-20T00:00:00Z\",\n",
"}\n",
"\n",
"\n",
"def handle_record(record: dict, _id: str):\n",
" return record.data\n",
"\n",
"\n",
"loader = AirbyteStripeLoader(\n",
" config=config,\n",
" record_handler=handle_record,\n",
" stream_name=stream_name,\n",
")\n",
"data = loader.load()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"2. Pass the data to `pandas` dataframe."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame(data)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"3. Pass the dataframe `df` to the `create_pandas_dataframe_agent` and invoke\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"agent = create_pandas_dataframe_agent(\n",
" ChatOpenAI(temperature=0, model=\"gpt-4\"),\n",
" df,\n",
" verbose=True,\n",
" agent_type=AgentType.OPENAI_FUNCTIONS,\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"4. Run the agent"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"output = agent.run(\"How many rows are there?\")"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}