Files
langchain/docs/versioned_docs/version-0.2.x/integrations/toolkits/csv.ipynb
Jacob Lee aff771923a Jacob/new docs (#20570)
Use docusaurus versioning with a callout, merged master as well

@hwchase17 @baskaryan

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Averi Kitsch <akitsch@google.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Martín Gotelli Ferenaz <martingotelliferenaz@gmail.com>
Co-authored-by: Fayfox <admin@fayfox.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Dawson Bauer <105886620+djbauer2@users.noreply.github.com>
Co-authored-by: Ravindu Somawansa <ravindu.somawansa@gmail.com>
Co-authored-by: Dhruv Chawla <43818888+Dominastorm@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: WeichenXu <weichen.xu@databricks.com>
Co-authored-by: Benito Geordie <89472452+benitoThree@users.noreply.github.com>
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
Co-authored-by: Sevin F. Varoglu <sfvaroglu@octoml.ai>
Co-authored-by: MacanPN <martin.triska@gmail.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
Co-authored-by: Hyeongchan Kim <kozistr@gmail.com>
Co-authored-by: sdan <git@sdan.io>
Co-authored-by: Guangdong Liu <liugddx@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: pjb157 <84070455+pjb157@users.noreply.github.com>
Co-authored-by: Eun Hye Kim <ehkim1440@gmail.com>
Co-authored-by: kaijietti <43436010+kaijietti@users.noreply.github.com>
Co-authored-by: Pengcheng Liu <pcliu.fd@gmail.com>
Co-authored-by: Tomer Cagan <tomer@tomercagan.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
2024-04-18 11:10:55 -07:00

302 lines
7.3 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"id": "7094e328",
"metadata": {},
"source": [
"# CSV\n",
"\n",
"This notebook shows how to use agents to interact with data in `CSV` format. It is mostly optimized for question answering.\n",
"\n",
"**NOTE: this agent calls the Pandas DataFrame agent under the hood, which in turn calls the Python agent, which executes LLM generated Python code - this can be bad if the LLM generated Python code is harmful. Use cautiously.**\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "caae0bec",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents.agent_types import AgentType\n",
"from langchain_experimental.agents.agent_toolkits import create_csv_agent\n",
"from langchain_openai import ChatOpenAI, OpenAI"
]
},
{
"cell_type": "markdown",
"id": "bd806175",
"metadata": {},
"source": [
"## Using `ZERO_SHOT_REACT_DESCRIPTION`\n",
"\n",
"This shows how to initialize the agent using the `ZERO_SHOT_REACT_DESCRIPTION` agent type."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a1717204",
"metadata": {},
"outputs": [],
"source": [
"agent = create_csv_agent(\n",
" OpenAI(temperature=0),\n",
" \"titanic.csv\",\n",
" verbose=True,\n",
" agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c31bb8a6",
"metadata": {},
"source": [
"## Using OpenAI Functions\n",
"\n",
"This shows how to initialize the agent using the OPENAI_FUNCTIONS agent type. Note that this is an alternative to the above."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "16c4dc59",
"metadata": {},
"outputs": [],
"source": [
"agent = create_csv_agent(\n",
" ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\"),\n",
" \"titanic.csv\",\n",
" verbose=True,\n",
" agent_type=AgentType.OPENAI_FUNCTIONS,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "46b9489d",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Error in on_chain_start callback: 'name'\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m\n",
"Invoking: `python_repl_ast` with `df.shape[0]`\n",
"\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3m891\u001b[0m\u001b[32;1m\u001b[1;3mThere are 891 rows in the dataframe.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'There are 891 rows in the dataframe.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"how many rows are there?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a96309be",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Error in on_chain_start callback: 'name'\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m\n",
"Invoking: `python_repl_ast` with `df[df['SibSp'] > 3]['PassengerId'].count()`\n",
"\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3m30\u001b[0m\u001b[32;1m\u001b[1;3mThere are 30 people in the dataframe who have more than 3 siblings.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'There are 30 people in the dataframe who have more than 3 siblings.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"how many people have more than 3 siblings\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "964a09f7",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Error in on_chain_start callback: 'name'\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m\n",
"Invoking: `python_repl_ast` with `import pandas as pd\n",
"import math\n",
"\n",
"# Create a dataframe\n",
"data = {'Age': [22, 38, 26, 35, 35]}\n",
"df = pd.DataFrame(data)\n",
"\n",
"# Calculate the average age\n",
"average_age = df['Age'].mean()\n",
"\n",
"# Calculate the square root of the average age\n",
"square_root = math.sqrt(average_age)\n",
"\n",
"square_root`\n",
"\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3m5.585696017507576\u001b[0m\u001b[32;1m\u001b[1;3mThe square root of the average age is approximately 5.59.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The square root of the average age is approximately 5.59.'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"whats the square root of the average age?\")"
]
},
{
"cell_type": "markdown",
"id": "09539c18",
"metadata": {},
"source": [
"### Multi CSV Example\n",
"\n",
"This next part shows how the agent can interact with multiple csv files passed in as a list."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "15f11fbd",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Error in on_chain_start callback: 'name'\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32;1m\u001b[1;3m\n",
"Invoking: `python_repl_ast` with `df1['Age'].nunique() - df2['Age'].nunique()`\n",
"\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3m-1\u001b[0m\u001b[32;1m\u001b[1;3mThere is 1 row in the age column that is different between the two dataframes.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'There is 1 row in the age column that is different between the two dataframes.'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent = create_csv_agent(\n",
" ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-0613\"),\n",
" [\"titanic.csv\", \"titanic_age_fillna.csv\"],\n",
" verbose=True,\n",
" agent_type=AgentType.OPENAI_FUNCTIONS,\n",
")\n",
"agent.run(\"how many rows in the age column are different between the two dfs?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f2909808",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}