mirror of
				https://github.com/hwchase17/langchain.git
				synced 2025-10-31 07:41:40 +00:00 
			
		
		
		
	# Fixed typos (issues #4818 & #4668 & more typos) - At some places, it said `model = ChatOpenAI(model='gpt-3.5-turbo')` but should be `model = ChatOpenAI(model_name='gpt-3.5-turbo')` - Fixes some other typos Fixes #4818, #4668 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot
		
			
				
	
	
		
			776 lines
		
	
	
		
			23 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
		
	
	
	
	
	
| {
 | |
|  "cells": [
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "5e3cb542-933d-4bf3-a82b-d9d6395a7832",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "source": [
 | |
|     "# Wikibase Agent\n",
 | |
|     "\n",
 | |
|     "This notebook demonstrates a very simple Wikibase agent that generates SPARQL queries. Although this code is intended to work against any\n",
 | |
|     "Wikibase instance, we use http://wikidata.org for testing.\n",
 | |
|     "\n",
 | |
|     "If you are interested in wikibases and sparql, please consider helping to improve this agent. Look [here](https://github.com/donaldziff/langchain-wikibase) for more details and open questions.\n"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "07d42966-7e99-4157-90dc-6704977dcf1b",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "source": [
 | |
|     "## Preliminaries"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "9132f093-c61e-4b8d-abef-91ebef3fc85f",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "source": [
 | |
|     "### API keys and other secrets\n",
 | |
|     "\n",
 | |
|     "We use an `.ini` file, like this: \n",
 | |
|     "```\n",
 | |
|     "[OPENAI]\n",
 | |
|     "OPENAI_API_KEY=xyzzy\n",
 | |
|     "[WIKIDATA]\n",
 | |
|     "WIKIDATA_USER_AGENT_HEADER=argle-bargle\n",
 | |
|     "```"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 1,
 | |
|    "id": "99567dfd-05a7-412f-abf0-9b9f4424acbd",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "outputs": [
 | |
|     {
 | |
|      "data": {
 | |
|       "text/plain": [
 | |
|        "['./secrets.ini']"
 | |
|       ]
 | |
|      },
 | |
|      "execution_count": 1,
 | |
|      "metadata": {},
 | |
|      "output_type": "execute_result"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "import configparser\n",
 | |
|     "config = configparser.ConfigParser()\n",
 | |
|     "config.read('./secrets.ini')"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "332b6658-c978-41ca-a2be-4f8677fecaef",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "source": [
 | |
|     "### OpenAI API Key\n",
 | |
|     "\n",
 | |
|     "An OpenAI API key is required unless you modify the code below to use another LLM provider."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 2,
 | |
|    "id": "dd328ee2-33cc-4e1e-aff7-cc0a2e05e2e6",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "openai_api_key = config['OPENAI']['OPENAI_API_KEY']\n",
 | |
|     "import os\n",
 | |
|     "os.environ.update({'OPENAI_API_KEY': openai_api_key})"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "42a9311b-600d-42bc-b000-2692ef87a213",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "source": [
 | |
|     "### Wikidata user-agent header\n",
 | |
|     "\n",
 | |
|     "Wikidata policy requires a user-agent header. See https://meta.wikimedia.org/wiki/User-Agent_policy. However, at present this policy is not strictly enforced."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 3,
 | |
|    "id": "17ba657e-789d-40e1-b4b7-4f29ba06fe79",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "wikidata_user_agent_header = None if not config.has_section('WIKIDATA') else config['WIKIDATA']['WIKIDATA_USER_AGENT_HEADER']"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "db08d308-050a-4fc8-93c9-8de4ae977ac3",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "### Enable tracing if desired"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 4,
 | |
|    "id": "77d2da08-fccd-4676-b77e-c0e89bf343cb",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "#import os\n",
 | |
|     "#os.environ[\"LANGCHAIN_HANDLER\"] = \"langchain\"\n",
 | |
|     "#os.environ[\"LANGCHAIN_SESSION\"] = \"default\" # Make sure this session actually exists. "
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "3dbc5bfc-48ce-4f90-873c-7336b21300c6",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "# Tools\n",
 | |
|     "\n",
 | |
|     "Three tools are provided for this simple agent:\n",
 | |
|     "* `ItemLookup`: for finding the q-number of an item\n",
 | |
|     "* `PropertyLookup`: for finding the p-number of a property\n",
 | |
|     "* `SparqlQueryRunner`: for running a sparql query"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "1f801b4e-6576-4914-aa4f-6f4c4e3c7924",
 | |
|    "metadata": {
 | |
|     "tags": []
 | |
|    },
 | |
|    "source": [
 | |
|     "## Item and Property lookup\n",
 | |
|     "\n",
 | |
|     "Item and property lookup are implemented in a single function, `vocab_lookup`, using an Elasticsearch endpoint. Not all Wikibase instances have one, but Wikidata does, and that's where we'll start."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 5,
 | |
|    "id": "42d23f0a-1c74-4c9c-85f2-d0e24204e96a",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "def get_nested_value(o: dict, path: list) -> any:\n",
 | |
|     "    current = o\n",
 | |
|     "    for key in path:\n",
 | |
|     "        try:\n",
 | |
|     "            current = current[key]\n",
 | |
|     "        except (KeyError, IndexError, TypeError):\n",
 | |
|     "            return None\n",
 | |
|     "    return current\n",
 | |
|     "\n",
 | |
|     "import requests\n",
 | |
|     "\n",
 | |
|     "from typing import Optional\n",
 | |
|     "\n",
 | |
|     "def vocab_lookup(search: str, entity_type: str = \"item\",\n",
 | |
|     "                 url: str = \"https://www.wikidata.org/w/api.php\",\n",
 | |
|     "                 user_agent_header: Optional[str] = wikidata_user_agent_header,\n",
 | |
|     "                 srqiprofile: Optional[str] = None,\n",
 | |
|     "                ) -> Optional[str]:    \n",
 | |
|     "    headers = {\n",
 | |
|     "        'Accept': 'application/json'\n",
 | |
|     "    }\n",
 | |
|     "    if user_agent_header is not None:\n",
 | |
|     "        headers['User-Agent'] = user_agent_header\n",
 | |
|     "    \n",
 | |
|     "    if entity_type == \"item\":\n",
 | |
|     "        srnamespace = 0\n",
 | |
|     "        srqiprofile = \"classic_noboostlinks\" if srqiprofile is None else srqiprofile\n",
 | |
|     "    elif entity_type == \"property\":\n",
 | |
|     "        srnamespace = 120\n",
 | |
|     "        srqiprofile = \"classic\" if srqiprofile is None else srqiprofile\n",
 | |
|     "    else:\n",
 | |
|     "        raise ValueError(\"entity_type must be either 'property' or 'item'\")          \n",
 | |
|     "    \n",
 | |
|     "    params = {\n",
 | |
|     "        \"action\": \"query\",\n",
 | |
|     "        \"list\": \"search\",\n",
 | |
|     "        \"srsearch\": search,\n",
 | |
|     "        \"srnamespace\": srnamespace,\n",
 | |
|     "        \"srlimit\": 1,\n",
 | |
|     "        \"srqiprofile\": srqiprofile,\n",
 | |
|     "        \"srwhat\": 'text',\n",
 | |
|     "        \"format\": \"json\"\n",
 | |
|     "    }\n",
 | |
|     "    \n",
 | |
|     "    response = requests.get(url, headers=headers, params=params)\n",
 | |
|     "        \n",
 | |
|     "    if response.status_code == 200:\n",
 | |
|     "        title = get_nested_value(response.json(), ['query', 'search', 0, 'title'])\n",
 | |
|     "        if title is None:\n",
 | |
|     "            return f\"I couldn't find any {entity_type} for '{search}'. Please rephrase your request and try again\"\n",
 | |
|     "        # if there is a prefix, strip it off\n",
 | |
|     "        return title.split(':')[-1]\n",
 | |
|     "    else:\n",
 | |
|     "        return \"Sorry, I got an error. Please try again.\""
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 6,
 | |
|    "id": "e52060fa-3614-43fb-894e-54e9b75d1e9f",
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Q4180017\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "print(vocab_lookup(\"Malin 1\"))"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 7,
 | |
|    "id": "b23ab322-b2cf-404e-b36f-2bfc1d79b0d3",
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "P31\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "print(vocab_lookup(\"instance of\", entity_type=\"property\"))"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 8,
 | |
|    "id": "89020cc8-104e-42d0-ac32-885e590de515",
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "I couldn't find any item for 'Ceci n'est pas un q-item'. Please rephrase your request and try again\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "print(vocab_lookup(\"Ceci n'est pas un q-item\"))"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "78d66d8b-0e34-4d3f-a18d-c7284840ac76",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Sparql runner "
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "c6f60069-fbe0-4015-87fb-0e487cd914e7",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "This tool runs a SPARQL query. By default, it queries Wikidata."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 9,
 | |
|    "id": "b5b97a4d-2a39-4993-88d9-e7818c0a2853",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "import requests\n",
 | |
|     "from typing import List, Dict, Any\n",
 | |
|     "import json\n",
 | |
|     "\n",
 | |
|     "def run_sparql(query: str, url='https://query.wikidata.org/sparql',\n",
 | |
|     "               user_agent_header: str = wikidata_user_agent_header) -> str:\n",
 | |
|     "    headers = {\n",
 | |
|     "        'Accept': 'application/json'\n",
 | |
|     "    }\n",
 | |
|     "    if user_agent_header is not None:\n",
 | |
|     "        headers['User-Agent'] = user_agent_header\n",
 | |
|     "\n",
 | |
|     "    response = requests.get(url, headers=headers, params={'query': query, 'format': 'json'})\n",
 | |
|     "\n",
 | |
|     "    if response.status_code != 200:\n",
 | |
|     "        return \"That query failed. Perhaps you could try a different one?\"\n",
 | |
|     "    results = get_nested_value(response.json(),['results', 'bindings'])\n",
 | |
|     "    return json.dumps(results)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 10,
 | |
|    "id": "149722ec-8bc1-4d4f-892b-e4ddbe8444c1",
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "data": {
 | |
|       "text/plain": [
 | |
|        "'[{\"count\": {\"datatype\": \"http://www.w3.org/2001/XMLSchema#integer\", \"type\": \"literal\", \"value\": \"20\"}}]'"
 | |
|       ]
 | |
|      },
 | |
|      "execution_count": 10,
 | |
|      "metadata": {},
 | |
|      "output_type": "execute_result"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "run_sparql(\"SELECT (COUNT(?children) as ?count) WHERE { wd:Q1339 wdt:P40 ?children . }\")"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "9f0302fd-ba35-4acc-ba32-1d7c9295c898",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "# Agent"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "3122a961-9673-4a52-b1cd-7d62fbdf8d96",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Wrap the tools"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 11,
 | |
|    "id": "cc41ae88-2e53-4363-9878-28b26430cb1e",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent, AgentOutputParser\n",
 | |
|     "from langchain.prompts import StringPromptTemplate\n",
 | |
|     "from langchain import OpenAI, LLMChain\n",
 | |
|     "from typing import List, Union\n",
 | |
|     "from langchain.schema import AgentAction, AgentFinish\n",
 | |
|     "import re"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 12,
 | |
|    "id": "2810a3ce-b9c6-47ee-8068-12ca967cd0ea",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# Define which tools the agent can use to answer user queries\n",
 | |
|     "tools = [\n",
 | |
|     "    Tool(\n",
 | |
|     "        name = \"ItemLookup\",\n",
 | |
|     "        func=(lambda x: vocab_lookup(x, entity_type=\"item\")),\n",
 | |
|     "        description=\"useful for when you need to know the q-number for an item\"\n",
 | |
|     "    ),\n",
 | |
|     "    Tool(\n",
 | |
|     "        name = \"PropertyLookup\",\n",
 | |
|     "        func=(lambda x: vocab_lookup(x, entity_type=\"property\")),\n",
 | |
|     "        description=\"useful for when you need to know the p-number for a property\"\n",
 | |
|     "    ),\n",
 | |
|     "    Tool(\n",
 | |
|     "        name = \"SparqlQueryRunner\",\n",
 | |
|     "        func=run_sparql,\n",
 | |
|     "        description=\"useful for getting results from a wikibase\"\n",
 | |
|     "    )    \n",
 | |
|     "]"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "ab0f2778-a195-4a4a-a5b4-c1e809e1fb7b",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Prompts"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 13,
 | |
|    "id": "7bd4ba4f-57d6-4ceb-b932-3cb0d0509a24",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# Set up the base template\n",
 | |
|     "template = \"\"\"\n",
 | |
|     "Answer the following questions by running a sparql query against a wikibase where the p and q items are \n",
 | |
|     "completely unknown to you. You will need to discover the p and q items before you can generate the sparql.\n",
 | |
|     "Do not assume you know the p and q items for any concepts. Always use tools to find all p and q items.\n",
 | |
|     "After you generate the sparql, you should run it. The results will be returned in json. \n",
 | |
|     "Summarize the json results in natural language.\n",
 | |
|     "\n",
 | |
|     "You may assume the following prefixes:\n",
 | |
|     "PREFIX wd: <http://www.wikidata.org/entity/>\n",
 | |
|     "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n",
 | |
|     "PREFIX p: <http://www.wikidata.org/prop/>\n",
 | |
|     "PREFIX ps: <http://www.wikidata.org/prop/statement/>\n",
 | |
|     "\n",
 | |
|     "When generating sparql:\n",
 | |
|     "* Try to avoid \"count\" and \"filter\" queries if possible\n",
 | |
|     "* Never enclose the sparql in back-quotes\n",
 | |
|     "\n",
 | |
|     "You have access to the following tools:\n",
 | |
|     "\n",
 | |
|     "{tools}\n",
 | |
|     "\n",
 | |
|     "Use the following format:\n",
 | |
|     "\n",
 | |
|     "Question: the input question for which you must provide a natural language answer\n",
 | |
|     "Thought: you should always think about what to do\n",
 | |
|     "Action: the action to take, should be one of [{tool_names}]\n",
 | |
|     "Action Input: the input to the action\n",
 | |
|     "Observation: the result of the action\n",
 | |
|     "... (this Thought/Action/Action Input/Observation can repeat N times)\n",
 | |
|     "Thought: I now know the final answer\n",
 | |
|     "Final Answer: the final answer to the original input question\n",
 | |
|     "\n",
 | |
|     "Question: {input}\n",
 | |
|     "{agent_scratchpad}\"\"\"\n"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 14,
 | |
|    "id": "7e8d771a-64bb-4ec8-b472-6a9a40c6dd38",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# Set up a prompt template\n",
 | |
|     "class CustomPromptTemplate(StringPromptTemplate):\n",
 | |
|     "    # The template to use\n",
 | |
|     "    template: str\n",
 | |
|     "    # The list of tools available\n",
 | |
|     "    tools: List[Tool]\n",
 | |
|     "    \n",
 | |
|     "    def format(self, **kwargs) -> str:\n",
 | |
|     "        # Get the intermediate steps (AgentAction, Observation tuples)\n",
 | |
|     "        # Format them in a particular way\n",
 | |
|     "        intermediate_steps = kwargs.pop(\"intermediate_steps\")\n",
 | |
|     "        thoughts = \"\"\n",
 | |
|     "        for action, observation in intermediate_steps:\n",
 | |
|     "            thoughts += action.log\n",
 | |
|     "            thoughts += f\"\\nObservation: {observation}\\nThought: \"\n",
 | |
|     "        # Set the agent_scratchpad variable to that value\n",
 | |
|     "        kwargs[\"agent_scratchpad\"] = thoughts\n",
 | |
|     "        # Create a tools variable from the list of tools provided\n",
 | |
|     "        kwargs[\"tools\"] = \"\\n\".join([f\"{tool.name}: {tool.description}\" for tool in self.tools])\n",
 | |
|     "        # Create a list of tool names for the tools provided\n",
 | |
|     "        kwargs[\"tool_names\"] = \", \".join([tool.name for tool in self.tools])\n",
 | |
|     "        return self.template.format(**kwargs)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 15,
 | |
|    "id": "f97dca78-fdde-4a70-9137-e34a21d14e64",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "prompt = CustomPromptTemplate(\n",
 | |
|     "    template=template,\n",
 | |
|     "    tools=tools,\n",
 | |
|     "    # This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically\n",
 | |
|     "    # This includes the `intermediate_steps` variable because that is needed\n",
 | |
|     "    input_variables=[\"input\", \"intermediate_steps\"]\n",
 | |
|     ")"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "12c57d77-3c1e-4cde-9a83-7d2134392479",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Output parser \n",
 | |
|     "This is unchanged from the LangChain docs."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 16,
 | |
|    "id": "42da05eb-c103-4649-9d20-7143a8880721",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "class CustomOutputParser(AgentOutputParser):\n",
 | |
|     "    \n",
 | |
|     "    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:\n",
 | |
|     "        # Check if agent should finish\n",
 | |
|     "        if \"Final Answer:\" in llm_output:\n",
 | |
|     "            return AgentFinish(\n",
 | |
|     "                # Return values is generally always a dictionary with a single `output` key\n",
 | |
|     "                # It is not recommended to try anything else at the moment :)\n",
 | |
|     "                return_values={\"output\": llm_output.split(\"Final Answer:\")[-1].strip()},\n",
 | |
|     "                log=llm_output,\n",
 | |
|     "            )\n",
 | |
|     "        # Parse out the action and action input\n",
 | |
|     "        regex = r\"Action: (.*?)[\\n]*Action Input:[\\s]*(.*)\"\n",
 | |
|     "        match = re.search(regex, llm_output, re.DOTALL)\n",
 | |
|     "        if not match:\n",
 | |
|     "            raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",
 | |
|     "        action = match.group(1).strip()\n",
 | |
|     "        action_input = match.group(2)\n",
 | |
|     "        # Return the action and action input\n",
 | |
|     "        return AgentAction(tool=action, tool_input=action_input.strip(\" \").strip('\"'), log=llm_output)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 17,
 | |
|    "id": "d2b4d710-8cc9-4040-9269-59cf6c5c22be",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "output_parser = CustomOutputParser()"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "48a758cb-93a7-4555-b69a-896d2d43c6f0",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Specify the LLM"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 18,
 | |
|    "id": "72988c79-8f60-4b0f-85ee-6af32e8de9c2",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "from langchain.chat_models import ChatOpenAI\n",
 | |
|     "llm = ChatOpenAI(model_name=\"gpt-4\", temperature=0)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "95685d14-647a-4e24-ae2c-a8dd1e364921",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Agent and agent executor"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 19,
 | |
|    "id": "13d55765-bfa1-43b3-b7cb-00f52ebe7747",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# LLM chain consisting of the LLM and a prompt\n",
 | |
|     "llm_chain = LLMChain(llm=llm, prompt=prompt)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 20,
 | |
|    "id": "b3f7ac3c-398e-49f9-baed-554f49a191c3",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "tool_names = [tool.name for tool in tools]\n",
 | |
|     "agent = LLMSingleActionAgent(\n",
 | |
|     "    llm_chain=llm_chain, \n",
 | |
|     "    output_parser=output_parser,\n",
 | |
|     "    stop=[\"\\nObservation:\"], \n",
 | |
|     "    allowed_tools=tool_names\n",
 | |
|     ")"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 21,
 | |
|    "id": "65740577-272e-4853-8d47-b87784cfaba0",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "attachments": {},
 | |
|    "cell_type": "markdown",
 | |
|    "id": "66e3d13b-77cf-41d3-b541-b54535c14459",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Run it!"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 22,
 | |
|    "id": "6e97a07c-d7bf-4a35-9ab2-b59ae865c62c",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# If you prefer in-line tracing, uncomment this line\n",
 | |
|     "# agent_executor.agent.llm_chain.verbose = True"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 23,
 | |
|    "id": "a11ca60d-f57b-4fe8-943e-a258e37463c7",
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "\n",
 | |
|       "\n",
 | |
|       "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
 | |
|       "\u001b[32;1m\u001b[1;3mThought: I need to find the Q number for J.S. Bach.\n",
 | |
|       "Action: ItemLookup\n",
 | |
|       "Action Input: J.S. Bach\u001b[0m\n",
 | |
|       "\n",
 | |
|       "Observation:\u001b[36;1m\u001b[1;3mQ1339\u001b[0m\u001b[32;1m\u001b[1;3mI need to find the P number for children.\n",
 | |
|       "Action: PropertyLookup\n",
 | |
|       "Action Input: children\u001b[0m\n",
 | |
|       "\n",
 | |
|       "Observation:\u001b[33;1m\u001b[1;3mP1971\u001b[0m\u001b[32;1m\u001b[1;3mNow I can query the number of children J.S. Bach had.\n",
 | |
|       "Action: SparqlQueryRunner\n",
 | |
|       "Action Input: SELECT ?children WHERE { wd:Q1339 wdt:P1971 ?children }\u001b[0m\n",
 | |
|       "\n",
 | |
|       "Observation:\u001b[38;5;200m\u001b[1;3m[{\"children\": {\"datatype\": \"http://www.w3.org/2001/XMLSchema#decimal\", \"type\": \"literal\", \"value\": \"20\"}}]\u001b[0m\u001b[32;1m\u001b[1;3mI now know the final answer.\n",
 | |
|       "Final Answer: J.S. Bach had 20 children.\u001b[0m\n",
 | |
|       "\n",
 | |
|       "\u001b[1m> Finished chain.\u001b[0m\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "data": {
 | |
|       "text/plain": [
 | |
|        "'J.S. Bach had 20 children.'"
 | |
|       ]
 | |
|      },
 | |
|      "execution_count": 23,
 | |
|      "metadata": {},
 | |
|      "output_type": "execute_result"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "agent_executor.run(\"How many children did J.S. Bach have?\")"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 24,
 | |
|    "id": "d0b42a41-996b-4156-82e4-f0651a87ee34",
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "\n",
 | |
|       "\n",
 | |
|       "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
 | |
|       "\u001b[32;1m\u001b[1;3mThought: To find Hakeem Olajuwon's Basketball-Reference.com NBA player ID, I need to first find his Wikidata item (Q-number) and then query for the relevant property (P-number).\n",
 | |
|       "Action: ItemLookup\n",
 | |
|       "Action Input: Hakeem Olajuwon\u001b[0m\n",
 | |
|       "\n",
 | |
|       "Observation:\u001b[36;1m\u001b[1;3mQ273256\u001b[0m\u001b[32;1m\u001b[1;3mNow that I have Hakeem Olajuwon's Wikidata item (Q273256), I need to find the P-number for the Basketball-Reference.com NBA player ID property.\n",
 | |
|       "Action: PropertyLookup\n",
 | |
|       "Action Input: Basketball-Reference.com NBA player ID\u001b[0m\n",
 | |
|       "\n",
 | |
|       "Observation:\u001b[33;1m\u001b[1;3mP2685\u001b[0m\u001b[32;1m\u001b[1;3mNow that I have both the Q-number for Hakeem Olajuwon (Q273256) and the P-number for the Basketball-Reference.com NBA player ID property (P2685), I can run a SPARQL query to get the ID value.\n",
 | |
|       "Action: SparqlQueryRunner\n",
 | |
|       "Action Input: \n",
 | |
|       "SELECT ?playerID WHERE {\n",
 | |
|       "  wd:Q273256 wdt:P2685 ?playerID .\n",
 | |
|       "}\u001b[0m\n",
 | |
|       "\n",
 | |
|       "Observation:\u001b[38;5;200m\u001b[1;3m[{\"playerID\": {\"type\": \"literal\", \"value\": \"o/olajuha01\"}}]\u001b[0m\u001b[32;1m\u001b[1;3mI now know the final answer\n",
 | |
|       "Final Answer: Hakeem Olajuwon's Basketball-Reference.com NBA player ID is \"o/olajuha01\".\u001b[0m\n",
 | |
|       "\n",
 | |
|       "\u001b[1m> Finished chain.\u001b[0m\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "data": {
 | |
|       "text/plain": [
 | |
|        "'Hakeem Olajuwon\\'s Basketball-Reference.com NBA player ID is \"o/olajuha01\".'"
 | |
|       ]
 | |
|      },
 | |
|      "execution_count": 24,
 | |
|      "metadata": {},
 | |
|      "output_type": "execute_result"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "agent_executor.run(\"What is the Basketball-Reference.com NBA player ID of Hakeem Olajuwon?\")"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": null,
 | |
|    "id": "05fb3a3e-8a9f-482d-bd54-4c6e60ef60dd",
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": []
 | |
|   }
 | |
|  ],
 | |
|  "metadata": {
 | |
|   "kernelspec": {
 | |
|    "display_name": "conda210",
 | |
|    "language": "python",
 | |
|    "name": "conda210"
 | |
|   },
 | |
|   "language_info": {
 | |
|    "codemirror_mode": {
 | |
|     "name": "ipython",
 | |
|     "version": 3
 | |
|    },
 | |
|    "file_extension": ".py",
 | |
|    "mimetype": "text/x-python",
 | |
|    "name": "python",
 | |
|    "nbconvert_exporter": "python",
 | |
|    "pygments_lexer": "ipython3",
 | |
|    "version": "3.9.16"
 | |
|   }
 | |
|  },
 | |
|  "nbformat": 4,
 | |
|  "nbformat_minor": 5
 | |
| }
 |
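
The notebook's prompt asks the model to reply in a Thought/Action/Action Input format, and `CustomOutputParser` turns each reply into either a tool call or a final answer. The sketch below reproduces that parsing step without any langchain dependency, so it can be tried offline. It reuses the notebook's regular expression; the helper name `parse_step` and the plain-tuple return value are assumptions made for illustration (the notebook returns `AgentAction` and `AgentFinish` objects instead).

```python
import re
from typing import Tuple

# Same pattern as the notebook's CustomOutputParser.
ACTION_RE = re.compile(r"Action: (.*?)[\n]*Action Input:[\s]*(.*)", re.DOTALL)


def parse_step(llm_output: str) -> Tuple[str, str, str]:
    """Hypothetical helper: classify one LLM reply.

    Returns ("finish", "", answer) when the reply contains a final answer,
    otherwise ("action", tool_name, tool_input).
    """
    if "Final Answer:" in llm_output:
        answer = llm_output.split("Final Answer:")[-1].strip()
        return ("finish", "", answer)

    match = ACTION_RE.search(llm_output)
    if match is None:
        raise ValueError(f"Could not parse LLM output: `{llm_output}`")

    tool = match.group(1).strip()
    # Strip surrounding spaces and double quotes, as the notebook does.
    tool_input = match.group(2).strip(" ").strip('"')
    return ("action", tool, tool_input)


# A tool-calling reply, shaped like the first step of the J.S. Bach run.
step = parse_step(
    "Thought: I need to find the Q number for J.S. Bach.\n"
    "Action: ItemLookup\n"
    "Action Input: J.S. Bach"
)

# A terminating reply, shaped like the last step of the same run.
final = parse_step(
    "Thought: I now know the final answer\n"
    "Final Answer: J.S. Bach had 20 children."
)

print(step)
print(final)
```

Because the check for `Final Answer:` comes first, a reply that contains both an action and a final answer is treated as finished, which matches the order of checks in the notebook's parser.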