{ "cells": [ { "cell_type": "markdown", "id": "e78b7bb1", "metadata": {}, "source": [ "# Data Augmented Question Answering\n", "\n", "This notebook uses some generic prompts/language models to evaluate an question answering system that uses other sources of data besides what is in the model. For example, this can be used to evaluate a question answering system over your propritary data.\n", "\n", "## Setup\n", "Let's set up an example with our favorite example - the state of the union address." ] }, { "cell_type": "code", "execution_count": 1, "id": "ab4a6931", "metadata": {}, "outputs": [], "source": [ "from langchain.embeddings.openai import OpenAIEmbeddings\n", "from langchain.vectorstores.faiss import FAISS\n", "from langchain.text_splitter import CharacterTextSplitter\n", "from langchain import OpenAI, VectorDBQA" ] }, { "cell_type": "code", "execution_count": 2, "id": "4fdc211d", "metadata": {}, "outputs": [], "source": [ "with open('../state_of_the_union.txt') as f:\n", " state_of_the_union = f.read()\n", "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n", "texts = text_splitter.split_text(state_of_the_union)\n", "\n", "embeddings = OpenAIEmbeddings()\n", "docsearch = FAISS.from_texts(texts, embeddings)\n", "qa = VectorDBQA.from_llm(llm=OpenAI(), vectorstore=docsearch)" ] }, { "cell_type": "markdown", "id": "30fd72f2", "metadata": {}, "source": [ "## Examples\n", "Now we need some examples to evaluate. We can do this in two ways:\n", "\n", "1. Hard code some examples ourselves\n", "2. Generate examples automatically, using a language model" ] }, { "cell_type": "code", "execution_count": 3, "id": "3459b001", "metadata": {}, "outputs": [], "source": [ "# Hard-coded examples\n", "examples = [\n", " {\n", " \"query\": \"What did the president say about Ketanji Brown Jackson\",\n", " \"answer\": \"He praised her legal ability and said he nominated her for the supreme court.\"\n", " },\n", " {\n", " \"query\": \"What did the president say about Michael Jackson\",\n", " \"answer\": \"Nothing\"\n", " }\n", "]" ] }, { "cell_type": "code", "execution_count": 4, "id": "b9c3fa75", "metadata": {}, "outputs": [], "source": [ "# Generated examples\n", "from langchain.evaluation.qa import QAGenerateChain\n", "example_gen_chain = QAGenerateChain.from_llm(OpenAI())" ] }, { "cell_type": "code", "execution_count": 5, "id": "c24543a9", "metadata": {}, "outputs": [], "source": [ "new_examples = example_gen_chain.apply_and_parse([{\"doc\": t} for t in texts[:5]])" ] }, { "cell_type": "code", "execution_count": 6, "id": "a2d27560", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'query': 'What did Vladimir Putin seek to do according to the document?',\n", " 'answer': 'Vladimir Putin sought to shake the foundations of the free world and make it bend to his menacing ways.'},\n", " {'query': 'What did President Zelenskyy say in his speech to the European Parliament?',\n", " 'answer': 'President Zelenskyy said \"Light will win over darkness.\"'},\n", " {'query': \"How many countries joined the European Union in opposing Putin's attack on Ukraine?\",\n", " 'answer': '27'},\n", " {'query': 'What is the U.S. Department of Justice assembling in response to the Russian oligarchs?',\n", " 'answer': 'A dedicated task force.'},\n", " {'query': 'How much direct assistance is the US providing to Ukraine?',\n", " 'answer': 'The US is providing more than $1 Billion in direct assistance to Ukraine.'}]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new_examples" ] }, { "cell_type": "code", "execution_count": 7, "id": "558da6f3", "metadata": {}, "outputs": [], "source": [ "# Combine examples\n", "examples += new_examples" ] }, { "cell_type": "markdown", "id": "443dc34e", "metadata": {}, "source": [ "## Evaluate\n", "Now that we have examples, we can use the question answering evaluator to evaluate our question answering chain." ] }, { "cell_type": "code", "execution_count": 8, "id": "782169a5", "metadata": {}, "outputs": [], "source": [ "from langchain.evaluation.qa import QAEvalChain" ] }, { "cell_type": "code", "execution_count": 9, "id": "1bb77416", "metadata": {}, "outputs": [], "source": [ "predictions = qa.apply(examples)" ] }, { "cell_type": "code", "execution_count": 10, "id": "bcd0ad7f", "metadata": {}, "outputs": [], "source": [ "llm = OpenAI(temperature=0)\n", "eval_chain = QAEvalChain.from_llm(llm)" ] }, { "cell_type": "code", "execution_count": 11, "id": "2e6af79a", "metadata": {}, "outputs": [], "source": [ "graded_outputs = eval_chain.evaluate(examples, predictions)" ] }, { "cell_type": "code", "execution_count": 12, "id": "32fac2dc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'text': ' CORRECT'},\n", " {'text': ' CORRECT'},\n", " {'text': ' INCORRECT'},\n", " {'text': ' CORRECT'},\n", " {'text': ' CORRECT'},\n", " {'text': ' CORRECT'},\n", " {'text': ' CORRECT'}]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "graded_outputs" ] }, { "cell_type": "code", "execution_count": null, "id": "0bb9bc7e", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" } }, "nbformat": 4, "nbformat_minor": 5 }