langchain/docs/examples/evaluation/data_augmented_question_answering.ipynb
2022-12-26 11:36:49 -05:00


{
"cells": [
{
"cell_type": "markdown",
"id": "e78b7bb1",
"metadata": {},
"source": [
"# Data Augmented Question Answering\n",
"\n",
"This notebook uses some generic prompts/language models to evaluate a question answering system that uses other sources of data besides what is in the model. For example, this can be used to evaluate a question answering system over your proprietary data.\n",
"\n",
"## Setup\n",
"Let's walk through an example using a favorite data source: the State of the Union address."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ab4a6931",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores.faiss import FAISS\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain import OpenAI, VectorDBQA"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "4fdc211d",
"metadata": {},
"outputs": [],
"source": [
"with open('../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"docsearch = FAISS.from_texts(texts, embeddings)\n",
"qa = VectorDBQA.from_llm(llm=OpenAI(), vectorstore=docsearch)"
]
},
{
"cell_type": "markdown",
"id": "30fd72f2",
"metadata": {},
"source": [
"## Examples\n",
"Now we need some examples to evaluate against. We can get these in two ways:\n",
"\n",
"1. Hard code some examples ourselves\n",
"2. Generate examples automatically, using a language model"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3459b001",
"metadata": {},
"outputs": [],
"source": [
"# Hard-coded examples\n",
"examples = [\n",
" {\n",
" \"query\": \"What did the president say about Ketanji Brown Jackson\",\n",
" \"answer\": \"He praised her legal ability and said he nominated her for the supreme court.\"\n",
" },\n",
" {\n",
" \"query\": \"What did the president say about Michael Jackson\",\n",
" \"answer\": \"Nothing\"\n",
" }\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b9c3fa75",
"metadata": {},
"outputs": [],
"source": [
"# Generated examples\n",
"from langchain.evaluation.qa import QAGenerateChain\n",
"example_gen_chain = QAGenerateChain.from_llm(OpenAI())"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "c24543a9",
"metadata": {},
"outputs": [],
"source": [
"new_examples = example_gen_chain.apply_and_parse([{\"doc\": t} for t in texts[:5]])"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a2d27560",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'query': \"Who did Russia's Vladimir Putin miscalculate when he attempted to shake the foundations of the free world?\",\n",
" 'answer': 'Putin miscalculated the Ukrainian people.'},\n",
" {'query': 'What did President Zelenskyy say in his speech to the European Parliament?',\n",
" 'answer': '\"Light will win over darkness.\"'},\n",
" {'query': 'What countries joined the coalition to confront Putin?',\n",
" 'answer': 'France, Germany, Italy, the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland.'},\n",
" {'query': 'What measures is the US taking to punish Russia and its oligarchs?',\n",
" 'answer': \"The US is enforcing powerful economic sanctions, cutting off Russia's largest banks from the international financial system, preventing the Russian Ruble from being defended, choking off Russia's access to technology, and joining with European allies to seize yachts, luxury apartments, and private jets.\"},\n",
" {'query': 'What is the total amount of direct assistance being provided to Ukraine?',\n",
" 'answer': '$1 Billion'}]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new_examples"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "558da6f3",
"metadata": {},
"outputs": [],
"source": [
"# Combine examples\n",
"examples += new_examples"
]
},
{
"cell_type": "markdown",
"id": "443dc34e",
"metadata": {},
"source": [
"## Evaluate\n",
"Now that we have examples, we can use the question answering evaluator to evaluate our question answering chain."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "782169a5",
"metadata": {},
"outputs": [],
"source": [
"from langchain.evaluation.qa import QAEvalChain"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "1bb77416",
"metadata": {},
"outputs": [],
"source": [
"predictions = qa.apply(examples)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "bcd0ad7f",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"eval_chain = QAEvalChain.from_llm(llm)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "2e6af79a",
"metadata": {},
"outputs": [],
"source": [
"graded_outputs = eval_chain.evaluate(examples, predictions)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "32fac2dc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Example 0:\n",
"Question: What did the president say about Ketanji Brown Jackson\n",
"Real Answer: He praised her legal ability and said he nominated her for the supreme court.\n",
"Predicted Answer: The president said that Ketanji Brown Jackson is one of the nation's top legal minds and is a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He said she is a consensus builder and has received a broad range of support since being nominated.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 1:\n",
"Question: What did the president say about Michael Jackson\n",
"Real Answer: Nothing\n",
"Predicted Answer: The president did not mention Michael Jackson.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 2:\n",
"Question: Who did Russia's Vladimir Putin miscalculate when he attempted to shake the foundations of the free world?\n",
"Real Answer: Putin miscalculated the Ukrainian people.\n",
"Predicted Answer: Putin miscalculated the strength of the Ukrainian people, the support of freedom-loving nations from Europe and the Americas to Asia and Africa, and the resolve of the United States and its allies.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 3:\n",
"Question: What did President Zelenskyy say in his speech to the European Parliament?\n",
"Real Answer: \"Light will win over darkness.\"\n",
"Predicted Answer: President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.”\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 4:\n",
"Question: What countries joined the coalition to confront Putin?\n",
"Real Answer: France, Germany, Italy, the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland.\n",
"Predicted Answer: The coalition included members of the European Union, including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 5:\n",
"Question: What measures is the US taking to punish Russia and its oligarchs?\n",
"Real Answer: The US is enforcing powerful economic sanctions, cutting off Russia's largest banks from the international financial system, preventing the Russian Ruble from being defended, choking off Russia's access to technology, and joining with European allies to seize yachts, luxury apartments, and private jets.\n",
"Predicted Answer: The US is enforcing economic sanctions, cutting off Russia's access to international financial systems, preventing Russia's central bank from defending the Ruble and making Putin's \"war fund\" worthless, and targeting the crimes of Russian oligarchs by assembling a dedicated task force to seize their yachts, luxury apartments, and private jets. The US is also closing off American airspace to all Russian flights, providing aid to Ukraine, and mobilizing ground forces, air squadrons, and ship deployments to protect NATO countries.\n",
"Predicted Grade: CORRECT\n",
"\n",
"Example 6:\n",
"Question: What is the total amount of direct assistance being provided to Ukraine?\n",
"Real Answer: $1 Billion\n",
"Predicted Answer: The total amount of direct assistance being provided to Ukraine is $1 billion.\n",
"Predicted Grade: CORRECT\n",
"\n"
]
}
],
"source": [
"for i, eg in enumerate(examples):\n",
" print(f\"Example {i}:\")\n",
" print(\"Question: \" + predictions[i]['query'])\n",
" print(\"Real Answer: \" + predictions[i]['answer'])\n",
" print(\"Predicted Answer: \" + predictions[i]['result'])\n",
" print(\"Predicted Grade: \" + graded_outputs[i]['text'])\n",
" print()"
]
},
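{
"cell_type": "markdown",
"id": "9f3a1c2b",
"metadata": {},
"source": [
"As a quick summary (a convenience sketch, not part of the evaluator's API), we can tally how many predictions were graded `CORRECT`. This assumes each entry in `graded_outputs` stores its grade string under the `text` key, as used in the loop above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d4e5f60",
"metadata": {},
"outputs": [],
"source": [
"# Count grades that are exactly CORRECT after stripping whitespace;\n",
"# a substring check would be wrong, since 'INCORRECT' contains 'CORRECT'.\n",
"correct = sum(g['text'].strip() == 'CORRECT' for g in graded_outputs)\n",
"print(f\"{correct}/{len(graded_outputs)} correct\")"
]
},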
{
"cell_type": "code",
"execution_count": null,
"id": "0bb9bc7e",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}