mirror of https://github.com/hwchase17/langchain.git
synced 2026-02-08 10:09:46 +00:00

Compare commits: 3 commits (langchain=... rlm/rag_ev)

| Author | SHA1 | Date |
|---|---|---|
| | cc51af26b6 | |
| | 16152b3cdd | |
| | c152fc5733 | |

docs/docs/guides/productionization/evaluation/examples/rag.ipynb (new file, 559 lines)
@@ -0,0 +1,559 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "2e7db2b1-8f9c-46bd-9c50-b6cfb0a38a22",
"metadata": {},
"source": [
"# RAG Evaluation\n",
"[Open In Colab](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/guides/evaluation/examples/rag.ipynb)\n",
"\n",
"RAG (Retrieval Augmented Generation) is one of the most popular LLM applications.\n",
"\n",
"For an in-depth review, see our RAG series of notebooks and videos [here](https://github.com/langchain-ai/rag-from-scratch).\n",
"\n",
"## Types of RAG eval\n",
"\n",
"There are at least 4 types of RAG eval that users are typically interested in:\n",
"\n",
"\n",
"\n",
"\n",
"Each of these evals has something in common: it compares text (e.g., a generated answer vs. a reference answer).\n",
"\n",
"We can use various built-in `LangChainStringEvaluator` types for this (see [here](https://docs.smith.langchain.com/evaluation/faq/evaluator-implementations#overview)).\n",
"\n",
"All `LangChainStringEvaluator` implementations can accept 3 inputs:\n",
"\n",
"```\n",
"prediction: The prediction string.\n",
"reference: The reference string.\n",
"input: The input string.\n",
"```\n",
"\n",
"Below, we will use this interface to perform each type of eval.\n",
"\n",
"## RAG Chain \n",
"\n",
"To start, we build a RAG chain. "
]
},
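{
"cell_type": "markdown",
"id": "a1b2c3d4-0000-4000-8000-0def12345678",
"metadata": {},
"source": [
"As a quick illustration of that shared interface, here is a minimal sketch (not part of the eval flow below): the strings are made up, and `load_evaluator(\"qa\")` assumes an OpenAI key is configured for the default judge LLM."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2c3d4e5-0000-4000-8000-1ef123456789",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch of the shared prediction / reference / input interface\n",
"from langchain.evaluation import load_evaluator\n",
"\n",
"qa_evaluator_demo = load_evaluator(\"qa\")  # LLM-as-judge grader\n",
"qa_evaluator_demo.evaluate_strings(\n",
"    prediction=\"LCEL is a declarative way to compose chains.\",  # hypothetical model output\n",
"    reference=\"LangChain Expression Language (LCEL) composes runnables declaratively.\",  # hypothetical ground truth\n",
"    input=\"What is LCEL?\",\n",
")"
]
},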
{
"cell_type": "code",
"execution_count": null,
"id": "d809e9a0-44bc-4e9f-8eee-732ef077538c",
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain langchain-community langchain-openai langsmith chromadb tiktoken beautifulsoup4"
]
},
{
"cell_type": "markdown",
"id": "760cab79-2d5e-4324-ba4a-54b6f4094cb0",
"metadata": {},
"source": [
"We build an `index` using a set of LangChain docs."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6f7c0017-f4dd-4071-aa48-40957ffb4e9d",
"metadata": {},
"outputs": [],
"source": [
"### INDEX\n",
"\n",
"from bs4 import BeautifulSoup as Soup\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.document_loaders.recursive_url_loader import RecursiveUrlLoader\n",
"\n",
"# Load\n",
"url = \"https://python.langchain.com/docs/expression_language/\"\n",
"loader = RecursiveUrlLoader(url=url, max_depth=20, extractor=lambda x: Soup(x, \"html.parser\").text)\n",
"docs = loader.load()\n",
"\n",
"# Split\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
"splits = text_splitter.split_documents(docs)\n",
"\n",
"# Embed\n",
"vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())\n",
"\n",
"# Index\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "c365fb82-78a6-40b6-bd59-daaa1e79d6c8",
"metadata": {},
"source": [
"Next, we build a `RAG chain` that returns an `answer` and the retrieved documents as `contexts`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "68e249d7-bc6c-4631-b099-6daaeeddf38a",
"metadata": {},
"outputs": [],
"source": [
"### RAG \n",
"\n",
"import openai\n",
"from langsmith import traceable\n",
"from langsmith.wrappers import wrap_openai\n",
"\n",
"class RagBot:\n",
"    def __init__(self, retriever, model: str = \"gpt-4-turbo-preview\"):\n",
"        self._retriever = retriever\n",
"        # Wrapping the client instruments the LLM\n",
"        self._client = wrap_openai(openai.Client())\n",
"        self._model = model\n",
"\n",
"    @traceable\n",
"    def get_answer(self, question: str):\n",
"        similar = self._retriever.invoke(question)\n",
"        response = self._client.chat.completions.create(\n",
"            model=self._model,\n",
"            messages=[\n",
"                {\n",
"                    \"role\": \"system\",\n",
"                    \"content\": \"You are a helpful AI assistant.\"\n",
"                    \" Use the following docs to help answer the user's question.\\n\\n\"\n",
"                    f\"## Docs\\n\\n{similar}\",\n",
"                },\n",
"                {\"role\": \"user\", \"content\": question},\n",
"            ],\n",
"        )\n",
"\n",
"        # Evaluators will expect \"answer\" and \"contexts\"\n",
"        return {\n",
"            \"answer\": response.choices[0].message.content,\n",
"            \"contexts\": [str(doc) for doc in similar],\n",
"        }\n",
"\n",
"rag_bot = RagBot(retriever)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "6101d155-a1ab-460c-8c3e-f1f44e09a8b7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'LangChain Expression Language (LCEL) is a declarative language that simplifies the composition of chains for working with language models and related '"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = rag_bot.get_answer(\"What is LCEL?\")\n",
"response[\"answer\"][:150]"
]
},
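{
"cell_type": "markdown",
"id": "c3d4e5f6-0000-4000-8000-2f1234567890",
"metadata": {},
"source": [
"Optionally, we can also peek at the retrieved `contexts` that the evaluators below will consume (a minimal sketch; output not shown)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4e5f6a7-0000-4000-8000-3f2345678901",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: inspect the document strings that get_answer() returns as \"contexts\"\n",
"response = rag_bot.get_answer(\"What is LCEL?\")\n",
"print(len(response[\"contexts\"]), \"contexts retrieved\")\n",
"print(response[\"contexts\"][0][:300])"
]
},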
{
"cell_type": "markdown",
"id": "432e8ec7-a085-4224-ad38-0087e1d553f1",
"metadata": {},
"source": [
"## RAG Dataset \n",
"\n",
"Next, we build a dataset of QA pairs based upon the [documentation](https://python.langchain.com/docs/expression_language/) that we indexed."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "22f0daeb-6a61-4f8d-a4fc-4c7d22b6dc61",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ['LANGCHAIN_TRACING_V2'] = 'true'\n",
"os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'\n",
"os.environ['LANGCHAIN_API_KEY'] = '<your-api-key>'"
]
},
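{
"cell_type": "markdown",
"id": "e5f6a7b8-0000-4000-8000-4f3456789012",
"metadata": {},
"source": [
"The RAG bot and the LLM-as-judge evaluators below also call OpenAI, so an OpenAI key needs to be available as well. A minimal sketch using `getpass` (adapt to your own secret management):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6a7b8c9-0000-4000-8000-5f4567890123",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"# Prompt for the key rather than hard-coding it in the notebook\n",
"if not os.environ.get(\"OPENAI_API_KEY\"):\n",
"    os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API key: \")"
]
},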
{
"cell_type": "code",
"execution_count": 4,
"id": "0f29304f-d79b-40e9-988a-343732102af9",
"metadata": {},
"outputs": [],
"source": [
"from langsmith import Client\n",
"\n",
"# QA\n",
"inputs = [\n",
"    \"How can I directly pass a string to a runnable and use it to construct the input needed for my prompt?\",\n",
"    \"How can I make the output of my LCEL chain a string?\",\n",
"    \"How can I apply a custom function to one of the inputs of an LCEL chain?\"\n",
"]\n",
"\n",
"outputs = [\n",
"    \"Use RunnablePassthrough. from langchain_core.runnables import RunnableParallel, RunnablePassthrough; from langchain_core.prompts import ChatPromptTemplate; from langchain_openai import ChatOpenAI; prompt = ChatPromptTemplate.from_template('Tell a joke about: {input}'); model = ChatOpenAI(); runnable = ({'input' : RunnablePassthrough()} | prompt | model); runnable.invoke('flowers')\",\n",
"    \"Use StrOutputParser. from langchain_openai import ChatOpenAI; from langchain_core.prompts import ChatPromptTemplate; from langchain_core.output_parsers import StrOutputParser; prompt = ChatPromptTemplate.from_template('Tell me a short joke about {topic}'); model = ChatOpenAI(model='gpt-3.5-turbo') #gpt-4 or other LLMs can be used here; output_parser = StrOutputParser(); chain = prompt | model | output_parser\",\n",
"    \"Use RunnableLambda with itemgetter to extract the relevant key. from operator import itemgetter; from langchain_core.prompts import ChatPromptTemplate; from langchain_core.runnables import RunnableLambda; from langchain_openai import ChatOpenAI; def length_function(text): return len(text); chain = ({'prompt_input': itemgetter('foo') | RunnableLambda(length_function),} | prompt | model); chain.invoke({'foo':'hello world'})\"\n",
"]\n",
"\n",
"qa_pairs = [{\"question\": q, \"answer\": a} for q, a in zip(inputs, outputs)]\n",
"\n",
"# Create dataset\n",
"client = Client()\n",
"dataset_name = \"RAG_test_LCEL\"\n",
"dataset = client.create_dataset(\n",
"    dataset_name=dataset_name,\n",
"    description=\"QA pairs about LCEL.\",\n",
")\n",
"client.create_examples(\n",
"    inputs=[{\"question\": q} for q in inputs],\n",
"    outputs=[{\"answer\": a} for a in outputs],\n",
"    dataset_id=dataset.id,\n",
")"
]
},
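{
"cell_type": "markdown",
"id": "a7b8c9d0-0000-4000-8000-6f5678901234",
"metadata": {},
"source": [
"To double-check what was uploaded, the examples can be listed back from the client (a minimal sketch; assumes the dataset above was created successfully)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8c9d0e1-0000-4000-8000-7f6789012345",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: list the questions stored in the dataset we just created\n",
"for example in client.list_examples(dataset_name=dataset_name):\n",
"    print(example.inputs[\"question\"])"
]
},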
{
"cell_type": "markdown",
"id": "92cf3a0f-621f-468d-818d-a6f2d4b53823",
"metadata": {},
"source": [
"## RAG Evaluators\n",
"\n",
"### Type 1: Reference Answer\n",
"\n",
"First, let's consider the case in which we want to compare our RAG chain answer to a reference answer.\n",
"\n",
"This is shown on the far right (blue) in the top figure.\n",
"\n",
"#### Eval flow\n",
"\n",
"We will use a `LangChainStringEvaluator`, as mentioned above.\n",
"\n",
"For comparing questions and answers, common built-in `LangChainStringEvaluator` options are `QA` and `CoT_QA` (see the [list of evaluators](https://docs.smith.langchain.com/evaluation/faq/evaluator-implementations)).\n",
"\n",
"We will use `CoT_QA` as an LLM-as-judge evaluator, which uses the eval prompt defined [here](https://smith.langchain.com/hub/langchain-ai/cot_qa).\n",
"\n",
"All `LangChainStringEvaluator` implementations expose a common interface for passing your inputs:\n",
"\n",
"1. `question` from the dataset -> `input` \n",
"2. `answer` from the dataset -> `reference` \n",
"3. `answer` from the LLM -> `prediction` \n",
"\n",
""
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "1cbe0b4a-2a30-4f40-b3aa-5cc67c6a7802",
"metadata": {},
"outputs": [],
"source": [
"# RAG chain\n",
"def predict_rag_answer(example: dict):\n",
"    \"\"\"Use this for answer evaluation\"\"\"\n",
"    response = rag_bot.get_answer(example[\"question\"])\n",
"    return {\"answer\": response[\"answer\"]}\n",
"\n",
"def predict_rag_answer_with_context(example: dict):\n",
"    \"\"\"Use this for evaluation of retrieved documents and hallucinations\"\"\"\n",
"    response = rag_bot.get_answer(example[\"question\"])\n",
"    return {\"answer\": response[\"answer\"], \"contexts\": response[\"contexts\"]}"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a7a3827d-a92f-4a7a-a572-5123fbd9c334",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"View the evaluation results for experiment: 'rag-qa-oai-e8604ab3' at:\n",
"https://smith.langchain.com/o/1fa8b1f4-fcb9-4072-9aa9-983e35ad61b8/datasets/368734fb-7c14-4e1f-b91a-50d52cb58a07/compare?selectedSessions=a176a91c-a5f0-42ab-b2f4-fedaa1cbf17d\n",
"\n",
"\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "e459fbab745f4ce4bb399609910a807f",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"0it [00:00, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from langsmith.evaluation import LangChainStringEvaluator, evaluate\n",
"\n",
"# Evaluator: grade the RAG answer against the dataset's reference answer\n",
"qa_evaluator = [\n",
"    LangChainStringEvaluator(\n",
"        \"cot_qa\",\n",
"        prepare_data=lambda run, example: {\n",
"            \"prediction\": run.outputs[\"answer\"],\n",
"            \"reference\": example.outputs[\"answer\"],\n",
"            \"input\": example.inputs[\"question\"],\n",
"        },\n",
"    )\n",
"]\n",
"\n",
"dataset_name = \"RAG_test_LCEL\"\n",
"experiment_results = evaluate(\n",
"    predict_rag_answer,\n",
"    data=dataset_name,\n",
"    evaluators=qa_evaluator,\n",
"    experiment_prefix=\"rag-qa-oai\",\n",
"    metadata={\"variant\": \"LCEL context, gpt-4-turbo-preview\"},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "60ba4123-c691-4aa0-ba76-e567e8aaf09f",
"metadata": {},
"source": [
"### Type 2: Answer Hallucination\n",
"\n",
"Second, let's consider the case in which we want to compare our RAG chain answer to the retrieved documents.\n",
"\n",
"This is shown in red in the top figure.\n",
"\n",
"#### Eval flow\n",
"\n",
"We will use a `LangChainStringEvaluator`, as mentioned above.\n",
"\n",
"For comparing answers to the retrieved documents, common built-in `LangChainStringEvaluator` options are the [`Criteria` evaluators](https://python.langchain.com/docs/guides/productionization/evaluation/string/criteria_eval_chain/#using-reference-labels), because we want to supply custom criteria.\n",
"\n",
"We will use `labeled_score_string` as an LLM-as-judge evaluator, which uses the eval prompt defined [here](https://smith.langchain.com/hub/wfh/labeled-score-string).\n",
"\n",
"Here, the two key mappings in the `LangChainStringEvaluator` interface are:\n",
"\n",
"1. `contexts` from the RAG chain -> `reference` \n",
"2. `answer` from the RAG chain -> `prediction` \n",
"\n",
""
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "7f0872a5-e989-415d-9fed-5846efaa9488",
"metadata": {},
"outputs": [],
"source": [
"from langsmith.evaluation import LangChainStringEvaluator, evaluate\n",
"\n",
"answer_hallucination_evaluator = LangChainStringEvaluator(\n",
"    \"labeled_score_string\",\n",
"    config={\n",
"        \"criteria\": {\n",
"            \"accuracy\": \"\"\"Is the Assistant's Answer grounded in the Ground Truth documentation? A score of 0 means that the\n",
"            Assistant's answer is not at all grounded in the Ground Truth documentation. A score of 5 means that the\n",
"            Assistant's answer contains some information (e.g., a hallucination) that is not captured in the Ground Truth\n",
"            documentation. A score of 10 means that the Assistant's answer is fully grounded in the Ground Truth documentation.\"\"\"\n",
"        },\n",
"        # If you want the score to be saved on a scale from 0 to 1\n",
"        \"normalize_by\": 10,\n",
"    },\n",
"    prepare_data=lambda run, example: {\n",
"        \"prediction\": run.outputs[\"answer\"],\n",
"        \"reference\": run.outputs[\"contexts\"],\n",
"        \"input\": example.inputs[\"question\"],\n",
"    },\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "6d5bf61b-3903-4cde-9ecf-67f0e0874521",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"View the evaluation results for experiment: 'rag-qa-oai-hallucination-fad2e13c' at:\n",
"https://smith.langchain.com/o/1fa8b1f4-fcb9-4072-9aa9-983e35ad61b8/datasets/368734fb-7c14-4e1f-b91a-50d52cb58a07/compare?selectedSessions=9a1e9e7d-cf87-4b89-baf6-f5498a160627\n",
"\n",
"\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "891904d8d44444e98c6a03faa43e147a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"0it [00:00, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dataset_name = \"RAG_test_LCEL\"\n",
"\n",
"experiment_results = evaluate(\n",
"    predict_rag_answer_with_context,\n",
"    data=dataset_name,\n",
"    evaluators=[answer_hallucination_evaluator],\n",
"    experiment_prefix=\"rag-qa-oai-hallucination\",\n",
"    # Any experiment metadata can be specified here\n",
"    metadata={\n",
"        \"variant\": \"LCEL context, gpt-4-turbo-preview\",\n",
"    },\n",
")"
]
},
{
"cell_type": "markdown",
"id": "480a27cb-1a31-4194-b160-8cdcfbf24eea",
"metadata": {},
"source": [
"### Type 3: Document Relevance to Question\n",
"\n",
"Finally, let's consider the case in which we want to compare our RAG chain document retrieval to the question.\n",
"\n",
"This is shown in green in the top figure.\n",
"\n",
"#### Eval flow\n",
"\n",
"We will use a `LangChainStringEvaluator`, as mentioned above.\n",
"\n",
"For comparing retrieved documents to the question, common built-in `LangChainStringEvaluator` options are the [`Criteria` evaluators](https://python.langchain.com/docs/guides/productionization/evaluation/string/criteria_eval_chain/#using-reference-labels), because we want to supply custom criteria.\n",
"\n",
"We will use `labeled_score_string` as an LLM-as-judge evaluator, which uses the eval prompt defined [here](https://smith.langchain.com/hub/wfh/labeled-score-string).\n",
"\n",
"Here, the two key mappings in the `LangChainStringEvaluator` interface are:\n",
"\n",
"1. `question` from the dataset -> `reference` \n",
"2. `contexts` from the RAG chain -> `prediction` \n",
"\n",
""
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "df247034-14ed-40b1-b313-b0fef7286546",
"metadata": {},
"outputs": [],
"source": [
"from langsmith.evaluation import LangChainStringEvaluator, evaluate\n",
"\n",
"docs_relevance_evaluator = LangChainStringEvaluator(\n",
"    \"labeled_score_string\",\n",
"    config={\n",
"        \"criteria\": {\n",
"            \"accuracy\": \"\"\"The Assistant's Answer is a set of documents retrieved from a vectorstore. The Ground Truth is a question\n",
"            used for retrieval. You will score whether the Assistant's Answer (retrieved docs) is relevant to the Ground Truth\n",
"            question. A score of 0 means that the Assistant's answer contains documents that are not at all relevant to the\n",
"            Ground Truth question. A score of 5 means that the Assistant's answer contains some documents that are relevant to the\n",
"            Ground Truth question. A score of 10 means that all of the Assistant's answer documents are relevant to the Ground Truth question.\"\"\"\n",
"        },\n",
"        # If you want the score to be saved on a scale from 0 to 1\n",
"        \"normalize_by\": 10,\n",
"    },\n",
"    prepare_data=lambda run, example: {\n",
"        \"prediction\": run.outputs[\"contexts\"],\n",
"        \"reference\": example.inputs[\"question\"],\n",
"        \"input\": example.inputs[\"question\"],\n",
"    },\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "cfe988dc-2aaa-42f4-93ff-c3c9fe6b3124",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"View the evaluation results for experiment: 'rag-qa-oai-doc-relevance-82244196' at:\n",
"https://smith.langchain.com/o/1fa8b1f4-fcb9-4072-9aa9-983e35ad61b8/datasets/368734fb-7c14-4e1f-b91a-50d52cb58a07/compare?selectedSessions=3bbf09c9-69de-47ba-9d3c-7bcedf5cd48f\n",
"\n",
"\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "4e4091f1053b4d34871aa87428297e12",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"0it [00:00, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"experiment_results = evaluate(\n",
"    predict_rag_answer_with_context,\n",
"    data=dataset_name,\n",
"    evaluators=[docs_relevance_evaluator],\n",
"    experiment_prefix=\"rag-qa-oai-doc-relevance\",\n",
"    # Any experiment metadata can be specified here\n",
"    metadata={\n",
"        \"variant\": \"LCEL context, gpt-4-turbo-preview\",\n",
"    },\n",
")"
]
},
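{
"cell_type": "markdown",
"id": "c9d0e1f2-0000-4000-8000-8f7890123456",
"metadata": {},
"source": [
"If none of the built-in `LangChainStringEvaluator` types fit, `evaluate` also accepts plain Python functions as custom evaluators. Below is a minimal sketch (not used in the experiments above); the `contexts_nonempty` name and the heuristic it applies are illustrative only."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0e1f2a3-0000-4000-8000-9f8901234567",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: a custom (non-LLM) evaluator is just a function over (run, example)\n",
"from langsmith.schemas import Example, Run\n",
"\n",
"def contexts_nonempty(run: Run, example: Example) -> dict:\n",
"    \"\"\"Cheap sanity check: did the RAG chain retrieve any contexts at all?\"\"\"\n",
"    contexts = run.outputs.get(\"contexts\", [])\n",
"    return {\"key\": \"contexts_nonempty\", \"score\": int(len(contexts) > 0)}\n",
"\n",
"# It can be passed alongside the LangChainStringEvaluator-based evaluators, e.g.:\n",
"# evaluate(\n",
"#     predict_rag_answer_with_context,\n",
"#     data=dataset_name,\n",
"#     evaluators=[contexts_nonempty],\n",
"#     experiment_prefix=\"rag-qa-oai-custom-check\",\n",
"# )"
]
},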
{
"cell_type": "code",
"execution_count": null,
"id": "c2f09b6e-667a-47fe-b3f9-8634783f7666",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
BIN docs/static/img/langsmith_rag_eval.png (vendored, new file). Binary file not shown. After: 183 KiB
BIN docs/static/img/langsmith_rag_flow.png (vendored, new file). Binary file not shown. After: 148 KiB
BIN docs/static/img/langsmith_rag_flow_doc_relevance.png (vendored, new file). Binary file not shown. After: 121 KiB
BIN docs/static/img/langsmith_rag_flow_hallucination.png (vendored, new file). Binary file not shown. After: 121 KiB