docs: templates updated titles (#25646)

Updated titles into a consistent format. 
Fixed links to the diagrams.
Fixed typos.
Note: The Templates menu in the navbar is now sorted by the file names.
I'll try sorting the navbar menus by the page titles, not the page file
names.
This commit is contained in:
Leonid Ganeline
2024-08-23 01:19:38 -07:00
committed by GitHub
parent 1b2ae40d45
commit 163ef35dd1
106 changed files with 366 additions and 344 deletions

View File

@@ -1,19 +1,20 @@
# Chat Bot Feedback Template
# Chatbot feedback
This template shows how to evaluate your chat bot without explicit user feedback. It defines a simple chat bot in [chain.py](https://github.com/langchain-ai/langchain/blob/master/templates/chat-bot-feedback/chat_bot_feedback/chain.py) and custom evaluator that scores bot response effectiveness based on the subsequent user response. You can apply this run evaluator to your own chat bot by calling `with_config` on the chat bot before serving. You can also directly deploy your chat app using this template.
This template shows how to evaluate your chatbot without explicit user feedback.
It defines a simple chatbot in [chain.py](https://github.com/langchain-ai/langchain/blob/master/templates/chat-bot-feedback/chat_bot_feedback/chain.py) and custom evaluator that scores bot response effectiveness based on the subsequent user response. You can apply this run evaluator to your own chat bot by calling `with_config` on the chat bot before serving. You can also directly deploy your chat app using this template.
[Chat bots](https://python.langchain.com/docs/use_cases/chatbots) are one of the most common interfaces for deploying LLMs. The quality of chat bots varies, making continuous development important. But users are wont to leave explicit feedback through mechanisms like thumbs-up or thumbs-down buttons. Furthermore, traditional analytics such as "session length" or "conversation length" often lack clarity. However, multi-turn conversations with a chat bot can provide a wealth of information, which we can transform into metrics for fine-tuning, evaluation, and product analytics.
[Chatbots](https://python.langchain.com/docs/use_cases/chatbots) are one of the most common interfaces for deploying LLMs. The quality of chat bots varies, making continuous development important. But users are wont to leave explicit feedback through mechanisms like thumbs-up or thumbs-down buttons. Furthermore, traditional analytics such as "session length" or "conversation length" often lack clarity. However, multi-turn conversations with a chat bot can provide a wealth of information, which we can transform into metrics for fine-tuning, evaluation, and product analytics.
Taking [Chat Langchain](https://chat.langchain.com/) as a case study, only about 0.04% of all queries receive explicit feedback. Yet, approximately 70% of the queries are follow-ups to previous questions. A significant portion of these follow-up queries continue useful information we can use to infer the quality of the previous AI response.
This template helps solve this "feedback scarcity" problem. Below is an example invocation of this chat bot:
[![Screenshot of a chat bot interaction where the AI responds in a pirate accent to a user asking where their keys are.](./static/chat_interaction.png "Chat Bot Interaction Example")](https://smith.langchain.com/public/3378daea-133c-4fe8-b4da-0a3044c5dbe8/r?runtab=1)
![Screenshot of a chat bot interaction where the AI responds in a pirate accent to a user asking where their keys are.](./static/chat_interaction.png)["Chat Bot Interaction Example"](https://smith.langchain.com/public/3378daea-133c-4fe8-b4da-0a3044c5dbe8/r?runtab=1)
When the user responds to this ([link](https://smith.langchain.com/public/a7e2df54-4194-455d-9978-cecd8be0df1e/r)), the response evaluator is invoked, resulting in the following evaluationrun:
When the user responds to this ([link](https://smith.langchain.com/public/a7e2df54-4194-455d-9978-cecd8be0df1e/r)), the response evaluator is invoked, resulting in the following evaluation run:
[![Screenshot of an evaluator run showing the AI's response effectiveness score based on the user's follow-up message expressing frustration.](./static/evaluator.png "Chat Bot Evaluator Run")](https://smith.langchain.com/public/534184ee-db8f-4831-a386-3f578145114c/r)
![Screenshot of an evaluator run showing the AI's response effectiveness score based on the user's follow-up message expressing frustration.](./static/evaluator.png) ["Chat Bot Evaluator Run"](https://smith.langchain.com/public/534184ee-db8f-4831-a386-3f578145114c/r)
As shown, the evaluator sees that the user is increasingly frustrated, indicating that the prior response was not effective