# TruLens

>[TruLens](https://trulens.org) is an [open-source](https://github.com/truera/trulens) package that provides instrumentation and evaluation tools for large language model (LLM) based applications.

This page covers how to use [TruLens](https://trulens.org) to evaluate and track LLM apps built on LangChain.

## Installation and Setup

Install the `trulens-eval` Python package.

```bash
pip install trulens-eval
```
## Quickstart

See the integration details in the [TruLens documentation](https://www.trulens.org/trulens_eval/getting_started/quickstarts/langchain_quickstart/).

### Tracking

Once you've created your LLM chain, you can use TruLens for evaluation and tracking.
TruLens has a number of [out-of-the-box Feedback Functions](https://www.trulens.org/trulens_eval/evaluation/feedback_functions/),
and is also an extensible framework for LLM evaluation.

Create the feedback functions:

```python
from trulens_eval.feedback import Feedback, Huggingface, OpenAI

# Initialize HuggingFace-based feedback function collection class:
hugs = Huggingface()
openai = OpenAI()

# Define a language match feedback function using HuggingFace.
lang_match = Feedback(hugs.language_match).on_input_output()
# By default this will check language match on the main app input and main app
# output.

# Question/answer relevance between overall question and answer.
qa_relevance = Feedback(openai.relevance).on_input_output()
# By default this will evaluate feedback on main app input and main app output.

# Toxicity of input
toxicity = Feedback(openai.toxicity).on_input()
```
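
Conceptually, a feedback function is just a callable that maps selected parts of a record (here, the main input and output) to a score between 0.0 and 1.0, and `on_input_output` tells TruLens which parts of the record to feed it. The following is a standalone sketch of that idea in plain Python, not the trulens_eval API; the `language_match` heuristic and the record layout are hypothetical, for illustration only:

```python
# Standalone sketch of the feedback-function idea (NOT the trulens_eval API):
# a feedback function maps parts of a record to a score in [0.0, 1.0].

def language_match(prompt: str, response: str) -> float:
    """Toy heuristic: 1.0 if both texts are ASCII-only or both non-ASCII."""
    prompt_ascii = all(ord(c) < 128 for c in prompt)
    response_ascii = all(ord(c) < 128 for c in response)
    return 1.0 if prompt_ascii == response_ascii else 0.0

def on_input_output(fn, record: dict) -> float:
    """Apply a feedback function to a record's main input and main output."""
    return fn(record["main_input"], record["main_output"])

record = {"main_input": "¿qué hora es?", "main_output": "It is 3 PM."}
score = on_input_output(language_match, record)  # mismatch, so 0.0
```

The real providers (`Huggingface`, `OpenAI`) replace the toy heuristic with model-based scoring, but the shape of the contract is the same.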
### Chains

After you've set up Feedback Function(s) for evaluating your LLM app, you can wrap your application with
TruChain to get detailed tracing, logging, and evaluation of your LLM app.

Note: the code that creates the `chain` is in
the [TruLens documentation](https://www.trulens.org/trulens_eval/getting_started/quickstarts/langchain_quickstart/).

```python
from trulens_eval import TruChain

# Wrap your chain with TruChain.
truchain = TruChain(
    chain,
    app_id='Chain1_ChatApplication',
    feedbacks=[lang_match, qa_relevance, toxicity]
)
# Note: any `feedbacks` specified here will be evaluated and logged whenever
# the chain is used.
truchain("que hora es?")
```
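
Under the hood, the wrapper pattern is simple: call the app, run each feedback function on the resulting record, and log everything. A standalone sketch of that pattern in plain Python (NOT the trulens_eval implementation; `Recorder` and `length_ok` are hypothetical names for illustration):

```python
# Standalone sketch of app wrapping (NOT the trulens_eval implementation):
# a recorder calls the app, scores the record with each feedback function,
# and appends the result to a log.

class Recorder:
    def __init__(self, app, app_id, feedbacks):
        self.app = app                # callable: prompt -> response
        self.app_id = app_id
        self.feedbacks = feedbacks    # callables: (prompt, response) -> float
        self.records = []

    def __call__(self, prompt: str) -> str:
        response = self.app(prompt)
        scores = {fn.__name__: fn(prompt, response) for fn in self.feedbacks}
        self.records.append({
            "app_id": self.app_id,
            "input": prompt,
            "output": response,
            "scores": scores,
        })
        return response

def length_ok(prompt: str, response: str) -> float:
    """Toy feedback: 1.0 if the response is non-empty."""
    return 1.0 if response.strip() else 0.0

recorder = Recorder(lambda prompt: prompt.upper(), "demo_app", [length_ok])
answer = recorder("que hora es?")  # app runs normally; record is logged
```

TruChain does the same job for LangChain chains, with real persistence and the richer record schema the dashboard reads.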
### Evaluation

Now you can explore your LLM-based application!

Doing so will help you understand how your LLM application is performing at a glance. As you iterate on new versions of your LLM application, you can compare their performance across all of the different quality metrics you've set up. You'll also be able to view evaluations at a record level and explore the chain metadata for each record.

```python
from trulens_eval import Tru

tru = Tru()
tru.run_dashboard()  # open a Streamlit app to explore
```
For more information on TruLens, visit [trulens.org](https://www.trulens.org/).