Library with LangChain instrumentation to evaluate LLM-based applications.
Welcome to TruLens-Eval!
Evaluate and track your LLM experiments with TruLens. As you work on your models and prompts, TruLens-Eval supports the iterative development of a wide range of LLM applications by wrapping your application to log key metadata across the entire chain (or off chain, if your project does not use chains) on your local machine.
Using feedback functions, you can objectively evaluate the quality of the responses an LLM provides to your requests. Evaluation adds minimal latency because it runs in a sequential call within your application, and the results are logged to your local machine. Finally, we provide an easy-to-use Streamlit dashboard, run locally on your machine, to help you better understand your LLM’s performance.
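To make the idea of wrapping an application and logging locally more concrete, here is a minimal sketch in plain Python. It does not use the TruLens API: `make_recorder`, `echo_app`, and the `records` table are hypothetical names chosen for illustration, and the SQLite storage is an assumption about what "logged to your local machine" could look like.

```python
import json
import sqlite3
import time
from typing import Callable


def make_recorder(
    app: Callable[[str], str], conn: sqlite3.Connection
) -> Callable[[str], str]:
    """Wrap a text-to-text app so every call is logged to a local SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (prompt TEXT, response TEXT, meta TEXT)"
    )

    def wrapped(prompt: str) -> str:
        start = time.perf_counter()
        response = app(prompt)
        # Store simple usage metadata next to the prompt and response.
        meta = {"latency_s": time.perf_counter() - start}
        conn.execute(
            "INSERT INTO records VALUES (?, ?, ?)",
            (prompt, response, json.dumps(meta)),
        )
        conn.commit()
        return response

    return wrapped


def echo_app(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call, used only for illustration."""
    return f"You asked: {prompt}"


# An in-memory database keeps the sketch self-contained;
# passing a file path instead would persist records on disk.
conn = sqlite3.connect(":memory:")
recorded_app = make_recorder(echo_app, conn)
recorded_app("What is TruLens?")
rows = conn.execute("SELECT prompt, response FROM records").fetchall()
```

The wrapper leaves the application's return value unchanged, so existing callers do not need to know that recording is taking place.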
Value Propositions
TruLens-Eval has two key value propositions:
- Evaluation:
  - TruLens supports the evaluation of inputs, outputs and internals of your LLM application using any model (including LLMs).
  - A number of feedback functions for evaluation are implemented out-of-the-box such as groundedness, relevance and toxicity. The framework is also easily extensible for custom evaluation requirements.
- Tracking:
  - TruLens contains instrumentation for any LLM application including question answering, retrieval-augmented generation, agent-based applications and more. This instrumentation allows for the tracking of a wide variety of usage metrics and metadata. Read more in the instrumentation overview.
  - TruLens' instrumentation can be applied to any LLM application without being tied down to a given framework. Additionally, deep integrations with LangChain and Llama-Index allow the capture of internal metadata and text.
  - Anything that is tracked by the instrumentation can be evaluated!
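As a rough illustration of what a custom evaluation might look like, the sketch below scores a prompt/response pair with two toy feedback functions. These are not the built-in groundedness, relevance, or toxicity implementations, and none of the names (`keyword_relevance`, `is_concise`, `evaluate`) come from the TruLens API; they are assumptions made for the example.

```python
from typing import Callable, Dict

# A feedback function maps (prompt, response) to a score in [0, 1].
FeedbackFn = Callable[[str, str], float]


def keyword_relevance(prompt: str, response: str) -> float:
    """Fraction of distinct prompt words that also appear in the response."""
    prompt_words = set(prompt.lower().split())
    response_words = set(response.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & response_words) / len(prompt_words)


def is_concise(prompt: str, response: str) -> float:
    """Return 1.0 if the response is at most 200 characters, else 0.0."""
    return 1.0 if len(response) <= 200 else 0.0


def evaluate(
    prompt: str, response: str, feedbacks: Dict[str, FeedbackFn]
) -> Dict[str, float]:
    """Run every feedback function on one tracked prompt/response pair."""
    return {name: fn(prompt, response) for name, fn in feedbacks.items()}


scores = evaluate(
    "what is trulens",
    "trulens is an evaluation library",
    {"keyword_relevance": keyword_relevance, "is_concise": is_concise},
)
```

A word-overlap score is a deliberately crude proxy; in practice a feedback function could call any model, including an LLM, as described above.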
The process for building your evaluated and tracked LLM application with TruLens is below 👇
Installation and Setup
Install the trulens-eval pip package from PyPI.
pip install trulens-eval
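To confirm that the package landed in the active environment, one option is to query the installed distribution metadata from Python. This is a small sketch that relies only on the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("trulens-eval")
print(version if version else "trulens-eval is not installed")
```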
Setting Keys
All of the quickstarts require OpenAI and Hugging Face keys. You can add them by setting environment variables:
import os
os.environ["OPENAI_API_KEY"] = "..."
os.environ["HUGGINGFACE_API_KEY"] = "..."
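Before running a quickstart, it can help to check that both keys are actually present. The following is a minimal sketch using only the standard library; `REQUIRED_KEYS` and `missing_keys` are hypothetical names, and the check treats an empty string the same as a missing variable.

```python
import os
from typing import Iterable, List, Mapping

# The two environment variables named in this section.
REQUIRED_KEYS = ("OPENAI_API_KEY", "HUGGINGFACE_API_KEY")


def missing_keys(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


missing = missing_keys(os.environ)
if missing:
    print("Missing keys:", ", ".join(missing))
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, makes the check easy to exercise with a plain dictionary.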
Quick Usage
TruLens supports evaluation and tracking for any LLM app framework. Choose a framework below to get started:
LangChain
Llama-Index
Custom Text to Text Apps
💡 Contributing
Interested in contributing? See our contribution guide for more details.
Built Distribution
Hashes for trulens_eval-0.18.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | dd497e428c00910352eeb1642e2957a52d22fbfc3e025a1fdad9e7f0c4817ac4
MD5 | a310e17df867cc584c2c25ee0dd03b9b
BLAKE2b-256 | 1de3de866fd524dfd09d88ebbb266f6ef2a15c51cf09bf6af8d8dd61c5f959af