Library with LangChain instrumentation to evaluate LLM-based applications.

Reason this release was yanked:

Unstable release

Project description

Welcome to TruLens-Eval!

Evaluate and track your LLM experiments with TruLens. As you work on your models and prompts, TruLens-Eval supports the iterative development of a wide range of LLM applications by wrapping your application to log key metadata across the entire chain (or off chain if your project does not use chains) on your local machine.

Using feedback functions, you can objectively evaluate the quality of the responses an LLM gives to your requests. Evaluation adds minimal latency because it runs in a sequential call for your application, and the results are logged to your local machine. Finally, we provide an easy-to-use Streamlit dashboard that runs locally on your machine so you can better understand your LLM's performance.
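
To make the wrap, log, and evaluate flow above concrete, here is a minimal sketch in plain Python. It is not the trulens-eval API: `LoggedApp`, `echo_app`, `non_empty`, and the `records` table are hypothetical names chosen for illustration, and SQLite stands in for whatever local storage the library actually uses.

```python
import json
import sqlite3
import time
from typing import Callable, Dict

# Hypothetical type: a feedback function maps (prompt, response) to a score.
FeedbackFn = Callable[[str, str], float]


class LoggedApp:
    """Wraps a text-to-text callable and logs every call to a local SQLite database."""

    def __init__(
        self,
        app: Callable[[str], str],
        feedbacks: Dict[str, FeedbackFn],
        db_path: str = ":memory:",
    ) -> None:
        self.app = app
        self.feedbacks = feedbacks
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS records "
            "(prompt TEXT, response TEXT, latency_s REAL, scores TEXT)"
        )

    def __call__(self, prompt: str) -> str:
        start = time.perf_counter()
        response = self.app(prompt)
        latency = time.perf_counter() - start
        # Feedback functions run one after another, once the app has responded.
        scores = {name: fn(prompt, response) for name, fn in self.feedbacks.items()}
        self.db.execute(
            "INSERT INTO records VALUES (?, ?, ?, ?)",
            (prompt, response, latency, json.dumps(scores)),
        )
        self.db.commit()
        return response


def echo_app(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    return prompt.upper()


def non_empty(prompt: str, response: str) -> float:
    """Toy feedback: 1.0 if the response contains any non-whitespace text, else 0.0."""
    return 1.0 if response.strip() else 0.0


logged = LoggedApp(echo_app, {"non_empty": non_empty})
answer = logged("hello")
rows = logged.db.execute("SELECT prompt, response, scores FROM records").fetchall()
```

Passing a file path as `db_path` instead of `":memory:"` would keep the records on disk between runs, which is closer to the local logging described above.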

Value Propositions

TruLens-Eval has two key value propositions:

  1. Evaluation:
    • TruLens supports the evaluation of inputs, outputs and internals of your LLM application using any model (including LLMs).
    • A number of feedback functions for evaluation are implemented out-of-the-box such as groundedness, relevance and toxicity. The framework is also easily extensible for custom evaluation requirements.
  2. Tracking:
    • TruLens contains instrumentation for any LLM application including question answering, retrieval-augmented generation, agent-based applications and more. This instrumentation allows for the tracking of a wide variety of usage metrics and metadata. Read more in the instrumentation overview.
    • TruLens' instrumentation can be applied to any LLM application without being tied down to a given framework. Additionally, deep integrations with LangChain and Llama-Index allow the capture of internal metadata and text.
    • Anything that is tracked by the instrumentation can be evaluated!
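
As an illustration of a custom evaluation requirement, the sketch below assumes a feedback function can be modeled as a plain callable that takes the prompt and the response and returns a score between 0 and 1. `keyword_coverage` is a hypothetical example, not one of the built-in feedback functions, and how such a function is registered with TruLens is not shown here.

```python
def keyword_coverage(prompt: str, response: str) -> float:
    """Fraction of distinct prompt words longer than 3 letters that reappear in the response."""

    def words(text: str) -> set:
        # Lowercase and strip common trailing/leading punctuation from each token.
        return {w.strip(".,!?;:").lower() for w in text.split()}

    keywords = {w for w in words(prompt) if len(w) > 3}
    if not keywords:
        # Nothing to measure against; treat as no coverage.
        return 0.0
    return len(keywords & words(response)) / len(keywords)


score = keyword_coverage(
    "What is the capital of France?",
    "The capital of France is Paris.",
)
```

A word-overlap score like this is far cruder than model-based checks such as groundedness or relevance; it only shows the shape such a function might take.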

The process for building your evaluated and tracked LLM application with TruLens is below 👇

Architecture Diagram

Installation and Setup

Install the trulens-eval pip package from PyPI.

    pip install trulens-eval
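
Because this release is marked as yanked, pip normally skips it unless the version is pinned exactly, so it can be useful to confirm which version ended up installed. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("trulens-eval")
print(found if found else "trulens-eval is not installed")
```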

Setting Keys

Each of the quickstarts requires OpenAI and Hugging Face API keys. You can provide them by setting environment variables:

    import os
    os.environ["OPENAI_API_KEY"] = "..."
    os.environ["HUGGINGFACE_API_KEY"] = "..."
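
If you would rather set the keys in your shell than in code, a quick check before running a quickstart can catch a missing key early. This is a small sketch using only the two variable names shown above; `missing_keys` is a hypothetical helper, not part of trulens-eval.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ("OPENAI_API_KEY", "HUGGINGFACE_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required key names that are unset or empty in the given environment."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```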

Quick Usage

TruLens supports evaluation and tracking for any LLM app framework. Choose a framework below to get started:

Langchain

langchain_quickstart.ipynb

langchain_quickstart.py

Llama-Index

llama_index_quickstart.ipynb

llama_index_quickstart.py

No Framework

text2text_quickstart.ipynb

text2text_quickstart.py

💡 Contributing

Interested in contributing? See our contribution guide for more details.


Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

trulens_eval-0.15.0-py3-none-any.whl (613.6 kB)

Uploaded Python 3
