
Open-source tools to analyze, monitor, and debug machine learning models in production.


Evidently

An open-source framework to evaluate, test, and monitor ML and LLM-powered systems.


Documentation | Discord Community | Blog | Twitter | Evidently Cloud

:bar_chart: What is Evidently?

Evidently is an open-source Python library to evaluate, test, and monitor ML and LLM systems—from experiments to production.

  • 🔡 Works with tabular and text data.
  • ✨ Supports evals for predictive and generative tasks, from classification to RAG.
  • 📚 100+ built-in metrics, from data drift detection to LLM judges.
  • 🛠️ Python interface for custom metrics.
  • 🚦 Both offline evals and live monitoring.
  • 💻 Open architecture: easily export data and integrate with existing tools.

Evidently is modular: you can start with one-off evaluations or host a full monitoring service.

1. Reports and Test Suites

Reports compute and summarize data, ML, and LLM quality evals.

  • Start with Presets and built-in metrics or customize.
  • Best for experiments, exploratory analysis and debugging.
  • View interactive Reports in Python, export them as JSON, a Python dictionary, or HTML, or view them in the monitoring UI.

Turn any Report into a Test Suite by adding pass/fail conditions.

  • Best for regression testing, CI/CD checks, or data validation.
  • Zero setup option: auto-generate test conditions from the reference dataset.
  • Simple syntax to set test conditions, such as gt (greater than) or lt (less than); see the sketch after this list.
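
For illustration, here is a minimal sketch of attaching explicit test conditions to individual metrics, assuming the gt/lt helpers from evidently.tests and the RowCount and MinValue metrics; the "price" column is hypothetical:

from evidently import Report
from evidently.metrics import RowCount, MinValue
from evidently.tests import gt, lt

report = Report([
    # fails unless the dataset has more than 100 rows
    RowCount(tests=[gt(100)]),
    # fails unless the minimum of the hypothetical "price" column is below 10
    MinValue(column="price", tests=[lt(10)]),
])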
Report example (screenshot)

2. Monitoring Dashboard

The Monitoring UI service visualizes metrics and test results over time.

You can choose between self-hosting the open-source Monitoring UI or using Evidently Cloud.

Evidently Cloud offers a generous free tier and extra features like dataset and user management, alerting, and no-code evals. Compare OSS vs Cloud.

Dashboard example (screenshot)

:woman_technologist: Install Evidently

To install from PyPI:

pip install evidently

To install Evidently using the Conda installer, run:

conda install -c conda-forge evidently

:arrow_forward: Getting started

Reports

LLM evals

This is a simple Hello World. Check the Tutorials for more: LLM evaluation.

Import the necessary components:

import pandas as pd
from evidently import Report
from evidently import Dataset, DataDefinition
from evidently.descriptors import Sentiment, TextLength, Contains
from evidently.presets import TextEvals

Create a toy dataset with questions and answers:

eval_df = pd.DataFrame([
    ["What is the capital of Japan?", "The capital of Japan is Tokyo."],
    ["Who painted the Mona Lisa?", "Leonardo da Vinci."],
    ["Can you write an essay?", "I'm sorry, but I can't assist with homework."]],
                       columns=["question", "answer"])

Create an Evidently Dataset object and add descriptors: row-level evaluators. We'll check the sentiment of each response, its length, and whether it contains words indicative of denial.

eval_dataset = Dataset.from_pandas(
    eval_df,
    data_definition=DataDefinition(),
    descriptors=[
        Sentiment("answer", alias="Sentiment"),  # sentiment of each answer
        TextLength("answer", alias="Length"),  # length in characters
        Contains("answer", items=["sorry", "apologize"], mode="any", alias="Denials"),  # denial phrases
    ],
)

You can view the dataframe with added scores:

eval_dataset.as_dataframe()

To get a summary Report showing the distribution of scores:

report = Report([
    TextEvals()
])

my_eval = report.run(eval_dataset)
my_eval  # displays the interactive Report in a notebook
# my_eval.json()
# my_eval.dict()

You can also choose other evaluators, including LLM-as-a-judge, and configure pass/fail conditions; see the sketch below.
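
For instance, a built-in LLM judge can run as another descriptor, and test conditions can attach to descriptor scores. A minimal sketch, assuming the DeclineLLMEval judge and the gte helper exist under these names in your version, and that an OpenAI API key is set in the environment:

from evidently.descriptors import DeclineLLMEval, Sentiment
from evidently.tests import gte

eval_dataset = Dataset.from_pandas(
    eval_df,
    data_definition=DataDefinition(),
    descriptors=[
        # LLM-as-a-judge: labels each answer as a denial or not (calls the OpenAI API)
        DeclineLLMEval("answer", alias="Denials"),
        # pass/fail condition: flags rows with a negative sentiment score
        Sentiment("answer", alias="Sentiment", tests=[gte(0)]),
    ],
)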

Data and ML evals

This is a simple Hello World. Check the Tutorials for more: Tabular data.

Import the Report, an evaluation Preset, and a toy tabular dataset:

import pandas as pd
from sklearn import datasets

from evidently import Report
from evidently.presets import DataDriftPreset

iris_data = datasets.load_iris(as_frame=True)
iris_frame = iris_data.frame

Run the Data Drift evaluation preset to test for shift in column distributions. Take the first 60 rows of the dataframe as the "current" data and the remaining rows as the reference. Get the output in a Jupyter notebook:

report = Report(
    [DataDriftPreset(method="psi")],
    include_tests=True,
)
my_eval = report.run(iris_frame.iloc[:60], iris_frame.iloc[60:])
my_eval  # displays the interactive Report in a notebook

You can also save an HTML file. You'll need to open it from the destination folder.

my_eval.save_html("file.html")

To get the output as JSON or Python dictionary:

my_eval.json()
# my_eval.dict()

You can choose other Presets, create Reports from individual Metrics, and configure pass/fail conditions, as in the sketch below.
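
As a sketch, a Report built from individual Metrics on the iris data might look like this (assuming ValueDrift and RowCount are exported from evidently.metrics and gte from evidently.tests; exact names may vary by version):

from evidently.metrics import ValueDrift, RowCount
from evidently.tests import gte

report = Report([
    # drift score for a single column, computed with PSI
    ValueDrift(column="sepal length (cm)", method="psi"),
    # pass/fail condition: fails if the current data has fewer than 50 rows
    RowCount(tests=[gte(50)]),
])
my_eval = report.run(iris_frame.iloc[:60], iris_frame.iloc[60:])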

Monitoring dashboard

This launches a demo project in the locally hosted Evidently UI. Sign up for Evidently Cloud to instantly get a managed version with additional features.

If you have uv, you can run the Evidently UI with a single command:

uv run --with evidently evidently ui --demo-projects all

If you haven't installed uv, create a virtual environment using the standard approach:

pip install virtualenv
virtualenv venv
source venv/bin/activate

After installing Evidently (pip install evidently), run the Evidently UI with the demo projects:

evidently ui --demo-projects all

Visit localhost:8000 to access the UI.
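
To log your own evals instead of the demo, you can write results to a local workspace that the UI reads from. A minimal sketch, assuming the Workspace API under evidently.ui.workspace and its add_run method, as in recent versions (names may differ in yours):

from evidently.ui.workspace import Workspace

# create (or open) a local workspace directory
ws = Workspace.create("workspace")
project = ws.create_project("My project")

# my_eval is the result of report.run(...) from the examples above
ws.add_run(project.id, my_eval, include_data=False)

Then point the UI at that workspace, e.g. evidently ui --workspace workspace.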

🚦 What can you evaluate?

Evidently has 100+ built-in evals. You can also add custom ones.

Here are examples of things you can check:

  • 🔡 Text descriptors: length, sentiment, toxicity, language, special symbols, regular expression matches, etc.
  • 📝 LLM outputs: semantic similarity, retrieval relevance, summarization quality, etc., with model- and LLM-based evals.
  • 🛢 Data quality: missing values, duplicates, min-max ranges, new categorical values, correlations, etc.
  • 📊 Data distribution drift: 20+ statistical tests and distance metrics to compare shifts in data distribution.
  • 🎯 Classification: accuracy, precision, recall, ROC AUC, confusion matrix, bias, etc.
  • 📈 Regression: MAE, ME, RMSE, error distribution, error normality, error bias, etc.
  • 🗂 Ranking (incl. RAG): NDCG, MAP, MRR, Hit Rate, etc.
  • 🛒 Recommendations: serendipity, novelty, diversity, popularity bias, etc.

:computer: Contributions

We welcome contributions! Read the Guide to learn more.

:books: Documentation

For more examples, refer to the complete Documentation.

:white_check_mark: Discord Community

If you want to chat and connect, join our Discord community!
