Open-source tools to analyze, monitor, and debug machine learning models in production.

Evidently

An open-source framework to evaluate, test and monitor ML and LLM-powered systems.

Documentation | Discord Community | Blog | Twitter | Evidently Cloud

🆕 New release

Evidently 0.4.25. LLM evaluation -> Tutorial

📊 What is Evidently?

Evidently is an open-source Python library for ML and LLM evaluation and observability. It helps evaluate, test, and monitor AI-powered systems and data pipelines from experimentation to production. 

  • 🔡 Works with tabular, text data, and embeddings.
  • ✨ Supports predictive and generative systems, from classification to RAG.
  • 📚 100+ built-in metrics from data drift detection to LLM judges.
  • 🛠️ Python interface for custom metrics and tests. 
  • 🚦 Both offline evals and live monitoring.
  • 💻 Open architecture: easily export data and integrate with existing tools. 

Evidently is modular. You can start with one-off evaluations using Reports or Test Suites in Python, or run a real-time monitoring Dashboard service.

1. Reports

Reports compute various data, ML and LLM quality metrics. You can start with Presets or customize.

  • Out-of-the-box interactive visuals.
  • Best for exploratory analysis and debugging.
  • Get results in Python, export as JSON, Python dictionary, HTML, DataFrame, or view in monitoring UI.

2. Test Suites

Test Suites check for defined conditions on metric values and return a pass or fail result.

  • Best for regression testing, CI/CD checks, or data validation pipelines.
  • Zero setup option: auto-generate test conditions from the reference dataset.
  • Simple syntax to set custom test conditions as gt (greater than), lt (less than), etc.
  • Get results in Python, export as JSON, Python dictionary, HTML, DataFrame, or view in monitoring UI.
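To make the pass/fail idea concrete, here is a minimal pure-Python sketch of how a condition such as gt or lt can be evaluated against a metric value. It is an illustration only and not Evidently's API: the helper `check_condition` is hypothetical, and the `gte`, `lte`, and `eq` names are assumed here to stand in for the "etc." above.

```python
import operator

# Hypothetical mapping for illustration; this is not Evidently's API.
CONDITIONS = {
    "gt": operator.gt,    # greater than
    "gte": operator.ge,   # greater than or equal (assumed name)
    "lt": operator.lt,    # less than
    "lte": operator.le,   # less than or equal (assumed name)
    "eq": operator.eq,    # equal (assumed name)
}


def check_condition(value, **conditions):
    """Return True only if the metric value satisfies every given condition."""
    return all(
        CONDITIONS[name](value, threshold)
        for name, threshold in conditions.items()
    )


accuracy = 0.87  # made-up metric value
result = "PASS" if check_condition(accuracy, gt=0.8, lte=1.0) else "FAIL"
print(result)
```

A test built this way fails as soon as any single condition is violated, which is what makes it convenient for CI/CD gates.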

3. Monitoring Dashboard

The Monitoring UI service helps visualize metrics and test results over time.

You can choose to self-host the open-source version or use Evidently Cloud.

Evidently Cloud offers a generous free tier and extra features like user management, alerting, and no-code evals.


👩‍💻 Install Evidently

Evidently is available as a PyPI package. To install it using the pip package manager, run:

pip install evidently

To install Evidently using the conda installer, run:

conda install -c conda-forge evidently

▶️ Getting started

Option 1: Test Suites

This is a simple Hello World. Check the Tutorials for more: Tabular data or LLM evaluation.

Import the Test Suite, the evaluation Preset, and a toy tabular dataset.

import pandas as pd

from sklearn import datasets

from evidently.test_suite import TestSuite
from evidently.test_preset import DataStabilityTestPreset

iris_data = datasets.load_iris(as_frame=True)
iris_frame = iris_data.frame

Split the DataFrame into reference and current. Run the Data Stability Test Suite, which automatically generates checks on column value ranges, missing values, etc. from the reference. Get the output in a Jupyter notebook:

# Build a Test Suite from the Data Stability preset.
data_stability = TestSuite(tests=[
    DataStabilityTestPreset(),
])
# Use the first 60 rows as current data and the remaining rows as reference.
data_stability.run(current_data=iris_frame.iloc[:60], reference_data=iris_frame.iloc[60:], column_mapping=None)
# In a notebook, the last expression in a cell renders the result.
data_stability

You can also save an HTML file. You'll need to open it from the destination folder.

data_stability.save_html("file.html")

To get the output as JSON:

data_stability.json()

You can choose other Presets, individual Tests and set conditions.
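As a rough illustration of what "generating checks from the reference" means, the sketch below derives per-column value ranges from reference rows and then tests current rows against them. It is a simplified stand-in written in plain Python, not the logic of DataStabilityTestPreset; the function names `derive_range_checks` and `run_checks` and the sample values are hypothetical.

```python
def derive_range_checks(reference_rows):
    """Build per-column (min, max) bounds from the reference rows."""
    bounds = {}
    for row in reference_rows:
        for column, value in row.items():
            if value is None:
                continue  # ignore missing values when learning the range
            low, high = bounds.get(column, (value, value))
            bounds[column] = (min(low, value), max(high, value))
    return bounds


def run_checks(current_rows, bounds):
    """Return {column: passed}; a column passes only if every current
    value is present and lies inside the reference range."""
    results = {}
    for column, (low, high) in bounds.items():
        values = [row.get(column) for row in current_rows]
        results[column] = all(v is not None and low <= v <= high for v in values)
    return results


# Made-up sample data for illustration.
reference = [
    {"sepal_length": 4.9, "petal_length": 1.4},
    {"sepal_length": 7.0, "petal_length": 4.7},
]
current = [
    {"sepal_length": 5.1, "petal_length": 1.5},
    {"sepal_length": 6.3, "petal_length": 6.0},  # petal_length is out of range
]

bounds = derive_range_checks(reference)
results = run_checks(current, bounds)
print(results)
```

In this sketch a missing value in the current data counts as a failure; a real check would usually report missing values and range violations separately.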

Option 2: Reports

Import the Report, the evaluation Preset, and a toy tabular dataset.

import pandas as pd

from sklearn import datasets

from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

iris_data = datasets.load_iris(as_frame=True)
iris_frame = iris_data.frame

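Run the Data Drift Report, which compares column distributions between current and reference:

# Build a Report from the Data Drift preset.
data_drift_report = Report(metrics=[
    DataDriftPreset(),
])

# Use the first 60 rows as current data and the remaining rows as reference.
data_drift_report.run(current_data=iris_frame.iloc[:60], reference_data=iris_frame.iloc[60:], column_mapping=None)
# In a notebook, the last expression in a cell renders the result.
data_drift_report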

Save the report as HTML. You'll later need to open it from the destination folder.

data_drift_report.save_html("file.html")

To get the output as JSON:

data_drift_report.json()

You can choose other Presets and individual Metrics, including LLM evaluations for text data.
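For intuition about what comparing column distributions involves, here is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic for one numerical column, written in plain Python. It is one common way to quantify a distribution shift and is offered as an assumption-laden illustration, not as the method DataDriftPreset applies; the function name `ks_statistic` and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest absolute gap between the two empirical CDFs (0.0 to 1.0)."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The gap between two step functions peaks at one of the sample points.
    for x in ref + cur:
        ref_cdf = bisect_right(ref, x) / len(ref)  # share of reference values <= x
        cur_cdf = bisect_right(cur, x) / len(cur)  # share of current values <= x
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


# Made-up samples: the current values are shifted upward by 2.
shift = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(shift)
```

A value near 0 means the two samples look alike, and a value near 1 means they barely overlap. A full statistical test would also turn the statistic into a p-value, which this sketch leaves out.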

Option 3: ML monitoring dashboard

This launches a demo project in the Evidently UI. Check tutorials for Self-hosting or Evidently Cloud.

Recommended step: create a virtual environment and activate it.

pip install virtualenv
virtualenv venv
source venv/bin/activate

After installing Evidently (pip install evidently), run the Evidently UI with the demo projects:

evidently ui --demo-projects all

Access the Evidently UI service in your browser at localhost:8000.

🚦 What can you evaluate?

Evidently has 100+ built-in evals. You can also add custom ones. Each metric has an optional visualization: you can use it in Reports, Test Suites, or plot on a Dashboard.

Here are examples of things you can check:

  • 🔡 Text descriptors: Length, sentiment, toxicity, language, special symbols, regular expression matches, etc.
  • 📝 LLM outputs: Semantic similarity, retrieval relevance, summarization quality, etc. with model- and LLM-based evals.
  • 🛢 Data quality: Missing values, duplicates, min-max ranges, new categorical values, correlations, etc.
  • 📊 Data distribution drift: 20+ statistical tests and distance metrics to compare shifts in data distribution.
  • 🎯 Classification: Accuracy, precision, recall, ROC AUC, confusion matrix, bias, etc.
  • 📈 Regression: MAE, ME, RMSE, error distribution, error normality, error bias, etc.
  • 🗂 Ranking (inc. RAG): NDCG, MAP, MRR, Hit Rate, etc.
  • 🛒 Recommendations: Serendipity, novelty, diversity, popularity bias, etc.
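As an example of what the ranking evals measure, the sketch below computes Hit Rate, reciprocal rank (the per-query term that MRR averages), and NDCG for a single ranked list. It is a textbook-style illustration in plain Python, not Evidently's implementation: the function names are hypothetical, and the NDCG variant shown, which takes the ideal ordering from the full list, is one of several conventions.

```python
import math


def hit_rate_at_k(relevance, k):
    """1.0 if any of the top-k items is relevant, else 0.0."""
    return 1.0 if any(relevance[:k]) else 0.0


def reciprocal_rank(relevance):
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k(relevance, k):
    """DCG of the top-k items divided by the DCG of the ideal ordering."""
    def dcg(scores):
        # Each score is discounted by log2(position + 1).
        return sum(s / math.log2(i + 1) for i, s in enumerate(scores, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal > 0 else 0.0


# Made-up relevance labels for one query, in ranked order (1 = relevant).
ranked = [0, 1, 0, 1]
print(hit_rate_at_k(ranked, 1), reciprocal_rank(ranked), ndcg_at_k(ranked, 4))
```

Averaging these per-query values over all queries gives the dataset-level Hit Rate, MRR, and NDCG.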

💻 Contributions

We welcome contributions! Read the Guide to learn more.

📚 Documentation

For more information, refer to the complete Documentation. You can start with the tutorials on tabular data or LLM evaluation.

See more examples in the Docs.

How-to guides

Explore the How-to guides to understand specific features in Evidently.

✅ Discord Community

If you want to chat and connect, join our Discord community!
