DataEval companion package for plotting utilities

DataEval Plots

Multi-backend plotting utilities for DataEval outputs.

Installation

# Minimal - no plotting backend included
pip install dataeval-plots

# With matplotlib plotting (recommended)
# Quoting keeps shells such as zsh from treating the brackets as a glob pattern
pip install "dataeval-plots[matplotlib]"

# With multiple backends
pip install "dataeval-plots[matplotlib,plotly]"

# Everything
pip install "dataeval-plots[all]"

For development (run from the directory that contains your local dataeval-plots checkout):

pip install -e "dataeval-plots[all]"

Available Backends

Backend | Status | Install With | Description
matplotlib | ✅ Default | [matplotlib] | Standard publication-quality plots
seaborn | ✅ Available | [seaborn] | Statistical data visualization
plotly | ✅ Available | [plotly] | Interactive web-based plots
altair | ✅ Available | [altair] | Declarative visualization grammar
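
Since every backend is an optional extra, it can help to check which plotting libraries are importable in the current environment before choosing one. The following is a minimal sketch using only the standard library; the helper name installed_backends is hypothetical and not part of dataeval-plots, and it assumes each backend name matches the importable module name of its library.

```python
from importlib.util import find_spec

# Backend names from the table above; each is assumed to match the
# top-level module name of the underlying plotting library.
BACKENDS = ("matplotlib", "seaborn", "plotly", "altair")


def installed_backends(candidates=BACKENDS):
    """Return the candidate names whose top-level module can be located."""
    # find_spec() locates a module without importing it and returns None
    # when the module is not installed.
    return [name for name in candidates if find_spec(name) is not None]


if __name__ == "__main__":
    print("Importable plotting libraries:", installed_backends())
```

This only reports whether the library can be found; it does not confirm that dataeval-plots has registered a backend for it.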

Usage

Option 1: Import from dataeval-plots directly

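from dataeval_plots import plot
from dataeval.metrics.bias import coverage

# `embeddings` and `dataset` are assumed to be defined already
result = coverage(embeddings)
fig = plot(result, images=dataset, top_k=6)
fig.savefig("coverage.png")  # savefig() applies to the default matplotlib backend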

Option 2: Import from dataeval core (convenience)

from dataeval import plotting
from dataeval.metrics.bias import coverage

result = coverage(embeddings)
fig = plotting.plot(result, images=dataset)

Option 3: Set default backend

from dataeval_plots import plot, set_default_backend

# `result` and `dataset` are assumed to be defined as in Option 1

# Set seaborn as default
set_default_backend("seaborn")
fig = plot(result, images=dataset)  # Uses seaborn

# Override for a specific plot
fig = plot(result, backend="matplotlib", images=dataset)
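
Because the backend can change per call, the object returned by plot() may not always offer savefig(). The sketch below shows one way to save a figure without knowing its backend, by checking which save method the object exposes. The helper save_figure is hypothetical and not part of dataeval-plots. The methods it calls (savefig on matplotlib figures, write_html on plotly figures, save on altair charts) are real APIs of those libraries, but which object type each dataeval-plots backend returns is an assumption.

```python
from pathlib import Path


def save_figure(fig, stem):
    """Save `fig` with whichever save method its type exposes.

    Returns the path that was written. Which figure type each backend
    returns is an assumption; only the method names are known APIs.
    """
    if hasattr(fig, "savefig"):        # matplotlib Figure (also seaborn grids)
        path = Path(f"{stem}.png")
        fig.savefig(path)
    elif hasattr(fig, "write_html"):   # plotly Figure
        path = Path(f"{stem}.html")
        fig.write_html(str(path))
    elif hasattr(fig, "save"):         # altair Chart
        path = Path(f"{stem}.html")
        fig.save(str(path))
    else:
        raise TypeError(f"Don't know how to save {type(fig).__name__}")
    return path
```

With the snippets above, this would be called as save_figure(fig, "coverage").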

Features

  • Multi-backend architecture: Support for matplotlib (default), seaborn, plotly, and altair
  • Optional dependencies: Install only the backends you need
  • Clean separation: Core dataeval has zero plotting dependencies
  • Protocol-based design: Loose coupling via structural typing (Plottable protocol)
  • Extensible: Easy to add new backends via BasePlottingBackend or custom outputs via Plottable
  • Lazy loading: Backends are only imported when first used
  • Type safe: Static type checking with mypy/pyright via Protocol classes, plus runtime isinstance() checks via @runtime_checkable
  • DRY architecture: Centralized routing logic in BasePlottingBackend

Architecture

The package uses a protocol-based architecture for loose coupling between dataeval and dataeval-plots:

dataeval/                           # Core package
    outputs/
        _bias.py                    # CoverageOutput, BalanceOutput, DiversityOutput
        _stats.py                   # BaseStatsOutput
        _workflows.py               # SufficiencyOutput
        _drift.py                   # DriftMVDCOutput
    plotting.py                     # Convenience hook to dataeval-plots

dataeval-plots/                     # Separate plotting package
    src/dataeval_plots/
        __init__.py                 # Main plot() function
        _registry.py                # Backend registry with lazy loading
        protocols.py                # Protocol definitions (Plottable hierarchy)
        backends/
            _base.py                # BasePlottingBackend (abstract routing)
            _matplotlib.py          # MatplotlibBackend (default)
            _seaborn.py             # SeabornBackend
            _plotly.py              # PlotlyBackend
            _altair.py              # AltairBackend

Protocol-Based Design

All DataEval output classes implement the Plottable protocol, which requires:

  • plot_type(): Returns a string identifying the plot type (e.g., "coverage", "balance")
  • meta(): Returns execution metadata

This enables:

  • Loose coupling: dataeval-plots doesn't import concrete classes from dataeval
  • Type safety: Static type checking via Protocol classes, and runtime isinstance() checks via @runtime_checkable
  • Extensibility: Anyone can create custom outputs implementing Plottable
  • Zero dependencies: Core dataeval has no plotting dependencies
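
The structural-typing idea can be shown with a small standalone sketch. The protocol below mirrors the two methods listed above but is a hypothetical stand-in named PlottableSketch, not the real dataeval_plots.protocols.Plottable, and the return type of meta() is simplified to Any.

```python
from dataclasses import dataclass
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class PlottableSketch(Protocol):
    """Hypothetical stand-in for the Plottable protocol."""

    def plot_type(self) -> str: ...

    def meta(self) -> Any: ...


@dataclass
class Conforming:
    """Satisfies the protocol without inheriting from it."""

    values: list

    def plot_type(self) -> str:
        return "coverage"

    def meta(self) -> Any:
        return None


class NotConforming:
    """Lacks both required methods."""


# isinstance() works because of @runtime_checkable; it checks only that
# the methods exist, not their signatures or return types.
print(isinstance(Conforming([1, 2]), PlottableSketch))   # True
print(isinstance(NotConforming(), PlottableSketch))      # False
```

Note that Conforming never imports or subclasses the protocol's defining package in order to satisfy it, which is what allows the plotting package and the output classes to stay decoupled.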

Supported Output Types

Output Type | Plot Type | Description | Source
CoverageOutput | "coverage" | Image grid showing uncovered samples | dataeval/outputs/_bias.py
BalanceOutput | "balance" | Heatmap of class balance metrics | dataeval/outputs/_bias.py
DiversityOutput | "diversity" | Visualization of diversity indices | dataeval/outputs/_bias.py
SufficiencyOutput | "sufficiency" | Learning curves with extrapolation | dataeval/outputs/_workflows.py
BaseStatsOutput | "base_stats" | Statistical histograms and distributions | dataeval/outputs/_stats.py
DriftMVDCOutput | "drift_mvdc" | Drift detection plots (MVDC analysis) | dataeval/outputs/_drift.py

Each output type implements the Plottable protocol and can be plotted using any registered backend.

Extending the Package

Creating Custom Outputs

You can create custom output classes that work with the plotting system by implementing the Plottable protocol:

from dataclasses import dataclass
from numpy.typing import NDArray
from dataeval_plots import plot
from dataeval_plots.protocols import Plottable, ExecutionMetadata

@dataclass
class MyCustomOutput:
    """Custom output that reuses existing plot type."""
    uncovered_indices: NDArray

    def plot_type(self) -> str:
        return "coverage"  # Reuse existing coverage plotting

    def meta(self) -> ExecutionMetadata:
        return ExecutionMetadata.empty()

# Works seamlessly with existing backends
# (`my_data` and `my_images` are placeholders for your own arrays and images)
result = MyCustomOutput(uncovered_indices=my_data)
fig = plot(result, images=my_images)

Creating Custom Backends

Extend BasePlottingBackend to create a new plotting backend:

from dataeval_plots.backends._base import BasePlottingBackend
from dataeval_plots.protocols import PlottableCoverage, PlottableBalance
from dataeval_plots import plot, register_backend

class CustomBackend(BasePlottingBackend):
    """Custom plotting backend using your preferred library."""

    def _plot_coverage(self, output: PlottableCoverage, **kwargs):
        # Implement coverage plotting
        # Access output.uncovered_indices, etc.
        return my_figure  # placeholder for the figure object you build

    def _plot_balance(self, output: PlottableBalance, **kwargs):
        # Implement balance plotting
        return my_figure  # placeholder for the figure object you build

    # Implement other _plot_* methods...

# Register and use (`result` is any Plottable output, as in the Usage section)
register_backend("custom", CustomBackend())
fig = plot(result, backend="custom")

The BasePlottingBackend class handles all routing logic automatically - you just implement the plot-type-specific methods (_plot_coverage, _plot_balance, etc.).
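
One plausible way such routing could work is name-based dispatch on the string returned by plot_type(). The sketch below is an assumption about the general technique, not the actual BasePlottingBackend code, which may use abstract methods or a different lookup. The classes RoutingBackendSketch, TextBackend, and FakeCoverage are hypothetical, and strings stand in for figure objects so the example runs without a plotting library.

```python
class RoutingBackendSketch:
    """Hypothetical base class that routes plot() to a _plot_<type> method."""

    def plot(self, output, **kwargs):
        plot_type = output.plot_type()
        # Look up the handler by name, e.g. "coverage" -> self._plot_coverage
        handler = getattr(self, f"_plot_{plot_type}", None)
        if handler is None:
            raise NotImplementedError(
                f"{type(self).__name__} has no handler for plot type {plot_type!r}"
            )
        return handler(output, **kwargs)


class TextBackend(RoutingBackendSketch):
    """Toy backend that returns a description string instead of a figure."""

    def _plot_coverage(self, output, **kwargs):
        return f"coverage plot of {len(output.uncovered_indices)} samples"


class FakeCoverage:
    """Minimal output object exposing what the coverage handler needs."""

    uncovered_indices = [3, 7, 11]

    def plot_type(self):
        return "coverage"


print(TextBackend().plot(FakeCoverage()))  # coverage plot of 3 samples
```

With this approach a subclass only adds methods; the base class never needs to change when a backend supports a new plot type.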
