Metamorphic Testing Framework for Regression-based Autonomous Driving ML/AI Models
Project description
A model-agnostic, input-agnostic, and output-agnostic metamorphic testing framework for regression-based autonomous driving AI/ML models.
Overview
AutoMR evaluates ML models by verifying metamorphic relations (MRs) — expected behavioral properties that must hold under controlled input transformations. Instead of checking exact outputs against ground truth, AutoMR checks whether the model behaves consistently when inputs are perturbed in predictable ways.
| Problem | What AutoMR does |
|---|---|
| No labeled data | Tests models without any ground-truth labels |
| Real-world perturbations | Measures robustness under realistic noise and conditions |
| Silent failures | Pinpoints when and how models begin to fail |
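To make the idea concrete, here is a minimal sketch of a label-free invariance check. It does not use AutoMR's API: `toy_model`, `brighten`, `holds_invariance`, and the `tolerance` value are hypothetical names and settings chosen only for illustration.

```python
from typing import Callable, Sequence

Image = Sequence[float]  # flattened grayscale pixels (assumption for this sketch)


def toy_model(image: Image) -> float:
    """Hypothetical regressor: mean of the right half minus mean of the left half."""
    half = len(image) // 2
    left, right = image[:half], image[half:]
    return sum(right) / len(right) - sum(left) / len(left)


def brighten(image: Image, delta: float) -> list[float]:
    """Add a constant offset to every pixel (no clipping, to keep the toy simple)."""
    return [p + delta for p in image]


def holds_invariance(
    model: Callable[[Image], float],
    x: Image,
    transform: Callable[[Image], Image],
    tolerance: float = 1e-6,
) -> bool:
    """Invariance MR: the prediction should not move when the input is transformed."""
    return abs(model(x) - model(transform(x))) <= tolerance


image = [0.1, 0.2, 0.6, 0.7]

# No ground-truth label is needed: only the two predictions are compared.
invariant = holds_invariance(toy_model, image, lambda x: brighten(x, 0.2))
violated = not holds_invariance(toy_model, image, lambda x: [2 * p for p in x])
print(invariant, violated)
```

The toy model is invariant to a uniform brightness offset but not to contrast scaling, so the same check passes for one transformation and reports a violation for the other.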
Key Features
- Model-agnostic — works with TensorFlow, PyTorch, sklearn, or any custom model
- Input-agnostic — supports images, text, tabular, and sequential data
- Output-agnostic — handles regression and classification outputs
- Built-in MR pipeline — end-to-end execution with zero boilerplate
- Parametric testing — sweep transformation parameters across configurable ranges
- Automated analysis — failure rate, severity scores, and worst-case detection
- Automatic CSV export — all results persisted without manual intervention
- Optional progress tracking — live progress bars for long-running tests
Installation
git clone https://github.com/CharithManaujayaMUTEC/AutoMR-Framework.git
cd AutoMR-Framework
python -m venv venv
venv\Scripts\activate  # Windows; on macOS/Linux use: source venv/bin/activate
pip install -r requirements.txt
Quick Start
from automr.api import AutoMR

# model: any object exposing predict(x); dataset: your own input source
automr = AutoMR(model)
df, results = automr.run_full_test(
    dataset,
    max_samples=2000,
    samples_per_mr=5,
    show_progress=True,
)
Execution Flow
1. Load dataset → user-defined input source
2. Load model → any model exposing predict(x)
3. Apply transforms → brightness, rotation, noise, fog, ...
4. Generate outputs → original vs. transformed predictions
5. Validate MRs → check expected behavioral properties
6. Analyze results → failure rate, severity, worst cases
7. Export → CSV files written to /results
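The seven steps above can be sketched in plain Python. This is an illustration under stated assumptions, not AutoMR's internal implementation: `predict`, `add_brightness`, `run_sweep`, the `brightness_invariance` label, and the threshold are all hypothetical, and the CSV is written to memory rather than to the results directory.

```python
import csv
import io

Sample = list[float]  # flattened pixel intensities in [0, 1] (assumption)


def predict(sample: Sample) -> float:
    """Hypothetical stand-in model: mean pixel intensity."""
    return sum(sample) / len(sample)


def add_brightness(sample: Sample, delta: float) -> Sample:
    """Shift every pixel by delta and clip to [0, 1]."""
    return [min(1.0, max(0.0, p + delta)) for p in sample]


def run_sweep(dataset: list[Sample], params: list[float], threshold: float) -> list[dict]:
    rows = []
    for sample_id, sample in enumerate(dataset):  # step 1: iterate the dataset
        original = predict(sample)  # steps 2 and 4: original prediction
        for param in params:  # parametric sweep over the transformation strength
            transformed = predict(add_brightness(sample, param))  # steps 3 and 4
            difference = transformed - original
            status = "PASS" if abs(difference) <= threshold else "FAIL"  # step 5
            rows.append({
                "mr": "brightness_invariance",
                "param": param,
                "sample_id": sample_id,
                "original": original,
                "transformed": transformed,
                "difference": difference,
                "status": status,
            })
    return rows


dataset = [[0.2, 0.4], [0.5, 0.7]]
rows = run_sweep(dataset, params=[0.0, 0.05, 0.3], threshold=0.1)

# Step 6: one aggregate metric over all rows.
failure_rate = sum(row["status"] == "FAIL" for row in rows) / len(rows)

# Step 7: serialize to CSV (in memory here; a real run would write a file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Because the stand-in model is a plain mean, it is deliberately not brightness-invariant, so the largest parameter in the sweep produces failures while the small ones pass.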
Output Files
All results are saved automatically to the /results directory.
| File | Description |
|---|---|
| automr_results.csv | Full per-sample test log |
| failure_summary.csv | Failure rate per metamorphic relation |
| severity_summary.csv | Average output deviation per MR |
| worst_cases.csv | Samples with highest deviation |
| failure_regions.txt | Parametric boundaries where failures occur |
Output columns
| Column | Description |
|---|---|
| mr | Metamorphic relation identifier |
| param | Transformation parameter value |
| original | Original model prediction |
| transformed | Prediction after transformation |
| difference | Raw output difference |
| percent_change | Percentage change between outputs |
| status | PASS / FAIL |
| expected_behavior | Expected MR rule |
| actual_behavior | Consistent / Violation |
| sample_id | Input sample index |
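One plausible way the derived columns relate to each other is sketched below. The exact formulas AutoMR uses are not documented here, so the sign convention for `difference`, the zero-division handling for `percent_change`, the wording of `expected_behavior`, and the `build_row` helper itself are all assumptions.

```python
import math


def build_row(mr: str, param: float, original: float, transformed: float,
              threshold: float, sample_id: int) -> dict:
    """Assemble one result row using the documented column names."""
    difference = transformed - original
    # Percent change is undefined when the original prediction is zero.
    if original != 0:
        percent_change = difference / abs(original) * 100.0
    else:
        percent_change = math.nan
    passed = abs(difference) <= threshold
    return {
        "mr": mr,
        "param": param,
        "original": original,
        "transformed": transformed,
        "difference": difference,
        "percent_change": percent_change,
        "status": "PASS" if passed else "FAIL",
        "expected_behavior": f"|difference| <= {threshold}",
        "actual_behavior": "Consistent" if passed else "Violation",
        "sample_id": sample_id,
    }


row = build_row("BrightnessRelation", param=0.2, original=0.50,
                transformed=0.56, threshold=0.05, sample_id=7)
print(row["status"], round(row["percent_change"], 2))
```

Keeping `status` and `actual_behavior` derived from the same boolean avoids rows where the two columns disagree.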
Metamorphic Relations
| Relation | Description | Type |
|---|---|---|
| BrightnessRelation | Output invariant to lighting changes | Invariance |
| RotationRelation | Stable under small rotations | Invariance |
| TranslationRelation | Stable under image shifts | Invariance |
| NoiseRelation | Robust to random noise | Robustness |
| FogRelation | Robust to visibility degradation | Robustness |
| TemporalSmoothness | Consistent outputs across frames | Monotonic |
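The difference between a pairwise relation and a sequence-level one can be sketched as follows. `InvarianceCheck` and `SmoothnessCheck` are hypothetical classes written for this example, not the relation classes shipped in `automr/relations/`, and the tolerances are arbitrary.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class InvarianceCheck:
    """Pairwise check: original and transformed outputs should nearly match."""
    tolerance: float

    def holds(self, original: float, transformed: float) -> bool:
        return abs(transformed - original) <= self.tolerance


@dataclass
class SmoothnessCheck:
    """Sequence check: consecutive frame outputs should not jump abruptly."""
    max_step: float

    def violations(self, outputs: Sequence[float]) -> list[int]:
        """Return each index i where |outputs[i] - outputs[i - 1]| > max_step."""
        return [
            i for i in range(1, len(outputs))
            if abs(outputs[i] - outputs[i - 1]) > self.max_step
        ]


# Hypothetical steering angles predicted for five consecutive frames.
steering = [0.00, 0.02, 0.03, 0.40, 0.41]
jumps = SmoothnessCheck(max_step=0.1).violations(steering)
stable = InvarianceCheck(tolerance=0.05).holds(0.50, 0.52)
print(jumps, stable)
```

A pairwise check needs one original and one transformed prediction per sample, whereas a temporal check needs an ordered sequence of frames, which affects how the dataset has to be supplied.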
Transformations
| Transform | Description |
|---|---|
| Brightness | Adjust pixel intensity |
| Rotation | Rotate image by angle |
| Translation | Shift image spatially |
| Noise | Add random Gaussian noise |
| Fog / Rain | Simulate adverse weather |
| Blur | Apply smoothing filter |
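As a rough illustration of what three of these transforms do, the sketch below implements them on a small grayscale image stored as nested lists. The function names, the `[0, 1]` pixel range, the zero fill for shifted-in pixels, and the `seed` parameter are assumptions for this example and may differ from AutoMR's own transforms.

```python
import random

Image = list[list[float]]  # rows of grayscale pixels in [0, 1] (assumption)


def _clip(value: float) -> float:
    return min(1.0, max(0.0, value))


def adjust_brightness(image: Image, delta: float) -> Image:
    """Add delta to every pixel and clip to the valid range."""
    return [[_clip(p + delta) for p in row] for row in image]


def translate(image: Image, dx: int, fill: float = 0.0) -> Image:
    """Shift columns right (dx > 0) or left (dx < 0), padding with fill.

    Assumes abs(dx) does not exceed the image width.
    """
    if dx >= 0:
        return [[fill] * dx + row[:len(row) - dx] for row in image]
    return [row[-dx:] + [fill] * (-dx) for row in image]


def add_gaussian_noise(image: Image, sigma: float, seed: int) -> Image:
    """Add zero-mean Gaussian noise; a fixed seed keeps test runs repeatable."""
    rng = random.Random(seed)
    return [[_clip(p + rng.gauss(0.0, sigma)) for p in row] for row in image]


image = [[0.1, 0.5, 0.9]]
brighter = adjust_brightness(image, 0.2)
shifted_right = translate(image, 1)
shifted_left = translate(image, -1)
noisy = add_gaussian_noise(image, sigma=0.05, seed=42)
```

Seeding the noise generator matters for regression testing: without it, a relation can pass on one run and fail on the next for the same sample and parameter.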
Design Principles
Model-agnostic
Works with any model that implements a predict(x) interface:
# TensorFlow, PyTorch, sklearn, or fully custom: all compatible
output = model.predict(x)
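If a model is only available as a plain function or callable object, a thin adapter can give it the `predict(x)` method. The sketch below is a generic pattern rather than part of AutoMR: `PredictAdapter` and `steering_from_pixels` are hypothetical, and whether AutoMR expects single samples or batches is not specified here.

```python
from typing import Any, Callable


class PredictAdapter:
    """Wrap any callable so that it exposes a predict(x) method."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self._fn = fn

    def predict(self, x: Any) -> Any:
        # Delegate directly; add pre- or post-processing here if needed.
        return self._fn(x)


def steering_from_pixels(pixels: list[float]) -> float:
    """Hypothetical regressor: mean intensity centered around zero."""
    return sum(pixels) / len(pixels) - 0.5


model = PredictAdapter(steering_from_pixels)
prediction = model.predict([0.25, 0.75, 1.0])
print(prediction)
```

The adapter is also a convenient place to convert framework-specific outputs (tensors, arrays) into plain numbers before comparison.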
Input-agnostic
Accepts any input type — images, sequences, tabular data, or custom formats. Transformations are applied modularly and do not depend on input structure.
Modular architecture
| Component | Role |
|---|---|
| Model | Generates predictions |
| Transform | Modifies input samples |
| Relation | Defines expected behavioral properties |
| Analyzer | Computes failure metrics and summaries |
Project Structure
AutoMR-Framework/
│
├── automr/
│ ├── api.py
│ ├── comparator.py
│ │
│ ├── core/
│ │ ├── range_tester.py
│ │ └── failure_analysis.py
│ │
│ ├── relations/
│ ├── transforms/
│ └── analysis/
│
├── run_test_example.py
├── requirements.txt
├── .gitignore
└── automrlogo.png
Example Run
Running AutoMR: ██████████████ 100%
=== AutoMR Results ===
Failure Summary:
BrightnessRelation → 12.4% failure rate | avg deviation: 0.031
RotationRelation → 8.7% failure rate | avg deviation: 0.019
NoiseRelation → 21.1% failure rate | avg deviation: 0.074
FogRelation → 34.2% failure rate | avg deviation: 0.112
DONE: Results saved in /results
Built-in Analysis
AutoMR automatically computes the following after each test run:
- Failure rate per metamorphic relation
- Severity — average output deviation across failures
- Worst-case failures — samples with the largest deviations
- Failure regions — parameter ranges where the model is most unstable
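The four metrics above can be approximated from per-sample rows as in the sketch below. It assumes rows shaped like the documented output columns, takes severity as the mean absolute difference over failing rows, and reports a failure region as the minimum and maximum failing parameter; AutoMR's own definitions in `failure_analysis.py` may differ, and `summarize` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean


def summarize(rows: list[dict], top_k: int = 2) -> tuple[dict, list[dict]]:
    """Return per-MR metrics and the top_k rows with the largest deviation."""
    by_mr: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        by_mr[row["mr"]].append(row)

    summary = {}
    for mr, group in by_mr.items():
        failures = [r for r in group if r["status"] == "FAIL"]
        failed_params = sorted({r["param"] for r in failures})
        summary[mr] = {
            "failure_rate": len(failures) / len(group),
            "severity": mean(abs(r["difference"]) for r in failures) if failures else 0.0,
            "failure_region": (failed_params[0], failed_params[-1]) if failed_params else None,
        }

    worst = sorted(rows, key=lambda r: abs(r["difference"]), reverse=True)[:top_k]
    return summary, worst


# Made-up rows for illustration only; not real AutoMR output.
rows = [
    {"mr": "FogRelation", "param": 0.1, "difference": 0.01, "status": "PASS", "sample_id": 0},
    {"mr": "FogRelation", "param": 0.5, "difference": 0.20, "status": "FAIL", "sample_id": 0},
    {"mr": "FogRelation", "param": 0.9, "difference": -0.40, "status": "FAIL", "sample_id": 1},
    {"mr": "NoiseRelation", "param": 0.05, "difference": 0.02, "status": "PASS", "sample_id": 1},
]
summary, worst = summarize(rows)
```

Reporting the failing parameter range alongside the failure rate shows not just how often a relation breaks but at what transformation strength it starts to break.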
Limitations
- Current transformation suite is image-focused
- Comparator thresholds require manual tuning per task
- End-to-end performance depends on model inference speed
Future Work
- NLP and tabular transformation extensions
- Classification-specific comparators
- Streamlit dashboard for interactive analysis
- Cross-model MR testing
- Automated result visualizations (plots and charts)
Authors
CharithManaujayaMUTEC — github.com/CharithManaujayaMUTEC
RaveeshaPeiris — github.com/RaveeshaPeiris
Final Year Project: Metamorphic Testing Framework for Regression-Based Autonomous Driving AI/ML Models
License
Released under the MIT License.
File details
Details for the file automr-0.1.0.tar.gz.
File metadata
- Download URL: automr-0.1.0.tar.gz
- Upload date:
- Size: 19.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 14e11054af033c7be64e7e5025ce1d560cdf9168a10158d200f5416569f5731b |
| MD5 | 10284dbb87904d62bb656e12c5af6d79 |
| BLAKE2b-256 | d7adf1520688ecdb9ec681b0aaae3ac651f338bcc783c3369c433d727d4dc0c0 |
File details
Details for the file automr-0.1.0-py3-none-any.whl.
File metadata
- Download URL: automr-0.1.0-py3-none-any.whl
- Upload date:
- Size: 24.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b7acaefba6b6482db9013462adf82fe1d830c21ae0a6cc7dcd92e09896e2ed25 |
| MD5 | 9615467686f3d9a4e59d2a70d713c00c |
| BLAKE2b-256 | 07bc4137e30a3c692320aa9b63013b1c752ba88eab0f5decfdf21a8ec4cfe3c7 |