Outlier detection using the MISS (MAD-IQR-SD Simultaneous) method
🎯 missoutlier
Outlier Detection Using the MISS Method
A weighted composite of MAD, IQR, and SD for robust univariate outlier detection
Overview
missoutlier implements the MISS (MAD–IQR–SD Simultaneous) method, a new approach for univariate outlier detection that combines three classical techniques into a single robust threshold:

| Method | Bounds | Weight |
|---|---|---|
| MAD (Median Absolute Deviation) | median ± 1.5 × MAD | 87.8% |
| IQR (Interquartile Range) | Q25/Q75 ± 1 × IQR | 1.2% |
| SD (Standard Deviation) | mean ± 5 × SD | 11.0% |
The composite threshold is computed as:
$$\text{MISS} = 0.878 \times \text{MAD} + 0.012 \times \text{IQR} + 0.11 \times \text{SD}$$
By heavily weighting the robust MAD while retaining some sensitivity from IQR and SD, MISS aims to handle skewed and heavy-tailed distributions better than any of the three methods used alone.
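The three weights sum to 1 (0.878 + 0.012 + 0.11 = 1), so the composite can be read as a weighted average of the three methods' bounds. Written out per side, and assuming that each term in the formula stands for the corresponding bound from the table (with $\tilde{x}$ the median and $\bar{x}$ the mean), one plausible expansion is shown below. This is an interpretation of the table, not a statement of the package's internals; the cited preprint and the source code are the authoritative definition.

$$L_{\text{MISS}} = 0.878\,(\tilde{x} - 1.5\,\text{MAD}) + 0.012\,(Q_{25} - 1\,\text{IQR}) + 0.11\,(\bar{x} - 5\,\text{SD})$$

$$U_{\text{MISS}} = 0.878\,(\tilde{x} + 1.5\,\text{MAD}) + 0.012\,(Q_{75} + 1\,\text{IQR}) + 0.11\,(\bar{x} + 5\,\text{SD})$$

Under this reading, a value $x_i$ would be flagged when $x_i < L_{\text{MISS}}$ or $x_i > U_{\text{MISS}}$.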
Installation
```bash
# Install from PyPI
pip install missoutlier
```

Or install directly from GitHub:

```bash
pip install git+https://github.com/GuillaumePech/missOutlierPy.git
```
Dependencies: numpy >= 1.20, scipy >= 1.7
Quick Start
```python
import numpy as np
from missoutlier import detect_outliers_miss

# Generate data with outliers
x = np.concatenate([np.random.randn(100), [50, -40]])

# Default: replace outliers with NaN
x_clean = detect_outliers_miss(x)
# Detected 2 outliers (1.96% of data) using MISS method.

# Drop outliers entirely
x_dropped = detect_outliers_miss(x, drop=True)
# Detected 2 outliers (1.96% of data) using MISS method.

# Handle existing NaNs
x_na = np.concatenate([np.random.randn(100), [np.nan, 50]])
x_clean = detect_outliers_miss(x_na, na_rm=True)

# Silent mode (no messages)
x_clean = detect_outliers_miss(x, silent=True)
```
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `data` | array-like | (required) | Input data (must be one-dimensional) |
| `drop` | bool | `False` | If `True`, removes outliers. If `False`, replaces them with NaN |
| `na_rm` | bool | `False` | If `True`, ignores NaN values when computing thresholds |
| `silent` | bool | `False` | If `True`, suppresses the detection message |
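To make the `na_rm` and `drop` descriptions concrete, here is a small NumPy-only sketch. It does not call missoutlier; it only illustrates the general behaviour the parameters refer to. The `mask` array is a hypothetical outlier mask made up for the illustration, and how the package handles these cases internally is not shown here.

```python
import numpy as np

x = np.array([1.0, 2.0, np.nan, 3.0, 50.0])

# A single NaN propagates through ordinary NumPy statistics ...
plain_median = np.median(x)      # nan
# ... while the nan-aware variants skip missing values.
nan_aware_median = np.nanmedian(x)  # median of [1, 2, 3, 50]

# Hypothetical mask marking 50.0 as an outlier (for illustration only).
mask = np.array([False, False, False, False, True])

# Replacement keeps the original length; the outlier becomes NaN.
replaced = np.where(mask, np.nan, x)
# Dropping shortens the array, so positions no longer line up with the input.
dropped = x[~mask]

print(len(x), len(replaced), len(dropped))
```

Replacing with NaN is usually the safer choice when the values must stay aligned with other columns or trial indices; dropping is convenient when only the cleaned vector is needed.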
How It Works
```
              ┌──────────────┐
              │  Input Data  │
              └──────┬───────┘
                     │
        ┌────────────┼────────────┐
        ▼            ▼            ▼
   ┌─────────┐  ┌─────────┐  ┌─────────┐
   │ 1.5 MAD │  │  1 IQR  │  │  5 SD   │
   │ ×0.878  │  │ ×0.012  │  │  ×0.11  │
   └────┬────┘  └────┬────┘  └────┬────┘
        │            │            │
        └────────────┼────────────┘
                     ▼
            ┌────────────────┐
            │ MISS Threshold │
            └────────┬───────┘
                     ▼
            ┌────────────────┐
            │ Flag Outliers  │
            └────────────────┘
```
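The flow in the diagram can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the package's implementation: the function names `miss_bounds_sketch` and `flag_outliers_sketch` are hypothetical, the composite is assumed to be a weighted average of the three methods' lower and upper bounds, SD is assumed to be the sample standard deviation, and the MAD is left unscaled by default. Whether the package applies the common normal-consistency constant (about 1.4826) to the MAD is not stated here, so it is exposed as the `mad_scale` argument.

```python
import numpy as np

# Weights taken from the Overview table.
WEIGHTS = {"mad": 0.878, "iqr": 0.012, "sd": 0.11}


def miss_bounds_sketch(x, mad_scale=1.0):
    """Return (lower, upper) under one assumed reading of the MISS threshold.

    Assumptions, not confirmed against the package source:
    - the composite is a weighted average of the three methods' bounds;
    - MAD is the raw median absolute deviation times `mad_scale`;
    - SD is the sample standard deviation (ddof=1).
    """
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]  # ignore NaNs when computing the statistics

    med = np.median(x)
    mad = mad_scale * np.median(np.abs(x - med))
    q25, q75 = np.percentile(x, [25, 75])
    iqr = q75 - q25
    mean, sd = x.mean(), x.std(ddof=1)

    lower = (WEIGHTS["mad"] * (med - 1.5 * mad)
             + WEIGHTS["iqr"] * (q25 - 1.0 * iqr)
             + WEIGHTS["sd"] * (mean - 5.0 * sd))
    upper = (WEIGHTS["mad"] * (med + 1.5 * mad)
             + WEIGHTS["iqr"] * (q75 + 1.0 * iqr)
             + WEIGHTS["sd"] * (mean + 5.0 * sd))
    return lower, upper


def flag_outliers_sketch(x, mad_scale=1.0):
    """Boolean mask of values outside the sketched bounds."""
    x = np.asarray(x, dtype=float)
    lower, upper = miss_bounds_sketch(x, mad_scale)
    # Comparisons with NaN evaluate to False, so NaNs are never flagged.
    return (x < lower) | (x > upper)


x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], dtype=float)
lower, upper = miss_bounds_sketch(x)
mask = flag_outliers_sketch(x)
print(x[mask])  # only the value 100.0 is flagged for this sample
```

Because the MAD term carries most of the weight, the single extreme value inflates the mean and SD but moves the composite bounds comparatively little, which is the behaviour the weighting is meant to achieve.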
Citation
If you use this package in your research, please cite:
Pech, G., Vaccaro, N., Caspar, E. A., Amerio, P., Cleeremans, A., Leys, C., & Ley, C. (2026). How not to MISS an outlier: comparing three classic univariate methods and introducing a new one, the MAD–IQR–SD Simultaneous (MISS). PsyArXiv. https://doi.org/10.31234/osf.io/2r9yw_v2
License
MIT © Guillaume Pech