Outlier detection using the MISS (MAD-IQR-SD Simultaneous) method

🎯 missoutlier

Outlier Detection Using the MISS Method

A weighted composite of MAD, IQR, and SD for robust univariate outlier detection


Overview

missoutlier implements the MISS (MAD–IQR–SD Simultaneous) method, a new approach for univariate outlier detection that combines three classical techniques into a single robust threshold:

| Method | Bounds | Weight |
|---|---|---|
| MAD (Median Absolute Deviation) | median ± 1.5 × MAD | 87.8% |
| IQR (Interquartile Range) | Q25/Q75 ± 2 × IQR | 1.2% |
| SD (Standard Deviation) | mean ± 5 × SD | 11.0% |

The composite threshold is computed as:

$$\text{MISS} = 0.878 \times \text{MAD} + 0.012 \times \text{IQR} + 0.11 \times \text{SD}$$

By weighting the robust MAD most heavily while retaining some sensitivity from IQR and SD, MISS aims to handle skewed and heavy-tailed distributions better than any of the three methods used alone.
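Read together with the table, one plausible interpretation of the composite formula is that the weights apply to each method's lower and upper bound. This expansion is an assumption made for illustration and is not stated explicitly in this description; here $\tilde{x}$ denotes the median, $\bar{x}$ the mean, and $Q_{25}$, $Q_{75}$ the first and third quartiles:

$$\begin{aligned}
L_{\text{MISS}} &= 0.878\,(\tilde{x} - 1.5\,\text{MAD}) + 0.012\,(Q_{25} - 2\,\text{IQR}) + 0.11\,(\bar{x} - 5\,\text{SD}) \\
U_{\text{MISS}} &= 0.878\,(\tilde{x} + 1.5\,\text{MAD}) + 0.012\,(Q_{75} + 2\,\text{IQR}) + 0.11\,(\bar{x} + 5\,\text{SD})
\end{aligned}$$

Under this reading, a value $x_i$ would be flagged as an outlier when $x_i < L_{\text{MISS}}$ or $x_i > U_{\text{MISS}}$.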


Installation

# Install from GitHub
pip install git+https://github.com/GuillaumePech/missOutlierPy.git

Dependencies: numpy >= 1.20, scipy >= 1.7


Quick Start

import numpy as np
from missoutlier import detect_outliers_miss

# Generate data with outliers
x = np.concatenate([np.random.randn(100), [50, -40]])

# Default: replace outliers with NaN
x_clean = detect_outliers_miss(x)
# Detected 2 outliers (1.96% of data) using MISS method.

# Drop outliers entirely
x_dropped = detect_outliers_miss(x, drop=True)
# Detected 2 outliers (1.96% of data) using MISS method.

# Handle existing NaNs
x_na = np.concatenate([np.random.randn(100), [np.nan, 50]])
x_clean = detect_outliers_miss(x_na, na_rm=True)

# Silent mode (no messages)
x_clean = detect_outliers_miss(x, silent=True)
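Because the default call returns the data with outliers replaced by NaN, the flagged positions can be recovered by comparing the cleaned array with the original. The following is a minimal sketch, assuming the output has the same length and order as the input when `drop=False`. `flagged_positions` is a hypothetical helper, not part of missoutlier, and the cleaned array is written by hand to stand in for the function's output.

```python
import numpy as np

def flagged_positions(original, cleaned):
    """Hypothetical helper: indices that became NaN during cleaning."""
    original = np.asarray(original, dtype=float)
    cleaned = np.asarray(cleaned, dtype=float)
    if original.shape != cleaned.shape:
        raise ValueError("arrays must have the same shape (use drop=False)")
    # NaNs that were already present in the input are not counted as outliers.
    newly_nan = np.isnan(cleaned) & ~np.isnan(original)
    return np.flatnonzero(newly_nan)

# Hand-written stand-in for the output of a NaN-replacing call.
x = np.array([0.1, -0.3, np.nan, 50.0, 0.2])
x_clean = np.array([0.1, -0.3, np.nan, np.nan, 0.2])

idx = flagged_positions(x, x_clean)
print(idx)  # [3]
```

Keeping the original array around makes it possible to report or inspect the removed values, for example with `x[idx]`.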

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| data | array-like | (required) | Input data (must be one-dimensional) |
| drop | bool | False | If True, removes outliers. If False, replaces them with NaN |
| na_rm | bool | False | If True, ignores NaN values when computing thresholds |
| silent | bool | False | If True, suppresses the detection message |
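The `na_rm` option matters because NumPy's standard reductions propagate NaN: a single missing value makes the median and mean undefined, and with them any threshold built on those statistics. The sketch below illustrates this with plain NumPy only. It does not call missoutlier, and how the package itself behaves when `na_rm=False` and NaNs are present is not documented here.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, np.nan, 4.0])

# Plain reductions propagate the NaN, so the statistics are undefined.
print(np.isnan(np.median(x)), np.isnan(np.mean(x)))  # True True

# NaN-aware reductions skip it, which is the idea behind na_rm=True.
print(np.nanmedian(x), np.nanmean(x))  # 2.5 2.5
```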

How It Works

                ┌──────────────┐
                │  Input Data  │
                └──────┬───────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
     ┌─────────┐ ┌─────────┐ ┌─────────┐
     │   MAD   │ │   IQR   │ │   SD    │
     │  ×0.878 │ │  ×0.012 │ │  ×0.11  │
     └────┬────┘ └────┬────┘ └────┬────┘
          │            │            │
          └────────────┼────────────┘
                       ▼
              ┌────────────────┐
              │ MISS Threshold │
              └────────┬───────┘
                       ▼
              ┌────────────────┐
              │ Flag Outliers  │
              └────────────────┘
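
The flow in the diagram can be sketched in plain NumPy. This is an illustration under stated assumptions rather than the package's source: it assumes the three weights apply to each method's lower and upper bounds, that MAD is the raw median absolute deviation without a consistency constant, and that SD is the sample standard deviation. `miss_sketch` is a hypothetical name.

```python
import numpy as np

# Weights for MAD, IQR and SD, as listed in the Overview.
WEIGHTS = np.array([0.878, 0.012, 0.11])

def miss_sketch(data, drop=False):
    """Hypothetical sketch of the diagram; not the missoutlier source."""
    x = np.asarray(data, dtype=float)
    if x.ndim != 1:
        raise ValueError("data must be one-dimensional")

    # Step 1: per-method statistics (NaN-aware, comparable to na_rm=True).
    med = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - med))             # raw MAD (assumption)
    q25, q75 = np.nanpercentile(x, [25, 75])
    iqr = q75 - q25
    mean, sd = np.nanmean(x), np.nanstd(x, ddof=1)  # sample SD (assumption)

    # Step 2: bounds of each method, using the multipliers from the Overview.
    lower = np.array([med - 1.5 * mad, q25 - 2 * iqr, mean - 5 * sd])
    upper = np.array([med + 1.5 * mad, q75 + 2 * iqr, mean + 5 * sd])

    # Step 3: weighted composite threshold (assumed combination rule).
    lo, hi = WEIGHTS @ lower, WEIGHTS @ upper

    # Step 4: flag values outside the composite bounds; NaN compares False.
    with np.errstate(invalid="ignore"):
        mask = (x < lo) | (x > hi)
    return x[~mask] if drop else np.where(mask, np.nan, x)

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(miss_sketch(x))             # outliers replaced with NaN
print(miss_sketch(x, drop=True))  # outliers removed
```

On this toy input only the value 100 falls outside the composite bounds. Note how the wide SD bound, inflated by the outlier itself, contributes little because of its small weight.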

Also available in R

devtools::install_github("GuillaumePech/missOutlierR")

Citation

If you use this package in your research, please cite:

Pech, G., Vaccaro, N., Caspar, E. A., Amerio, P., Cleeremans, A., Leys, C., & Ley, C. (2026). How not to MISS an outlier: comparing three classic univariate methods and introducing a new one, the MAD–IQR–SD Simultaneous (MISS). PsyArXiv. https://doi.org/10.31234/osf.io/2r9yw_v2

@article{pech2026miss,
  title={How not to {MISS} an outlier: comparing three classic univariate methods and introducing a new one, the {MAD--IQR--SD} Simultaneous ({MISS})},
  author={Pech, Guillaume and Vaccaro, Niccol{\`o} and Caspar, Emilie A. and Amerio, Pietro and Cleeremans, Axel and Leys, Christophe and Ley, Christophe},
  year={2026},
  journal={PsyArXiv},
  doi={10.31234/osf.io/2r9yw_v2}
}

License

MIT © Guillaume Pech
