Outlier detection using the MISS (MAD-IQR-SD Simultaneous) method

🎯 missoutlier

Outlier Detection Using the MISS Method

A weighted composite of MAD, IQR, and SD for robust univariate outlier detection


Overview

missoutlier implements the MISS (MAD–IQR–SD Simultaneous) method, a new approach for univariate outlier detection that combines three classical techniques into a single robust threshold:

Method                            Bounds               Weight
MAD (Median Absolute Deviation)   median ± 1.5 × MAD   87.8%
IQR (Interquartile Range)         Q25/Q75 ± 2 × IQR    1.2%
SD (Standard Deviation)           mean ± 5 × SD        11.0%

The composite threshold is computed as:

$$\text{MISS} = 0.878 \times \text{MAD} + 0.012 \times \text{IQR} + 0.11 \times \text{SD}$$

By weighting the robust MAD most heavily while retaining some sensitivity from IQR and SD, MISS is designed to handle skewed and heavy-tailed distributions better than any of the three methods on its own.
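
The weighting above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the package's actual source: it assumes the formula is applied to each method's lower and upper bounds separately, that MAD is the unscaled median absolute deviation, that quartiles use NumPy's default linear interpolation, and that SD is the sample standard deviation (`ddof=1`). The name `miss_bounds_sketch` is hypothetical.

```python
import numpy as np
from scipy import stats


def miss_bounds_sketch(x):
    """Hypothetical helper: weighted average of the three methods' bounds."""
    x = np.asarray(x, dtype=float)

    med = np.median(x)
    mad = stats.median_abs_deviation(x)    # unscaled MAD (scale=1.0): an assumption
    q25, q75 = np.percentile(x, [25, 75])  # linear interpolation: an assumption
    iqr = q75 - q25
    mean, sd = np.mean(x), np.std(x, ddof=1)  # sample SD: an assumption

    # Per-method bounds, in the order MAD, IQR, SD (multipliers from the table).
    lower = np.array([med - 1.5 * mad, q25 - 2 * iqr, mean - 5 * sd])
    upper = np.array([med + 1.5 * mad, q75 + 2 * iqr, mean + 5 * sd])

    # Weights from the formula above; they sum to 1.
    weights = np.array([0.878, 0.012, 0.11])
    return float(weights @ lower), float(weights @ upper)


# Usage: flag values outside the composite bounds.
x = np.concatenate([np.random.default_rng(0).normal(size=100), [50, -40]])
lower, upper = miss_bounds_sketch(x)
outliers = x[(x < lower) | (x > upper)]
```

Because the weights sum to 1, the composite bounds always lie between the tightest and the loosest of the three individual bounds. The package's own thresholds may differ if any of the assumptions above do not match its implementation.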


Installation

# Install from PyPI
pip install missoutlier

Or install directly from GitHub:

pip install git+https://github.com/GuillaumePech/missOutlierPy.git

Dependencies: numpy >= 1.20, scipy >= 1.7


Quick Start

import numpy as np
from missoutlier import detect_outliers_miss

# Generate data with outliers
x = np.concatenate([np.random.randn(100), [50, -40]])

# Default: replace outliers with NaN
x_clean = detect_outliers_miss(x)
# Prints a message such as: Detected 2 outliers (1.96% of data) using MISS method.

# Drop outliers entirely
x_dropped = detect_outliers_miss(x, drop=True)
# Prints a message such as: Detected 2 outliers (1.96% of data) using MISS method.

# Handle existing NaNs
x_na = np.concatenate([np.random.randn(100), [np.nan, 50]])
x_clean = detect_outliers_miss(x_na, na_rm=True)

# Silent mode (no messages)
x_clean = detect_outliers_miss(x, silent=True)
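
The two output modes shown above differ in shape, which matters for downstream code. The sketch below does not call the package; it uses a hand-written boolean mask (`flagged`, a stand-in for the detector's decision) to show what NaN replacement and dropping each imply in plain NumPy.

```python
import numpy as np

x = np.array([0.2, -1.1, 50.0, 0.7, -40.0])
# Stand-in for the detector's decision; not produced by missoutlier here.
flagged = np.array([False, False, True, False, True])

# NaN replacement keeps the original length, so positions stay aligned
# with any paired variables (trial numbers, participant IDs, ...).
x_nan = np.where(flagged, np.nan, x)

# Dropping returns a shorter array; the original positions are lost.
x_drop = x[~flagged]

# NaN-marked data needs NaN-aware reductions; dropped data does not.
mean_from_nan = np.nanmean(x_nan)
mean_from_drop = x_drop.mean()
```

Both means agree; the choice between the two modes is mainly about whether the cleaned values must stay aligned with other arrays.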

Parameters

Parameter Type Default Description
data        array-like   (required)   Input data (must be one-dimensional)
drop        bool         False        If True, removes outliers; if False, replaces them with NaN
na_rm       bool         False        If True, ignores NaN values when computing thresholds
silent      bool         False        If True, suppresses the detection message
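
To see why an option like na_rm matters, note that ordinary NumPy reductions propagate NaN, so any threshold built from them would itself be NaN. The snippet below illustrates that general behavior only; how missoutlier handles NaN input internally when na_rm is False is not documented here, so treat this as background rather than a description of the package.

```python
import numpy as np

x_na = np.array([1.0, 2.0, np.nan, 3.0, 4.0])

# Plain reductions propagate NaN: a single missing value poisons the statistic.
plain_median = np.median(x_na)
plain_mean = np.mean(x_na)

# NaN-aware reductions skip missing values, which is the behavior
# "ignores NaN values when computing thresholds" describes.
nan_aware_median = np.nanmedian(x_na)
nan_aware_sd = np.nanstd(x_na, ddof=1)
```

Here `plain_median` and `plain_mean` are NaN, while the NaN-aware versions are computed from the four observed values.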

How It Works

                ┌──────────────┐
                │  Input Data  │
                └──────┬───────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
     ┌─────────┐ ┌─────────┐ ┌─────────┐
     │   MAD   │ │   IQR   │ │   SD    │
     │  ×0.878 │ │  ×0.012 │ │  ×0.11  │
     └────┬────┘ └────┬────┘ └────┬────┘
          │            │            │
          └────────────┼────────────┘
                       ▼
              ┌────────────────┐
              │ MISS Threshold │
              └────────┬───────┘
                       ▼
              ┌────────────────┐
              │ Flag Outliers  │
              └────────────────┘

Also available in R

devtools::install_github("GuillaumePech/missOutlierR")

Citation

If you use this package in your research, please cite:

Pech, G., Vaccaro, N., Caspar, E. A., Amerio, P., Cleeremans, A., Leys, C., & Ley, C. (2026). How not to MISS an outlier: comparing three classic univariate methods and introducing a new one, the MAD–IQR–SD Simultaneous (MISS). PsyArXiv. https://doi.org/10.31234/osf.io/2r9yw_v2

@article{pech2026miss,
  title={How not to {MISS} an outlier: comparing three classic univariate methods and introducing a new one, the {MAD--IQR--SD} Simultaneous ({MISS})},
  author={Pech, Guillaume and Vaccaro, Niccol{\`o} and Caspar, Emilie A. and Amerio, Pietro and Cleeremans, Axel and Leys, Christophe and Ley, Christophe},
  year={2026},
  journal={PsyArXiv},
  doi={10.31234/osf.io/2r9yw_v2}
}

License

MIT © Guillaume Pech
