
Outlier detection using the MISS (MAD-IQR-SD Simultaneous) method


🎯 missoutlier

Outlier Detection Using the MISS Method

A weighted composite of MAD, IQR, and SD for robust univariate outlier detection



Overview

missoutlier implements the MISS (MAD–IQR–SD Simultaneous) method, a new approach for univariate outlier detection that combines three classical techniques into a single robust threshold:

| Method | Bounds | Weight |
|---|---|---|
| MAD (Median Absolute Deviation) | median ± 1.5 × MAD | 87.8% |
| IQR (Interquartile Range) | Q25/Q75 ± 1 × IQR | 1.2% |
| SD (Standard Deviation) | mean ± 5 × SD | 11.0% |

The composite threshold is computed as:

$$\text{MISS} = 0.878 \times \text{MAD} + 0.012 \times \text{IQR} + 0.11 \times \text{SD}$$

By weighting the robust MAD heavily while retaining some sensitivity from IQR and SD, MISS aims to handle skewed and heavy-tailed distributions better than any of the three methods used alone.
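One way to read the formula above, offered as an assumption and not as a statement of the package's internals, is that the weights apply to each method's bound on each side. Write $U_{\text{MAD}} = \text{median} + 1.5 \times \text{MAD}$, $U_{\text{IQR}} = Q_{75} + 1 \times \text{IQR}$, and $U_{\text{SD}} = \text{mean} + 5 \times \text{SD}$ for the upper bounds from the table, with $L_{\text{MAD}}$, $L_{\text{IQR}}$, and $L_{\text{SD}}$ as the corresponding lower bounds. Then:

$$
\begin{aligned}
U_{\text{MISS}} &= 0.878\,U_{\text{MAD}} + 0.012\,U_{\text{IQR}} + 0.11\,U_{\text{SD}},\\
L_{\text{MISS}} &= 0.878\,L_{\text{MAD}} + 0.012\,L_{\text{IQR}} + 0.11\,L_{\text{SD}},\\
0.878 + 0.012 + 0.11 &= 1.
\end{aligned}
$$

Because the weights are non-negative and sum to 1, each MISS bound under this reading is a convex combination of the three individual bounds: it always lies between the smallest and the largest of them, and the 0.878 weight keeps it close to the MAD bound.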


Installation

# Install from PyPI
pip install missoutlier

Or install directly from GitHub:

pip install git+https://github.com/GuillaumePech/missOutlierPy.git

Dependencies: numpy >= 1.20, scipy >= 1.7


Quick Start

import numpy as np
from missoutlier import detect_outliers_miss

# Generate data with outliers
x = np.concatenate([np.random.randn(100), [50, -40]])

# Default: replace outliers with NaN
x_clean = detect_outliers_miss(x)
# Detected 2 outliers (1.96% of data) using MISS method.

# Drop outliers entirely
x_dropped = detect_outliers_miss(x, drop=True)
# Detected 2 outliers (1.96% of data) using MISS method.

# Handle existing NaNs
x_na = np.concatenate([np.random.randn(100), [np.nan, 50]])
x_clean = detect_outliers_miss(x_na, na_rm=True)

# Silent mode (no messages)
x_clean = detect_outliers_miss(x, silent=True)

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| data | array-like | (required) | Input data (must be one-dimensional) |
| drop | bool | False | If True, removes outliers. If False, replaces them with NaN |
| na_rm | bool | False | If True, ignores NaN values when computing thresholds |
| silent | bool | False | If True, suppresses the detection message |
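The difference between the two `drop` modes, and the reason `na_rm` matters, can be illustrated with plain NumPy. This is a minimal sketch of the documented behaviour only: `apply_mask_sketch` and the hand-written mask are hypothetical and are not part of missoutlier.

```python
import numpy as np

def apply_mask_sketch(x, is_outlier, drop=False):
    """Hypothetical helper mirroring the documented `drop` behaviour."""
    x = np.asarray(x, dtype=float)
    is_outlier = np.asarray(is_outlier, dtype=bool)
    if drop:
        return x[~is_outlier]      # shorter array, outliers removed
    out = x.copy()
    out[is_outlier] = np.nan       # same length, outliers replaced with NaN
    return out

x = np.array([0.1, -0.3, 50.0, 0.2])
mask = np.array([False, False, True, False])  # pretend 50.0 was flagged

masked = apply_mask_sketch(x, mask)              # length stays 4
dropped = apply_mask_sketch(x, mask, drop=True)  # length becomes 3

# Why a NaN-aware option is needed: ordinary statistics propagate NaN,
# while the nan-prefixed NumPy functions skip missing values.
with_nan = np.array([1.0, np.nan, 3.0])
plain_median = np.median(with_nan)       # NaN
nan_aware_median = np.nanmedian(with_nan)  # median of the non-missing values
```

Keeping the default (NaN replacement) preserves the array length, which helps when the values must stay aligned with other columns; `drop=True` is more convenient when only the cleaned values are needed.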

How It Works

                ┌──────────────┐
                │  Input Data  │
                └──────┬───────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
     ┌─────────┐ ┌─────────┐ ┌─────────┐
     │ 1.5 MAD │ │  1 IQR  │ │  5 SD   │
     │  ×0.878 │ │  ×0.012 │ │  ×0.11  │
     └────┬────┘ └────┬────┘ └────┬────┘
          │            │            │
          └────────────┼────────────┘
                       ▼
              ┌────────────────┐
              │ MISS Threshold │
              └────────┬───────┘
                       ▼
              ┌────────────────┐
              │ Flag Outliers  │
              └────────────────┘
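The three stages in the diagram can be sketched in a few lines of NumPy. This is an illustrative reading, not the package's source: the function name `miss_flags_sketch`, the per-side weighted combination of bounds, the 1.4826 scaling of the MAD, and the use of the sample standard deviation (`ddof=1`) are all assumptions.

```python
import numpy as np

# Multipliers and weights copied from the diagram above.
MAD_K, MAD_W = 1.5, 0.878
IQR_K, IQR_W = 1.0, 0.012
SD_K, SD_W = 5.0, 0.11

def miss_flags_sketch(x, mad_scale=1.4826):
    """Hypothetical sketch: return (boolean outlier mask, (lower, upper))."""
    x = np.asarray(x, dtype=float)
    if x.ndim != 1:
        raise ValueError("expected one-dimensional data")

    # Stage 1: bounds from each classical method.
    med = np.median(x)
    mad = mad_scale * np.median(np.abs(x - med))  # scaling is an assumption
    q25, q75 = np.percentile(x, [25, 75])
    iqr = q75 - q25
    mean, sd = np.mean(x), np.std(x, ddof=1)      # ddof=1 is an assumption

    lowers = np.array([med - MAD_K * mad, q25 - IQR_K * iqr, mean - SD_K * sd])
    uppers = np.array([med + MAD_K * mad, q75 + IQR_K * iqr, mean + SD_K * sd])
    weights = np.array([MAD_W, IQR_W, SD_W])

    # Stage 2: weighted composite threshold on each side.
    lower = float(weights @ lowers)
    upper = float(weights @ uppers)

    # Stage 3: flag everything outside the composite interval.
    return (x < lower) | (x > upper), (lower, upper)

rng = np.random.default_rng(0)
x = np.concatenate([rng.standard_normal(100), [50.0, -40.0]])
flags, (lo, hi) = miss_flags_sketch(x)
print(f"{flags.sum()} flagged, interval = ({lo:.2f}, {hi:.2f})")
```

For real analyses, use `detect_outliers_miss` from the package itself; the sketch only makes the flow of the diagram concrete and may differ from the published method in its details.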

Citation

If you use this package in your research, please cite:

Pech, G., Vaccaro, N., Caspar, E. A., Amerio, P., Cleeremans, A., Leys, C., & Ley, C. (2026). How not to MISS an outlier: comparing three classic univariate methods and introducing a new one, the MAD–IQR–SD Simultaneous (MISS). PsyArXiv. https://doi.org/10.31234/osf.io/2r9yw_v2


License

MIT © Guillaume Pech

