Minimal tool for outlier detection on small sample sets

Project description

Outlier Detector toolkit

Build Status codecov License: MIT Code style: black

This project provides a set of tools for outlier detection: they mark or filter out samples as they arrive in your Python analysis code.

Most of the tools rely on the two-tailed Dixon's Q-test (https://en.wikipedia.org/wiki/Dixon%27s_Q_test).
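
As a rough illustration of the test named above, the sketch below computes the Q statistic for both tails: the gap between an extreme value and its nearest neighbour, divided by the range of the sample. The helper names `dixon_q` and `exceeds_critical` are hypothetical and are not part of outlier_detector. The critical value is left as a parameter because it depends on the sample size and confidence level and should be taken from a published table.

```python
def dixon_q(samples):
    """Return (q_low, q_high), the Q statistics of the smallest and largest value.

    Hypothetical helper for illustration; not the library's implementation.
    """
    ordered = sorted(samples)
    if len(ordered) < 3:
        raise ValueError("Dixon's Q-test needs at least 3 samples")
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        # All values are identical, so no value stands out.
        return 0.0, 0.0
    q_low = (ordered[1] - ordered[0]) / data_range
    q_high = (ordered[-1] - ordered[-2]) / data_range
    return q_low, q_high


def exceeds_critical(samples, q_critical):
    """Two-tailed check: True if either extreme has Q above q_critical."""
    q_low, q_high = dixon_q(samples)
    return max(q_low, q_high) > q_critical
```

The test is meant for small samples, which is why the sketch only looks at the two extremes rather than scoring every value.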

TL;DR

I have a sample and a known data distribution: is the sample an outlier?
sample = 2.7
distribution = [0.1, 1.1, 4.78, 2.0, 7.2, 5.3]

from outlier_detector.functions import is_outlier
print(is_outlier(distribution, sample))
I have a distribution and I iterate over it: is the n-th sample an outlier?
distribution = [0.1, 1.1, 4.78, 2.0, 7.2, 5.3, 8.1, -4.1, 5.4]
from outlier_detector.detectors import OutlierDetector
od = OutlierDetector(buffer_samples=5)
for x in distribution:
    print(od.is_outlier(x))
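
To make the idea of a detector with an internal buffer more concrete, here is a minimal sketch of how such an object could work. It is an assumption about the general pattern, not the actual `OutlierDetector` code: the class name `RollingQDetector`, the `q_critical` parameter, and the choice to keep flagged samples in the buffer are all hypothetical.

```python
from collections import deque


class RollingQDetector:
    """Hypothetical rolling-buffer detector; not the library's OutlierDetector."""

    def __init__(self, buffer_samples, q_critical):
        # Only the most recent `buffer_samples` values are kept.
        self.buffer = deque(maxlen=buffer_samples)
        self.q_critical = q_critical

    def is_outlier(self, sample):
        # Assumption: every sample enters the buffer, even if it gets flagged.
        self.buffer.append(sample)
        window = sorted(self.buffer)
        if len(window) < 3:
            return False  # too few samples for a Q-test
        data_range = window[-1] - window[0]
        if data_range == 0:
            return False
        # Only an extreme value of the window can be flagged.
        if sample == window[0]:
            gap = window[1] - window[0]
        elif sample == window[-1]:
            gap = window[-1] - window[-2]
        else:
            return False
        return gap / data_range > self.q_critical
```

A bounded `deque` keeps memory use constant and lets old samples drop out automatically, which suits data that arrives one value at a time.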
I have a generating object from which I pop samples and I want only valid samples, rejecting outliers.
distribution = [0.1, 1.1, 4.78, 2.0, 7.2, 5.3, 8.1, -14.1, 5.4]
from outlier_detector.filters import filter_outlier

class MyGen:
    def __init__(self):
        self.cursor = -1

    @filter_outlier()
    def pop(self):
        self.cursor += 1
        return distribution[self.cursor]

g = MyGen()
while True:
    try:
        r = g.pop()
        print(r)
    except IndexError:
        # The source is exhausted: stop instead of looping forever.
        print('No more data')
        break
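
One plausible way to build a filtering decorator of this kind is sketched below. It is not the implementation of `filter_outlier`: the name `reject_outliers` and the idea of passing in a predicate are assumptions made for illustration. The wrapper keeps calling the decorated function until it returns a value the predicate accepts, and any exception raised by the function (for example when the source runs out) propagates to the caller.

```python
import functools


def reject_outliers(is_outlier):
    """Hypothetical decorator factory: retry until a non-outlier is returned."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            while True:
                value = func(*args, **kwargs)
                if not is_outlier(value):
                    return value
                # Rejected value: ask the wrapped function for the next one.
        return wrapper
    return decorator


# Illustrative source and predicate (both made up for this sketch).
data = iter([1.0, 2.0, -14.1, 3.0])


@reject_outliers(lambda v: abs(v) > 10)
def pop():
    return next(data)
```

Keeping the predicate separate from the retry loop means the same decorator could wrap a fixed threshold, as here, or a stateful detector object.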

Documentation

The toolkit is organized so that you can use one of the following patterns as easily as possible: functions for static analysis, detectors for objects with internal buffers, and filters for decorators.

For documentation, see the doc file.

Download files

Source Distribution

outlier_detector-0.0.1.tar.gz (8.3 kB)

Uploaded Source

Built Distribution

outlier_detector-0.0.1-py3-none-any.whl (12.5 kB)

Uploaded Python 3
