
Outlier identifiers functions package.


OutlierIdentifiers

In brief

This is a Python package of 1D outlier identifier functions. It closely follows the Wolfram Language (WL) paclet [AAp1], the R package [AAp2], and the Raku package [AAp3].


Installation

From PyPI.org:

python3 -m pip install OutlierIdentifiers

From GitHub:

python3 -m pip install git+https://github.com/antononcube/Python-packages.git#egg=OutlierIdentifiers\&subdirectory=OutlierIdentifiers

Usage examples

Load packages:

import numpy as np
import plotly.graph_objects as go

from OutlierIdentifiers import *

Generate a vector with random numbers:

np.random.seed(14)
vec = np.random.normal(loc=10, scale=20, size=30)
print(vec)
[ 41.02678223  11.58372049  13.47953057   8.55326868 -30.086588
  12.89355626 -20.02337245  14.22218902  -1.16410111  31.6905813
   6.27421752  10.2932275  -11.51138939  22.84504148   6.39326577
  22.40600507  26.21948669  25.55871733   5.25020644 -27.83824691
 -13.44243588  26.72413943  30.18546801  35.86198722  -0.98662331
  -9.6342573   28.29345516  27.46140757  10.44222283   9.91712833]

Plot the vector:

# Create a scatter plot with markers
fig = go.Figure(data=go.Scatter(y=vec, mode='markers'))

# Add labels and title
fig.update_layout(title='Vector of Numbers', xaxis_title='Index', yaxis_title='Value', template = "plotly_dark")

# Display the plot
fig.show()

Find outlier positions:

outlier_identifier(vec, identifier=hampel_identifier_parameters)
array([ True, False, False, False,  True, False,  True, False, False,
        True, False, False,  True, False, False, False, False, False,
       False,  True,  True, False, False,  True, False,  True, False,
       False, False, False])

Find outlier values:

outlier_identifier(vec, identifier=hampel_identifier_parameters, value = True)
array([ 41.02678223, -30.086588  , -20.02337245,  31.6905813 ,
       -11.51138939, -27.83824691, -13.44243588,  35.86198722,
        -9.6342573 ])
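
The mask and the values above are consistent with a simple rule: flag every point that falls outside a (lower, upper) interval. Here is a minimal sketch of that rule, assuming the interval is median ± 1.4826 × MAD (median absolute deviation). The formula is inferred from the Hampel threshold pair printed further down in this README, not taken from the package source, and the names `hampel_thresholds` and `outlier_mask` are hypothetical. The sketch uses only the standard library, with the vector copied from the printed output.

```python
from statistics import median

# The 30 values printed by print(vec) above, rounded as shown there.
vec = [
    41.02678223, 11.58372049, 13.47953057, 8.55326868, -30.086588,
    12.89355626, -20.02337245, 14.22218902, -1.16410111, 31.6905813,
    6.27421752, 10.2932275, -11.51138939, 22.84504148, 6.39326577,
    22.40600507, 26.21948669, 25.55871733, 5.25020644, -27.83824691,
    -13.44243588, 26.72413943, 30.18546801, 35.86198722, -0.98662331,
    -9.6342573, 28.29345516, 27.46140757, 10.44222283, 9.91712833,
]

def hampel_thresholds(values):
    """Return (lower, upper) as median -/+ 1.4826 * MAD (assumed formula)."""
    center = median(values)
    mad = median([abs(x - center) for x in values])
    half_width = 1.4826 * mad
    return center - half_width, center + half_width

def outlier_mask(values, thresholds):
    """Return one boolean per value: True when it lies outside the interval."""
    lower, upper = thresholds
    return [x < lower or x > upper for x in values]

lower, upper = hampel_thresholds(vec)
mask = outlier_mask(vec, (lower, upper))
positions = [i for i, flagged in enumerate(mask) if flagged]
outliers = [x for x, flagged in zip(vec, mask) if flagged]

print(lower, upper)
print(positions)
print(outliers)
```

With this vector the sketch selects the same nine values as the package call above.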

Find top outlier positions and values:

outlier_identifier(vec, identifier = lambda v: top_outliers(hampel_identifier_parameters(v)))
array([ True, False, False, False, False, False, False, False, False,
        True, False, False, False, False, False, False, False, False,
       False, False, False, False, False,  True, False, False, False,
       False, False, False])
outlier_identifier(vec, identifier = lambda v: top_outliers(hampel_identifier_parameters(v)), value=True)
array([41.02678223, 31.6905813 , 35.86198722])

Find bottom outlier positions and values (using quartiles-based identifier):

outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)))
array([False, False, False, False,  True, False,  True, False, False,
       False, False, False, False, False, False, False, False, False,
       False,  True, False, False, False, False, False, False, False,
       False, False, False])
outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)), value=True)
array([-30.086588  , -20.02337245, -27.83824691])
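
The top and bottom results above behave as if one side of the threshold interval were switched off. The following sketch shows that idea under stated assumptions: how `top_outliers` and `bottom_outliers` are implemented is not shown in this README, so the helpers `upper_only`, `lower_only`, and `values_outside` are hypothetical stand-ins. The two threshold pairs are copied from the output printed in this README for the Hampel and quartile identifiers.

```python
import math

# The 30 values printed by print(vec) above, rounded as shown there.
vec = [
    41.02678223, 11.58372049, 13.47953057, 8.55326868, -30.086588,
    12.89355626, -20.02337245, 14.22218902, -1.16410111, 31.6905813,
    6.27421752, 10.2932275, -11.51138939, 22.84504148, 6.39326577,
    22.40600507, 26.21948669, 25.55871733, 5.25020644, -27.83824691,
    -13.44243588, 26.72413943, 30.18546801, 35.86198722, -0.98662331,
    -9.6342573, 28.29345516, 27.46140757, 10.44222283, 9.91712833,
]

# Threshold pairs as printed in this README (not recomputed here).
hampel = (-8.796653643076334, 30.822596969354976)
quartile = (-14.46873856125025, 36.49468188752889)

def upper_only(thresholds):
    """Drop the lower bound so that only unusually large values are flagged."""
    _, upper = thresholds
    return -math.inf, upper

def lower_only(thresholds):
    """Drop the upper bound so that only unusually small values are flagged."""
    lower, _ = thresholds
    return lower, math.inf

def values_outside(values, thresholds):
    """Return the values that lie outside the (lower, upper) interval."""
    lower, upper = thresholds
    return [x for x in values if x < lower or x > upper]

top = values_outside(vec, upper_only(hampel))
bottom = values_outside(vec, lower_only(quartile))

print(top)
print(bottom)
```

For this vector the sketch returns the same three top values and three bottom values as the package calls above.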

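Here is another way to get the outlier values, by indexing the vector with the boolean mask:

pred = outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)))
vec[pred]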
array([-30.086588  , -20.02337245, -27.83824691])

The available outlier parameter functions, each of which returns a (lower, upper) threshold pair, are:

  • hampel_identifier_parameters
  • splus_quartile_identifier_parameters
  • quartile_identifier_parameters

Apply each of them to the vector:

[f(vec) for f in (hampel_identifier_parameters, splus_quartile_identifier_parameters, quartile_identifier_parameters)]
[(-8.796653643076334, 30.822596969354976),
 (-37.649981209714, 64.27685968784428),
 (-14.46873856125025, 36.49468188752889)]
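
The two quartile-based pairs above can be reproduced by hand. This is a sketch under assumptions, not the package's code: it takes the quartile identifier to be median ± IQR and the S-Plus variant to be (Q1 - 1.5 × IQR, Q3 + 1.5 × IQR), with linearly interpolated quartiles. Both formulas are inferred from the printed numbers, and the function names are hypothetical.

```python
from statistics import median, quantiles

# The 30 values printed by print(vec) above, rounded as shown there.
vec = [
    41.02678223, 11.58372049, 13.47953057, 8.55326868, -30.086588,
    12.89355626, -20.02337245, 14.22218902, -1.16410111, 31.6905813,
    6.27421752, 10.2932275, -11.51138939, 22.84504148, 6.39326577,
    22.40600507, 26.21948669, 25.55871733, 5.25020644, -27.83824691,
    -13.44243588, 26.72413943, 30.18546801, 35.86198722, -0.98662331,
    -9.6342573, 28.29345516, 27.46140757, 10.44222283, 9.91712833,
]

def quartiles(values):
    """Return (Q1, Q3) with linear interpolation between data points."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3

def quartile_thresholds(values):
    """Assumed formula: median -/+ interquartile range."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    center = median(values)
    return center - iqr, center + iqr

def splus_quartile_thresholds(values):
    """Assumed formula: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

quartile_pair = quartile_thresholds(vec)
splus_pair = splus_quartile_thresholds(vec)

print(quartile_pair)
print(splus_pair)
```

The S-Plus interval is the widest of the three, which is why it flags no point in this vector, while the Hampel interval is the narrowest.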

References

[AA1] Anton Antonov, "Outlier detection in a list of numbers", (2013), MathematicaForPrediction at WordPress.

[AAp1] Anton Antonov, OutlierIdentifiers WL paclet, (2023), Wolfram Language Paclet Repository.

[AAp2] Anton Antonov, OutlierIdentifiers R package, (2019), R-packages at GitHub/antononcube.

[AAp3] Anton Antonov, OutlierIdentifiers Raku package, (2022), GitHub/antononcube.
