
Outlier identifiers functions package.


OutlierIdentifiers

In brief

This is a Python package of 1D outlier identifier functions. It closely follows the Wolfram Language (WL) paclet [AAp1], the R package [AAp2], and the Raku package [AAp3].


Installation

From PyPI.org:

python3 -m pip install OutlierIdentifiers

From GitHub:

python3 -m pip install git+https://github.com/antononcube/Python-packages.git#egg=OutlierIdentifiers\&subdirectory=OutlierIdentifiers

Usage examples

Load packages:

import numpy as np
import plotly.graph_objects as go

from OutlierIdentifiers import *

Generate a vector with random numbers:

np.random.seed(14)
vec = np.random.normal(loc=10, scale=20, size=30)
print(vec)
[ 41.02678223  11.58372049  13.47953057   8.55326868 -30.086588
  12.89355626 -20.02337245  14.22218902  -1.16410111  31.6905813
   6.27421752  10.2932275  -11.51138939  22.84504148   6.39326577
  22.40600507  26.21948669  25.55871733   5.25020644 -27.83824691
 -13.44243588  26.72413943  30.18546801  35.86198722  -0.98662331
  -9.6342573   28.29345516  27.46140757  10.44222283   9.91712833]

Plot the vector:

# Create a scatter plot with markers
fig = go.Figure(data=go.Scatter(y=vec, mode='markers'))

# Add labels and title
fig.update_layout(title='Vector of Numbers', xaxis_title='Index', yaxis_title='Value', template = "plotly_dark")

# Display the plot
fig.show()

Find outlier positions:

outlier_identifier(vec, identifier=hampel_identifier_parameters)
array([ True, False, False, False,  True, False,  True, False, False,
        True, False, False,  True, False, False, False, False, False,
       False,  True,  True, False, False,  True, False,  True, False,
       False, False, False])

Find outlier values:

outlier_identifier(vec, identifier=hampel_identifier_parameters, value = True)
array([ 41.02678223, -30.086588  , -20.02337245,  31.6905813 ,
       -11.51138939, -27.83824691, -13.44243588,  35.86198722,
        -9.6342573 ])
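
The thresholds behind these results can be re-derived by hand. The following is a minimal standard-library sketch, not the package's own code: it assumes the Hampel thresholds are median ± 1.4826 × MAD (median absolute deviation) and that a value is an outlier when it lies strictly outside them. The names hampel_thresholds_sketch and outlier_values_sketch are hypothetical. Under these assumptions the sketch selects the same nine values as the call above.

```python
from statistics import median

# The vector printed earlier in this README (values rounded to 8 decimals).
vec = [
    41.02678223, 11.58372049, 13.47953057, 8.55326868, -30.086588,
    12.89355626, -20.02337245, 14.22218902, -1.16410111, 31.6905813,
    6.27421752, 10.2932275, -11.51138939, 22.84504148, 6.39326577,
    22.40600507, 26.21948669, 25.55871733, 5.25020644, -27.83824691,
    -13.44243588, 26.72413943, 30.18546801, 35.86198722, -0.98662331,
    -9.6342573, 28.29345516, 27.46140757, 10.44222283, 9.91712833,
]


def hampel_thresholds_sketch(values, scale=1.4826):
    """Assumed rule: (median - scale * MAD, median + scale * MAD)."""
    med = median(values)
    mad = median([abs(x - med) for x in values])
    return med - scale * mad, med + scale * mad


def outlier_values_sketch(values, thresholds):
    """Keep the values lying strictly outside the (lower, upper) pair."""
    lower, upper = thresholds
    return [x for x in values if x < lower or x > upper]


lower, upper = hampel_thresholds_sketch(vec)
hampel_outliers = outlier_values_sketch(vec, (lower, upper))
print(round(lower, 4), round(upper, 4))
print(hampel_outliers)
```

The scale factor 1.4826 is the usual constant that makes the MAD comparable to a standard deviation for normally distributed data.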

Find top outlier positions and values:

outlier_identifier(vec, identifier = lambda v: top_outliers(hampel_identifier_parameters(v)))
array([ True, False, False, False, False, False, False, False, False,
        True, False, False, False, False, False, False, False, False,
       False, False, False, False, False,  True, False, False, False,
       False, False, False])
outlier_identifier(vec, identifier = lambda v: top_outliers(hampel_identifier_parameters(v)), value=True)
array([41.02678223, 31.6905813 , 35.86198722])

Find bottom outlier positions and values (using quartiles-based identifier):

outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)))
array([False, False, False, False,  True, False,  True, False, False,
       False, False, False, False, False, False, False, False, False,
       False,  True, False, False, False, False, False, False, False,
       False, False, False])
outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)), value=True)
array([-30.086588  , -20.02337245, -27.83824691])

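Here is another way to get the outlier values, using the boolean array of outlier positions as an index:

pred = outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)))
vec[pred]
array([-30.086588  , -20.02337245, -27.83824691])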
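
The outputs above are consistent with top_outliers and bottom_outliers turning a two-sided (lower, upper) pair into a one-sided one. The sketch below illustrates that reading. It is an inference from the results shown here, not a copy of the package's implementation, and the names top_only_sketch, bottom_only_sketch, and select_outliers_sketch are hypothetical.

```python
import math


def top_only_sketch(thresholds):
    """Assumed behaviour: keep only the upper threshold."""
    _, upper = thresholds
    return (-math.inf, upper)


def bottom_only_sketch(thresholds):
    """Assumed behaviour: keep only the lower threshold."""
    lower, _ = thresholds
    return (lower, math.inf)


def select_outliers_sketch(values, thresholds):
    """Return the values lying strictly outside the (lower, upper) pair."""
    lower, upper = thresholds
    return [x for x in values if x < lower or x > upper]


# A few values from the vector above, and the Hampel thresholds
# as printed in this README.
values = [41.02678223, 11.58372049, -30.086588, 31.6905813, -20.02337245, 9.91712833]
hampel = (-8.796653643076334, 30.822596969354976)

top_values = select_outliers_sketch(values, top_only_sketch(hampel))
bottom_values = select_outliers_sketch(values, bottom_only_sketch(hampel))
print(top_values)
print(bottom_values)
```

Replacing the unused side with an infinite bound lets the same two-sided comparison serve all three cases.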

The available outlier parameter functions are:

  • hampel_identifier_parameters
  • splus_quartile_identifier_parameters
  • quartile_identifier_parameters

Each of them returns a pair of lower and upper thresholds for the given vector:

[ f(vec) for f in (hampel_identifier_parameters, splus_quartile_identifier_parameters, quartile_identifier_parameters)]
[(-8.796653643076334, 30.822596969354976),
 (-37.649981209714, 64.27685968784428),
 (-14.46873856125025, 36.49468188752889)]
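
The two quartile-based pairs above can be reproduced with the standard library. This is a sketch under stated assumptions rather than the package's code. It assumes linearly interpolated quartiles (the "inclusive" method of statistics.quantiles), the rule (Q1 - 1.5 × IQR, Q3 + 1.5 × IQR) for the S-Plus variant, and (median - IQR, median + IQR) for the plain quartile variant. The function names are hypothetical. These rules were chosen because they match the numbers printed above for this vector.

```python
from statistics import median, quantiles

# The vector printed earlier in this README (values rounded to 8 decimals).
vec = [
    41.02678223, 11.58372049, 13.47953057, 8.55326868, -30.086588,
    12.89355626, -20.02337245, 14.22218902, -1.16410111, 31.6905813,
    6.27421752, 10.2932275, -11.51138939, 22.84504148, 6.39326577,
    22.40600507, 26.21948669, 25.55871733, 5.25020644, -27.83824691,
    -13.44243588, 26.72413943, 30.18546801, 35.86198722, -0.98662331,
    -9.6342573, 28.29345516, 27.46140757, 10.44222283, 9.91712833,
]


def splus_quartile_thresholds_sketch(values):
    """Assumed rule: (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def quartile_thresholds_sketch(values):
    """Assumed rule: (median - IQR, median + IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    med = median(values)
    return med - iqr, med + iqr


splus_pair = splus_quartile_thresholds_sketch(vec)
quartile_pair = quartile_thresholds_sketch(vec)
print(splus_pair)
print(quartile_pair)
```

The S-Plus variant gives the widest interval of the three here, so it flags no value of this vector as an outlier.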

References

[AA1] Anton Antonov, "Outlier detection in a list of numbers", (2013), MathematicaForPrediction at WordPress.

[AAp1] Anton Antonov, OutlierIdentifiers WL paclet, (2023), Wolfram Language Paclet Repository.

[AAp2] Anton Antonov, OutlierIdentifiers R package, (2019), R-packages at GitHub/antononcube.

[AAp3] Anton Antonov, OutlierIdentifiers Raku package, (2022), GitHub/antononcube.
