Outlier identifier functions package.
OutlierIdentifiers
In brief
This is a Python package of functions for identifying outliers in 1D data. It closely follows the Wolfram Language (WL) paclet [AAp1], the R package [AAp2], and the Raku package [AAp3].
Usage examples are given in the Jupyter notebook "OutlierIdentifiers-guide.ipynb", which is also available as a Markdown version.
Installation
From PyPI.org:
python3 -m pip install OutlierIdentifiers
From GitHub:
python3 -m pip install git+https://github.com/antononcube/Python-packages.git#egg=OutlierIdentifiers\&subdirectory=OutlierIdentifiers
Usage examples
Load packages:
import numpy as np
import plotly.graph_objects as go
from OutlierIdentifiers import *
Generate a vector with random numbers:
np.random.seed(14)
vec = np.random.normal(loc=10, scale=20, size=30)
print(vec)
[ 41.02678223 11.58372049 13.47953057 8.55326868 -30.086588
12.89355626 -20.02337245 14.22218902 -1.16410111 31.6905813
6.27421752 10.2932275 -11.51138939 22.84504148 6.39326577
22.40600507 26.21948669 25.55871733 5.25020644 -27.83824691
-13.44243588 26.72413943 30.18546801 35.86198722 -0.98662331
-9.6342573 28.29345516 27.46140757 10.44222283 9.91712833]
Plot the vector:
# Create a scatter plot with markers
fig = go.Figure(data=go.Scatter(y=vec, mode='markers'))
# Add labels and title
fig.update_layout(title='Vector of Numbers', xaxis_title='Index', yaxis_title='Value', template = "plotly_dark")
# Display the plot
fig.show()
Find outlier positions:
outlier_identifier(vec, identifier=hampel_identifier_parameters)
array([ True, False, False, False, True, False, True, False, False,
True, False, False, True, False, False, False, False, False,
False, True, True, False, False, True, False, True, False,
False, False, False])
Find outlier values:
outlier_identifier(vec, identifier=hampel_identifier_parameters, value = True)
array([ 41.02678223, -30.086588 , -20.02337245, 31.6905813 ,
-11.51138939, -27.83824691, -13.44243588, 35.86198722,
-9.6342573 ])
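For intuition, the Hampel thresholds can be reproduced by hand. The sketch below is not the package's source: it assumes the thresholds are the median plus or minus 1.4826 times the median absolute deviation (MAD), a rule that agrees with the outlier values printed above for this vector. The helper name `hampel_bounds_sketch` is hypothetical, and only the standard library is used so the block runs without the package installed.

```python
from statistics import median

# The 30 values printed earlier (np.random.seed(14), normal with loc=10, scale=20).
vec = [41.02678223, 11.58372049, 13.47953057, 8.55326868, -30.086588,
       12.89355626, -20.02337245, 14.22218902, -1.16410111, 31.6905813,
       6.27421752, 10.2932275, -11.51138939, 22.84504148, 6.39326577,
       22.40600507, 26.21948669, 25.55871733, 5.25020644, -27.83824691,
       -13.44243588, 26.72413943, 30.18546801, 35.86198722, -0.98662331,
       -9.6342573, 28.29345516, 27.46140757, 10.44222283, 9.91712833]


def hampel_bounds_sketch(data):
    """Assumed rule (hypothetical helper): median -/+ 1.4826 * MAD."""
    center = median(data)
    mad = median(abs(x - center) for x in data)
    half_width = 1.4826 * mad
    return center - half_width, center + half_width


lower, upper = hampel_bounds_sketch(vec)

# Boolean mask of outlier positions, then the outlier values themselves.
mask = [x < lower or x > upper for x in vec]
values = [x for x, is_outlier in zip(vec, mask) if is_outlier]

print((lower, upper))
print(values)
```

Under this assumption, a point is flagged when it falls strictly outside the `(lower, upper)` interval, so the mask and the value list carry the same information in two forms.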
Find top outlier positions and values:
outlier_identifier(vec, identifier = lambda v: top_outliers(hampel_identifier_parameters(v)))
array([ True, False, False, False, False, False, False, False, False,
True, False, False, False, False, False, False, False, False,
False, False, False, False, False, True, False, False, False,
False, False, False])
outlier_identifier(vec, identifier = lambda v: top_outliers(hampel_identifier_parameters(v)), value=True)
array([41.02678223, 31.6905813 , 35.86198722])
Find bottom outlier positions and values (using the quartile-based identifier):
outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)))
array([False, False, False, False, True, False, True, False, False,
False, False, False, False, False, False, False, False, False,
False, True, False, False, False, False, False, False, False,
False, False, False])
outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)), value=True)
array([-30.086588 , -20.02337245, -27.83824691])
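Here is another way to get the outlier values, by indexing the vector with the boolean array of outlier positions:
pred = outlier_identifier(vec, identifier = lambda v: bottom_outliers(quartile_identifier_parameters(v)))
vec[pred]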
array([-30.086588 , -20.02337245, -27.83824691])
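The `top_outliers` and `bottom_outliers` wrappers used above can be read as one-sided versions of a threshold pair. A minimal sketch under that assumption: replacing the lower threshold with negative infinity keeps only top outliers, and replacing the upper threshold with positive infinity keeps only bottom outliers. The names `keep_top_sketch`, `keep_bottom_sketch`, and `outliers_sketch` are hypothetical, and the package's actual implementation may differ. The two threshold pairs are copied from the package output listed in this README.

```python
import math

# The 30 values printed earlier in this README.
vec = [41.02678223, 11.58372049, 13.47953057, 8.55326868, -30.086588,
       12.89355626, -20.02337245, 14.22218902, -1.16410111, 31.6905813,
       6.27421752, 10.2932275, -11.51138939, 22.84504148, 6.39326577,
       22.40600507, 26.21948669, 25.55871733, 5.25020644, -27.83824691,
       -13.44243588, 26.72413943, 30.18546801, 35.86198722, -0.98662331,
       -9.6342573, 28.29345516, 27.46140757, 10.44222283, 9.91712833]

# Threshold pairs as printed by the package for this vector.
hampel_pair = (-8.796653643076334, 30.822596969354976)
quartile_pair = (-14.46873856125025, 36.49468188752889)


def keep_top_sketch(pair):
    """Assumed behavior: drop the lower threshold, keep the upper one."""
    return (-math.inf, pair[1])


def keep_bottom_sketch(pair):
    """Assumed behavior: keep the lower threshold, drop the upper one."""
    return (pair[0], math.inf)


def outliers_sketch(data, pair):
    """Return the values lying strictly outside the (lower, upper) pair."""
    lower, upper = pair
    return [x for x in data if x < lower or x > upper]


top_values = outliers_sketch(vec, keep_top_sketch(hampel_pair))
bottom_values = outliers_sketch(vec, keep_bottom_sketch(quartile_pair))

print(top_values)     # [41.02678223, 31.6905813, 35.86198722]
print(bottom_values)  # [-30.086588, -20.02337245, -27.83824691]
```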
The available outlier parameter functions, each of which returns a pair of lower and upper thresholds, are:
hampel_identifier_parameters
splus_quartile_identifier_parameters
quartile_identifier_parameters
[ f(vec) for f in (hampel_identifier_parameters, splus_quartile_identifier_parameters, quartile_identifier_parameters)]
[(-8.796653643076334, 30.822596969354976),
(-37.649981209714, 64.27685968784428),
(-14.46873856125025, 36.49468188752889)]
References
[AA1] Anton Antonov, "Outlier detection in a list of numbers", (2013), MathematicaForPrediction at WordPress.
[AAp1] Anton Antonov, OutlierIdentifiers WL paclet, (2023), Wolfram Language Paclet Repository.
[AAp2] Anton Antonov, OutlierIdentifiers R package, (2019), R-packages at GitHub/antononcube.
[AAp3] Anton Antonov, OutlierIdentifiers Raku package, (2022), GitHub/antononcube.