Robustats
Robustats is a Python library for high-performance computation of robust statistical estimators.
The functions that compute the robust estimators are implemented in C for speed and called from Python.
Estimators implemented in the library:
- Weighted Median (time complexity: O(n)) [1, 2, 3]
- Medcouple (time complexity: O(n * log(n))) [4, 5, 6, 7]
- Mode (time complexity: O(n * log(n))) [8]
How to Install
This library requires Python 3.
You can install the library directly from GitHub using the following command.
pip install -e 'git+https://github.com/FilippoBovo/robustats.git#egg=robustats'
Alternatively, clone the repository, then install and test the Robustats package as follows.
git clone https://github.com/FilippoBovo/robustats.git
cd robustats
pip install -e .
python -m unittest
How to Use
This is an example of how to use the Robustats library in Python.
import numpy as np
import robustats
# Weighted Median
x = np.array([1.1, 5.3, 3.7, 2.1, 7.0, 9.9])
weights = np.array([1.1, 0.4, 2.1, 3.5, 1.2, 0.8])
weighted_median = robustats.weighted_median(x, weights)
print("The weighted median is {}".format(weighted_median))
# Output: The weighted median is 2.1
# Medcouple
x = np.array([0.2, 0.17, 0.08, 0.16, 0.88, 0.86, 0.09, 0.54, 0.27, 0.14])
medcouple = robustats.medcouple(x)
print("The medcouple is {}".format(medcouple))
# Output: The medcouple is 0.7692307692307692
# Mode
x = np.array([1., 2., 2., 3., 3., 3., 4., 4., 5.])
mode = robustats.mode(x)
print("The mode is {}".format(mode))
# Output: The mode is 3.0
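To cross-check the weighted median result above without the compiled extension, a sort-based reference can be written in plain Python. This is a minimal sketch, not the library's implementation: it runs in O(n * log(n)) rather than O(n), the name weighted_median_reference is hypothetical, and it assumes the lower weighted median convention (the smallest value whose cumulative weight reaches half of the total weight). The library's convention may differ when the cumulative weight lands exactly on half of the total.

```python
from itertools import accumulate


def weighted_median_reference(values, weights):
    """Sort-based lower weighted median (hypothetical helper, O(n log n))."""
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and of equal length")

    # Sort the (value, weight) pairs by value.
    pairs = sorted(zip(values, weights))
    half_total = 0.5 * sum(w for _, w in pairs)

    # Walk the running total of the weights and stop at the first value
    # whose cumulative weight reaches half of the total weight.
    for (value, _), cumulative in zip(pairs, accumulate(w for _, w in pairs)):
        if cumulative >= half_total:
            return value
    return pairs[-1][0]  # Guard against floating-point round-off.


x = [1.1, 5.3, 3.7, 2.1, 7.0, 9.9]
weights = [1.1, 0.4, 2.1, 3.5, 1.2, 0.8]
print(weighted_median_reference(x, weights))  # prints 2.1
```

On the example data the running total reaches 4.6 at the value 2.1, which is the first point at or above half of the total weight (4.55), so the sketch agrees with the output shown above.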
How to Contribute
If you wish to contribute to this library, please follow the patterns and style of the rest of the code.
Tips:
- In C, use malloc to allocate arrays on the heap rather than declaring them on the stack; with large arrays, stack allocation would cause a segmentation fault due to stack overflow.
- Avoid recursion where possible to limit the space complexity of the algorithm. Use loops instead.
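The second tip can be illustrated with a selection routine, the kind of building block that linear-time median algorithms typically rely on. The sketch below is written in Python for readability and is not taken from the library's C source; the function name select_kth is hypothetical. It narrows the search range inside a while loop instead of recursing into one partition, so the call stack does not grow with the input size.

```python
import random


def select_kth(values, k):
    """Return the k-th smallest element (0-based) using an iterative quickselect."""
    if not 0 <= k < len(values):
        raise IndexError("k is out of range")

    items = list(values)  # Work on a copy so the caller's data is untouched.
    low, high = 0, len(items) - 1

    # Each pass partitions items[low:high + 1] around a random pivot and then
    # keeps only the side that contains position k, with no recursive call.
    while low < high:
        pivot_index = random.randint(low, high)
        items[pivot_index], items[high] = items[high], items[pivot_index]
        pivot = items[high]

        store = low
        for i in range(low, high):
            if items[i] < pivot:
                items[i], items[store] = items[store], items[i]
                store += 1
        items[store], items[high] = items[high], items[store]

        if k == store:
            return items[store]
        if k < store:
            high = store - 1
        else:
            low = store + 1

    return items[low]


data = [1.1, 5.3, 3.7, 2.1, 7.0, 9.9]
print(select_kth(data, 2))  # Third smallest value: 3.7
```

The same pattern carries over to C: the loop keeps two indices and a working buffer, and that buffer would be allocated with malloc as the first tip recommends.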
References
[1] Cormen, Leiserson, Rivest, Stein - Introduction to Algorithms (3rd Edition).
[2] Cormen - Introduction to Algorithms (3rd Edition) - Instructor's Manual.
[3] Weighted median on Wikipedia.
[6] Medcouple implementation in Python by Jordi Gutiérrez Hermoso.