# Robustats

Robustats is a Python library for high-performance computation of robust statistical estimators.

The functions that compute the robust estimators are implemented in C for speed and called from Python.

Estimators implemented in the library:

• Weighted Median (time complexity: `O(n)`) [1, 2, 3]
• Medcouple (time complexity: `O(n * log(n))`) [4, 5, 6, 7]
• Mode (time complexity: `O(n * log(n))`) [8]
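To make the first estimator concrete, here is a minimal pure-Python sketch of a weighted median. It is not the library's C implementation: it sorts the data, so it runs in `O(n * log(n))` rather than `O(n)`, and the function name `weighted_median_reference` is hypothetical. It assumes the "lower" weighted median convention, that is, the first value at which the cumulative weight reaches half of the total weight; the library may treat ties differently.

```python
def weighted_median_reference(values, weights):
    """Sort-based weighted median (illustrative sketch, not robustats code)."""
    if len(values) == 0 or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")

    # Sort the (value, weight) pairs by value.
    pairs = sorted(zip(values, weights))
    half_total = sum(weights) / 2.0

    # Walk through the sorted values until half of the total weight is covered.
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half_total:
            return value

    # Only reached for unusual weights (e.g. NaN); fall back to the largest value.
    return pairs[-1][0]


x = [1.1, 5.3, 3.7, 2.1, 7.0, 9.9]
weights = [1.1, 0.4, 2.1, 3.5, 1.2, 0.8]
print(weighted_median_reference(x, weights))  # prints 2.1
```

A slow reference like this can be handy for cross-checking the fast implementation on small inputs.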

## How to Install

This library requires Python 3.

You can install the library using pip.

```shell
pip install robustats
```

You can also install the library directly from GitHub using the following command.

```shell
pip install -e 'git+https://github.com/FilippoBovo/robustats.git#egg=robustats'
```

Alternatively, clone the repository, then install and test the Robustats package as follows.

```shell
git clone https://github.com/FilippoBovo/robustats.git
cd robustats
pip install -e .
python -m unittest
```

## How to Use

This is an example of how to use the Robustats library in Python.

```python
import numpy as np
import robustats

# Weighted Median
x = np.array([1.1, 5.3, 3.7, 2.1, 7.0, 9.9])
weights = np.array([1.1, 0.4, 2.1, 3.5, 1.2, 0.8])

weighted_median = robustats.weighted_median(x, weights)

print("The weighted median is {}".format(weighted_median))
# Output: The weighted median is 2.1

# Medcouple
x = np.array([0.2, 0.17, 0.08, 0.16, 0.88, 0.86, 0.09, 0.54, 0.27, 0.14])

medcouple = robustats.medcouple(x)

print("The medcouple is {}".format(medcouple))
# Output: The medcouple is 0.7749999999999999

# Mode
x = np.array([1., 2., 2., 3., 3., 3., 4., 4., 5.])

mode = robustats.mode(x)

print("The mode is {}".format(mode))
# Output: The mode is 3.0
```

## How to Contribute

If you wish to contribute to this library, please follow the patterns and style of the rest of the code.

Also install the repository's Git hooks.

```shell
git config core.hooksPath .githooks
```

Tips:

• In C, use `malloc` to allocate memory on the heap instead of declaring arrays on the stack; with large arrays, stack allocation would cause a segmentation fault due to stack overflow.
• Avoid recursion where possible to limit the space complexity of the code. Use loops instead.
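
The tips above are aimed at the C code, but the loop-instead-of-recursion idea can be sketched in Python. The example below is an assumption-laden illustration, not code from the library: `select_kth` is a hypothetical name, and the routine is a generic quickselect that narrows the search range inside a `while` loop, so it uses no call stack beyond a single frame.

```python
import random


def select_kth(values, k):
    """Return the k-th smallest element (0-based) with a loop, not recursion."""
    if not 0 <= k < len(values):
        raise IndexError("k is out of range")

    data = list(values)  # work on a copy so the input is left untouched
    lo, hi = 0, len(data) - 1

    while True:
        if lo == hi:
            return data[lo]

        pivot = data[random.randint(lo, hi)]

        # Three-way partition of data[lo..hi] around the pivot.
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if data[i] < pivot:
                data[lt], data[i] = data[i], data[lt]
                lt += 1
                i += 1
            elif data[i] > pivot:
                data[i], data[gt] = data[gt], data[i]
                gt -= 1
            else:
                i += 1

        # Now data[lo..lt-1] < pivot, data[lt..gt] == pivot, data[gt+1..hi] > pivot.
        if k < lt:
            hi = lt - 1  # continue in the left part instead of recursing
        elif k > gt:
            lo = gt + 1  # continue in the right part instead of recursing
        else:
            return pivot


print(select_kth([1.1, 5.3, 3.7, 2.1, 7.0, 9.9], 2))  # prints 3.7
```

A recursive version would do the same work but add one stack frame per partitioning step; the loop only updates `lo` and `hi`.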