A package to compute the bootstrap sampling significance test
boostsa - BOOtSTrap SAmpling in Python

.. image:: https://img.shields.io/github/license/fornaciari/boostsa
   :target: https://lbesson.mit-license.org/
   :alt: License

Intro
boostsa - BOOtSTrap SAmpling - is a tool to compute the bootstrap sampling significance test, even in the pipeline of a complex experimental design.
- Free software: MIT license
- Documentation: https://boostsa.readthedocs.io.
Installation
.. code-block:: bash

    pip install -U boostsa

Getting started
First, import ``boostsa``:

.. code-block:: python

    from boostsa import Bootstrap

Then, create a bootstrap instance. You will use it to store your experiments' results and to compute the bootstrap sampling significance test:

.. code-block:: python

    boot = Bootstrap()

Inputs
^^^^^^

The assumption is that you ran at least two classification task experiments, which you want to compare.
One is your baseline, or control, or hypothesis 0 (h0).
The other one is the experimental condition that hopefully beats the baseline, or treatment, or hypothesis 1 (h1).
You compare the h0 and h1 predictions against the same targets.
Therefore, the h0 predictions, the h1 predictions and the targets will be your ``Bootstrap`` instance's data inputs.
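As a minimal sketch of what these three inputs look like, assume a binary task with made-up labels. The values and the ``accuracy`` helper below are illustrative only and are not part of boostsa:

```python
# Hypothetical toy data: integer label indexes for six data points.
targs = [0, 1, 1, 0, 1, 0]     # targets (gold standard)
h0_preds = [0, 1, 0, 0, 0, 0]  # baseline (h0) predictions
h1_preds = [0, 1, 1, 0, 0, 0]  # experimental condition (h1) predictions

# Assumption: the three lists describe the same data points, in the same order.
assert len(targs) == len(h0_preds) == len(h1_preds)


def accuracy(targets, predictions):
    """Share of predictions that match the targets (illustrative helper)."""
    return sum(t == p for t, p in zip(targets, predictions)) / len(targets)


h0_acc = accuracy(targs, h0_preds)
h1_acc = accuracy(targs, h1_preds)
```

Both conditions are scored against the same ``targs``, which is what makes their difference comparable.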
Outputs
^^^^^^^

By default, boostsa produces two output files:

- ``results.tsv``, which contains the experiments' performance and the (possible) significance levels;
- ``outcomes.json``, which contains targets and predictions for all the experimental conditions.

You can define the outputs when you create the instance, using the following parameters:

- ``save_results``, type: ``bool``, default: ``True``. This determines if you want to save the results.
- ``save_outcomes``, type: ``bool``, default: ``True``. This determines if you want to save the experiments' outcomes.
- ``dir_out``, type: ``str``, default: ``''``, that is your working directory. This indicates the directory where to save the results.

For example, if you want to save only the results in a particular folder, you will create an instance like this:
.. code-block:: python

    boot = Bootstrap(save_outcomes=False, dir_out='my/favourite/directory/')

Test function
In the simplest conditions, you will run the bootstrap sampling significance test with the ``test`` function.
It takes the following inputs:

- ``targs``, type: ``list`` or ``str``. They are the targets, or gold standard, that you use as a benchmark to measure the performance of the h0 and h1 predictions. They can be a list of integers, representing the label index of each data point, or a string. In that case, the string is interpreted as the path to a text file containing a single integer in each row, with the same meaning as in the list input.
- ``h0_preds``, type: ``list`` or ``str``. The h0 predictions, in the same formats as ``targs``.
- ``h1_preds``, type: ``list`` or ``str``. The h1 predictions, in the same formats as above.
- ``h0_name``, type: ``str``, default: ``h0``. Expression to describe the h0 condition.
- ``h1_name``, type: ``str``, default: ``h1``. Expression to describe the h1 condition.
- ``n_loops``, type: ``int``, default: ``100``. Number of iterations for computing the bootstrap sampling.
- ``sample_size``, type: ``float``, default: ``.1``. Fraction of data points sampled, with respect to the whole set. The admitted values range between 0.05 (5%) and 0.5 (50%).
- ``verbose``, type: ``bool``, default: ``False``. If true, the experiments' performance is shown.

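When ``targs``, ``h0_preds`` or ``h1_preds`` are given as strings, they point to text files with one integer per row. A minimal sketch of writing and reading such a file with the standard library is shown here. The helper names ``write_labels`` and ``read_labels`` and the file name are hypothetical and not part of boostsa:

```python
import os
import tempfile


def write_labels(path, labels):
    """Write one integer label per row (hypothetical helper, not part of boostsa)."""
    with open(path, "w", encoding="utf-8") as f:
        for label in labels:
            f.write(f"{int(label)}\n")


def read_labels(path):
    """Read the file back into a list of integers, skipping blank rows."""
    with open(path, encoding="utf-8") as f:
        return [int(line) for line in f if line.strip()]


# Illustrative location and file name; any writable path would do.
tmp_dir = tempfile.mkdtemp()
targs_path = os.path.join(tmp_dir, "targs.txt")

write_labels(targs_path, [0, 1, 1, 0])
labels_back = read_labels(targs_path)
```

Reading the file back is a cheap way to check that the rows hold exactly the label indexes you meant to pass.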
For example, with the inputs stored in text files:

.. code-block:: python

    boot.test(targs='../test_boot/h0.0/targs.txt',
              h0_preds='../test_boot/h0.0/preds.txt',
              h1_preds='../test_boot/h1.0/preds.txt',
              n_loops=1000, sample_size=.2, verbose=True)

The output will be:

.. sourcecode::

    total size............... 1000
    sample size.............. 200
    targs count: ['class 0 freq 465 perc 46.50%', 'class 1 freq 535 perc 53.50%']
    h0 preds count: ['class 0 freq 339 perc 33.90%', 'class 1 freq 661 perc 66.10%']
    h1 preds count: ['class 0 freq 500 perc 50.00%', 'class 1 freq 500 perc 50.00%']
    h0 F-measure............. 67.76   h1 F-measure............. 74.07   diff... 6.31
    h0 accuracy.............. 69.0   h1 accuracy.............. 74.1   diff... 5.1
    h0 precision............. 69.94   h1 precision............. 74.1   diff... 4.16
    h0 recall................ 67.96   h1 recall................ 74.22   diff... 6.26
    bootstrap: 100%|███████████████████████████| 1000/1000 [00:07<00:00, 139.84it/s]
    count sample diff f1 is twice tot diff f1....... 37 / 1000 p < 0.037 *
    count sample diff acc is twice tot diff acc...... 73 / 1000 p < 0.073
    count sample diff prec is twice tot diff prec..... 111 / 1000 p < 0.111
    count sample diff rec is twice tot diff rec ..... 27 / 1000 p < 0.027 *
    Out[3]:
           f1  diff_f1  sign_f1   acc  diff_acc  sign_acc   prec  diff_prec  sign_prec    rec  diff_rec  sign_rec
    h0  67.76                    69.0                      69.94                        67.96
    h1  74.07     6.31        *  74.1       5.1            74.10       4.16             74.22      6.26         *

That's it!
Where you see two asterisks ``**`` you have a significance with :math:`p \le .01`; one asterisk ``*`` indicates significance with :math:`p \le .05`.
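To make the "count sample diff ... is twice tot diff" lines of the output easier to read, here is a minimal sketch of that counting logic, restricted to accuracy and using only the standard library. It is not boostsa's implementation: the function names, sampling with replacement, the strict ``>`` comparison, the rounding of the sample size and the fixed seed are all assumptions made for illustration.

```python
import random


def accuracy(targets, predictions):
    """Accuracy as a percentage (illustrative helper)."""
    return 100 * sum(t == p for t, p in zip(targets, predictions)) / len(targets)


def bootstrap_p_value(targs, h0_preds, h1_preds, n_loops=100, sample_size=0.1, seed=0):
    """Share of resamples whose h1-h0 accuracy gain exceeds twice the overall gain."""
    rng = random.Random(seed)
    total = len(targs)
    k = round(total * sample_size)  # e.g. 1000 data points with sample_size=.2 give 200
    tot_diff = accuracy(targs, h1_preds) - accuracy(targs, h0_preds)
    count = 0
    for _ in range(n_loops):
        # Assumption: indexes are drawn with replacement.
        idx = [rng.randrange(total) for _ in range(k)]
        sample_targs = [targs[i] for i in idx]
        sample_diff = (accuracy(sample_targs, [h1_preds[i] for i in idx])
                       - accuracy(sample_targs, [h0_preds[i] for i in idx]))
        if sample_diff > 2 * tot_diff:  # assumption: strict inequality
            count += 1
    return count / n_loops


def significance_marker(p):
    """Map a p-value to the asterisks described above."""
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""


# Hypothetical toy data: h1 corrects part of the h0 errors.
toy_targs = [0, 1] * 100
toy_h0 = [0] * 200
toy_h1 = [t if i % 4 else 0 for i, t in enumerate(toy_targs)]
toy_p = bootstrap_p_value(toy_targs, toy_h0, toy_h1, n_loops=200, sample_size=0.2)
toy_marker = significance_marker(toy_p)
```

Read this way, the line ``37 / 1000 p < 0.037 *`` in the output above reports 37 of the 1000 resamples meeting the condition, which falls below the .05 threshold and earns one asterisk.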
For more complex experimental designs and technical/ethical considerations, please refer to the documentation page.