
High-dimensional statistical inference tools for Python


HiDimStat: High-dimensional statistical inference tool for Python


The HiDimStat package provides statistical inference methods to solve the problem of support recovery in the context of high-dimensional and spatially structured data.
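To make the support recovery problem concrete, here is a minimal sketch that uses only numpy and does not call HiDimStat. It simulates a sparse linear model, picks features with a naive marginal-correlation screen, and measures how far the selection is from the true support. The function names, the simulation settings, and the screening rule are all assumptions chosen for illustration; they are not part of the package and come with no statistical guarantee.

```python
import numpy as np


def simulate_sparse_regression(n_samples=100, n_features=500, support_size=10,
                               signal=2.0, seed=0):
    """Simulate y = X @ beta + noise, where beta has few non-zero entries.

    Hypothetical helper for illustration only (not a HiDimStat function).
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    beta = np.zeros(n_features)
    beta[:support_size] = signal  # the true support is the first features
    y = X @ beta + rng.standard_normal(n_samples)
    return X, y, beta


def support_recovery_errors(true_support, estimated_support):
    """Return (false discovery proportion, fraction of true features found)."""
    true_support = np.unique(true_support)
    estimated_support = np.unique(estimated_support)
    n_true_pos = np.intersect1d(true_support, estimated_support).size
    n_false_pos = estimated_support.size - n_true_pos
    fdp = n_false_pos / max(estimated_support.size, 1)
    recall = n_true_pos / max(true_support.size, 1)
    return fdp, recall


X, y, beta = simulate_sparse_regression()
true_support = np.flatnonzero(beta)

# Naive baseline (an assumption, not a HiDimStat method): keep the 20
# features whose columns have the largest absolute inner product with y.
scores = np.abs(X.T @ y)
estimated_support = np.argsort(scores)[-20:]

fdp, recall = support_recovery_errors(true_support, estimated_support)
print(f"false discovery proportion: {fdp:.2f}, recall: {recall:.2f}")
```

With more features than samples, a baseline like this offers no control over the false discovery proportion, which is the gap that statistical inference methods for support recovery aim to close.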

Installation

HiDimStat works only with Python 3, ideally Python 3.6+. To install it, run the following from a terminal:

pip install hidimstat

Or, if you want the latest version available (for example, to contribute to the development of this project):

git clone https://github.com/ja-che/hidimstat.git
cd hidimstat
pip install -e .

Dependencies

joblib
numpy
scipy
scikit-learn

To run the examples, you also need to install matplotlib; to run the tests, you also need to install pytest.
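A quick way to check which of these packages are present in the current environment is sketched below. It relies only on the standard library; the helper name `missing_modules` is hypothetical, and the only mapping assumed is that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names of the required dependencies listed above.
# scikit-learn is imported as "sklearn".
REQUIRED = ["joblib", "numpy", "scipy", "sklearn"]

# Optional packages and what they are needed for.
OPTIONAL = {"matplotlib": "running the examples", "pytest": "running the tests"}


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing_required = missing_modules(REQUIRED)
if missing_required:
    print("Missing required packages:", ", ".join(missing_required))
else:
    print("All required packages are installed.")

for name in missing_modules(OPTIONAL):
    print(f"Optional package {name} is missing (needed for {OPTIONAL[name]}).")
```

The check only looks modules up and does not import them, so it runs quickly and does not verify package versions.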

Documentation & Examples

All the documentation of HiDimStat is available at https://ja-che.github.io/hidimstat/.

The examples folder currently contains three Python scripts that illustrate how to use the main HiDimStat functions. Each script handles a different kind of dataset: plot_2D_simulation_example.py handles a simulated dataset with a 2D spatial structure; plot_fmri_data_example.py solves the decoding problem on the Haxby fMRI dataset; and plot_meg_data_example.py tackles the source localization problem on several MEG/EEG datasets.

# For example, from the examples folder, run the following command in a terminal
python plot_2D_simulation_example.py

References

The algorithms developed in this package have been detailed in several conference/journal articles that can be downloaded at https://ja-che.github.io/research.html.

Main references:

Ensemble of Clustered desparsified Lasso (ECDL):

  • Chevalier, J. A., Salmon, J., & Thirion, B. (2018). Statistical inference with ensemble of clustered desparsified lasso. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 638-646). Springer, Cham.

  • Chevalier, J. A., Nguyen, T. B., Thirion, B., & Salmon, J. (2021). Spatially relaxed inference on high-dimensional linear models. arXiv preprint arXiv:2106.02590.

Aggregation of multiple Knockoffs (AKO):

  • Nguyen T.-B., Chevalier J.-A., Thirion B., & Arlot S. (2020). Aggregation of Multiple Knockoffs. In Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 119.

Application to decoding (fMRI data):

  • Chevalier, J. A., Nguyen T.-B., Salmon, J., Varoquaux, G. & Thirion, B. (2021). Decoding with confidence: Statistical control on decoder maps. In NeuroImage, 234, 117921.

Application to source localization (MEG/EEG data):

  • Chevalier, J. A., Gramfort, A., Salmon, J., & Thirion, B. (2020). Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.

If you use this package, we would appreciate citations to the relevant papers listed above.

Other useful references:

For de-sparsified (or de-biased) Lasso:

  • Javanmard, A., & Montanari, A. (2014). Confidence intervals and hypothesis testing for high-dimensional regression. The Journal of Machine Learning Research, 15(1), 2869-2909.

  • Zhang, C. H., & Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional linear models. Journal of the Royal Statistical Society: Series B: Statistical Methodology, 76(1), 217-242.

  • Van de Geer, S., Bühlmann, P., Ritov, Y. A., & Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics, 42(3), 1166-1202.

For Knockoffs Inference:

  • Barber, R. F., & Candès, E. J. (2015). Controlling the false discovery rate via knockoffs. The Annals of Statistics, 43(5), 2055-2085. doi:10.1214/15-AOS1337. https://projecteuclid.org/euclid.aos/1438606853

  • Candès, E., Fan, Y., Janson, L., & Lv, J. (2018). Panning for gold: Model-X knockoffs for high dimensional controlled variable selection. Journal of the Royal Statistical Society Series B, 80(3), 551-577.

License

This project is licensed under the BSD 2-Clause License.

Acknowledgments

This project has been funded by Labex DigiCosme (ANR-11-LABEX-0045-DIGICOSME) as part of the program "Investissement d’Avenir" (ANR-11-IDEX-0003-02), by the Fast Big project (ANR-17-CE23-0011) and the KARAIB AI Chair (ANR-20-CHIA-0025-01). This study has also been supported by the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 945539, Human Brain Project SGA3).
