Fully customizable robust Independent Component Analysis (ICA)

robustica

Fully customizable robust Independent Component Analysis (ICA).

Description

This package contains 3 modules:

  • RobustICA

    Defines the main class, which performs robust independent component analysis and lets you customize each step.

  • InferComponents

    Retrieves the number of components that explain a user-defined percentage of variance.

  • examples

    Contains handy functions to quickly create or access example datasets.
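The idea behind InferComponents, choosing how many components are needed to explain a given share of the variance, can be illustrated with scikit-learn's PCA. This is only a sketch of the concept, not robustica's implementation; the helper name `n_components_for_variance` and its `ratio` parameter are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA


def n_components_for_variance(X, ratio=0.8):
    """Smallest number of principal components whose cumulative
    explained variance ratio reaches `ratio` (hypothetical helper)."""
    pca = PCA().fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # first index where the cumulative ratio is >= ratio, as a count
    n = int(np.searchsorted(cumulative, ratio)) + 1
    # guard against floating-point round-off when ratio is close to 1
    return min(n, cumulative.size)


# toy data: three independent features with very different variances
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 0.1])
print(n_components_for_variance(X_demo, ratio=0.5))
```

The returned value could then be passed as `n_components` when creating a RobustICA instance.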

More user-friendly documentation is available at https://crg-cnag.github.io/robustica/.

Requirements

The versions used to develop robustica are given in parentheses.

  • numpy (1.19.2)
  • pandas (1.1.2)
  • scipy (1.6.2)
  • scikit-learn (0.23.2)
  • joblib (1.0.1)
  • tqdm (4.59.0)
  • (optional) scikit-learn-extra (0.2.0): required only for clustering algorithms KMedoids and CommonNNClustering
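To compare your environment against the versions listed above, a small check along these lines may help. It is a sketch that relies only on the standard library's `importlib.metadata` (Python 3.8+); the names `DEVELOPMENT_VERSIONS` and `installed_versions` are made up for this example and are not part of robustica.

```python
from importlib.metadata import PackageNotFoundError, version

# versions listed in the Requirements section above
DEVELOPMENT_VERSIONS = {
    "numpy": "1.19.2",
    "pandas": "1.1.2",
    "scipy": "1.6.2",
    "scikit-learn": "0.23.2",
    "joblib": "1.0.1",
    "tqdm": "4.59.0",
    "scikit-learn-extra": "0.2.0",  # optional
}


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(DEVELOPMENT_VERSIONS).items():
    print(f"{name}: installed={installed}, developed with {DEVELOPMENT_VERSIONS[name]}")
```

A value of `None` for scikit-learn-extra only matters if you plan to use KMedoids or CommonNNClustering.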

Installation

[optional] scikit-learn-extra incompatibility

To use the clustering algorithms KMedoids and CommonNNClustering, first install a forked version of scikit-learn-extra to avoid an incompatibility with the newest numpy (see #6 for more info).

pip install git+https://github.com/TimotheeMathieu/scikit-learn-extra

pip

pip install robustica

local (latest version)

git clone https://github.com/CRG-CNAG/robustica
cd robustica
pip install -e .

Usage

from robustica import RobustICA
from robustica.examples import make_sampledata

X = make_sampledata(ncol=300, nrow=2000, seed=123)

rica = RobustICA(n_components=10)
# note that by default components are clustered with the DBSCAN algorithm, so the number
# of robust components returned can be smaller than the n_components requested.
S, A = rica.fit_transform(X)

# source matrix (nrow x n_components)
print(S.shape)
print(S)
(2000, 3) 
[[ 0.00975714  0.00619138  0.00502649]
 [-0.0021527  -0.0376857   0.0117938 ]
 [ 0.00046302  0.01712561  0.00518039]
 ...
 [ 0.00128344 -0.00767099  0.0047334 ]
 [ 0.00644422 -0.00498327  0.01325542]
 [ 0.0017873  -0.01739889 -0.00445954]]
# mixing matrix (ncol x n_components)
print(A.shape)
print(A)
(300, 3)
[[-1.79503194e-02 -1.05611924e+00  5.36688700e-01]
 [ 1.03342514e-01  7.43471382e-02  4.90472157e-01]
 [ 4.89753256e-01 -1.11300905e+00 -7.55809647e-01]
 ...
 [ 4.30468472e-01 -4.87992838e-01 -7.77965512e-01]
 [ 3.44078031e-02  4.09029805e-01 -7.29076312e-01]
 [ 2.15557427e-02  2.89301273e-01 -2.96690459e-01]]
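To give an intuition for why fewer components than requested can come back, here is a minimal sketch of the general Icasso-style idea: run ICA several times, cluster the resulting components, and keep one centroid per cluster. It is built on scikit-learn's FastICA and DBSCAN and is an assumption-laden illustration, not robustica's actual implementation; the function `robust_ica_sketch`, the distance (1 - |correlation|), and the defaults for `n_runs` and `eps` are all choices made for this example.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import FastICA


def robust_ica_sketch(X, n_components=3, n_runs=10, eps=0.2):
    """Illustrative only: cluster components from repeated FastICA runs."""
    # 1) run FastICA several times with different random initializations
    runs = [
        FastICA(n_components=n_components, random_state=seed, max_iter=1000).fit_transform(X)
        for seed in range(n_runs)
    ]
    S_all = np.hstack(runs)  # (n_samples, n_runs * n_components)

    # 2) sign-invariant distance between components: 1 - |Pearson correlation|
    corr = np.corrcoef(S_all.T)
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)

    # 3) cluster components; DBSCAN labels unstable components as noise (-1)
    labels = DBSCAN(
        eps=eps, min_samples=max(2, n_runs // 2), metric="precomputed"
    ).fit_predict(dist)

    # 4) one centroid per cluster, after aligning signs to the first member
    centroids = []
    for label in np.unique(labels):
        if label == -1:
            continue
        members = S_all[:, labels == label]
        signs = np.sign(members.T @ members[:, 0])
        centroids.append((members * signs).mean(axis=1))
    if not centroids:
        return np.empty((X.shape[0], 0))
    return np.column_stack(centroids)


# toy data: 3 non-Gaussian sources mixed into 6 observed features
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
sources = np.column_stack(
    [np.sin(2 * t), np.sign(np.sin(3 * t)), rng.laplace(size=t.size)]
)
X_demo = sources @ rng.normal(size=(3, 6))
S_robust = robust_ica_sketch(X_demo)
print(S_robust.shape)
```

Components that do not recur across runs fall into DBSCAN's noise label and are dropped, which is one way the number of columns can end up below `n_components`.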

Tutorials

Contact

This project has been fully developed at the Centre for Genomic Regulation within the Design of Biological Systems group.

Please report any issues that you experience through this repository's "Issues" or email:

License

robustica is distributed under a BSD 3-Clause License (see LICENSE).

Citation

Anglada-Girotto, M., Miravet-Verde, S., Serrano, L., Head, S. A. "robustica: customizable robust independent component analysis". BMC Bioinformatics 23, 519 (2022). DOI: https://doi.org/10.1186/s12859-022-05043-9

References

  • Himberg, J., & Hyvarinen, A. "Icasso: software for investigating the reliability of ICA estimates by clustering and visualization". IEEE XIII Workshop on Neural Networks for Signal Processing (2003). DOI: https://doi.org/10.1109/NNSP.2003.1318025
  • Sastry, Anand V., et al. "The Escherichia coli transcriptome mostly consists of independently regulated modules." Nature communications 10.1 (2019): 1-14. DOI: https://doi.org/10.1038/s41467-019-13483-w
  • Kairov, U., Cantini, L., Greco, A. et al. Determining the optimal number of independent components for reproducible transcriptomic data analysis. BMC Genomics 18, 712 (2017). DOI: https://doi.org/10.1186/s12864-017-4112-9
