A biclustering library with datasets, evaluation measures and a benchmarking framework

Project description

biclustlib

This package is an extension of the biclustlib Python library by Victor Alexandre Padilha.
We highly recommend looking at the original repository first.
The goal of this package is to create a unified biclustering framework for performing research on gene expression data and comparing different biclustering algorithms and measures.

Distributed under the GPLv3 license.

Installation

pip install biclustlib
You must also install R and the following R packages:
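
Because some algorithms depend on an external R installation (the wrappers whose names start with `R` appear to call into R), it can help to confirm that R is reachable before starting a long benchmark. The following is a minimal sketch using only the Python standard library. The helper name `find_r_executable` is hypothetical and not part of biclustlib, and it assumes R is exposed as `Rscript` or `R` on the PATH; biclustlib may locate R differently, so treat the result as a rough signal only.

```python
import shutil


def find_r_executable(candidates=("Rscript", "R")):
    """Return the full path of the first R executable found on PATH, or None.

    Hypothetical helper for illustration; not part of biclustlib.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


r_path = find_r_executable()
if r_path is None:
    print("R was not found on PATH; install R before using the R-based wrappers.")
else:
    print(f"Found R at {r_path}")
```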

Benchmarking example

import multiprocessing

import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

from biclustlib.algorithms import *
from biclustlib.algorithms.wrappers import *
from biclustlib.benchmark import GeneExpressionBenchmark, Algorithm
from biclustlib.benchmark.data import load_tavazoie, load_prelic


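def discretize_data(raw_data: pd.DataFrame, n_bins: int = 2) -> pd.DataFrame:
    """Discretize each column into n_bins ordinal levels using k-means binning.

    Returns booleans for two bins and integer levels otherwise.
    """
    discretizer = KBinsDiscretizer(n_bins=n_bins, encode='ordinal', strategy='kmeans')
    levels = discretizer.fit_transform(raw_data)
    return pd.DataFrame(levels, index=raw_data.index).astype(int if n_bins > 2 else bool)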


if __name__ == '__main__':
    
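    data = load_tavazoie()
    n_biclusters = 10
    reduction_level = 10
    discretization_level = 30  # number of bins for the discretized copy

    # xMotifs receives the discretized copy; BiBit and Bimax receive the binary copy.
    data_dis = discretize_data(data, discretization_level)
    data_bin = discretize_data(data)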

    setup = [
        Algorithm('CCA', ChengChurchAlgorithm(n_biclusters), data),
        Algorithm('xMotifs', RConservedGeneExpressionMotifs(n_biclusters), data_dis),
        Algorithm('BiBit', BitPatternBiclusteringAlgorithm(), data_bin),
        Algorithm('Bimax', RBinaryInclusionMaximalBiclusteringAlgorithm(n_biclusters), data_bin),
        Algorithm('LAS', LargeAverageSubmatrices(n_biclusters), data),
        Algorithm('Plaid', RPlaid(n_biclusters), data),
        Algorithm('Spectral', Spectral(n_clusters=data.shape[1] // 2), data + abs(data.min().min()) + 1),
    ]

    with multiprocessing.Pool() as pool:
        tavazoie_benchmark = GeneExpressionBenchmark(algorithms=setup,
                                                     raw_data=data,
                                                     reduction_level=reduction_level).run(pool)
    tavazoie_benchmark.generate_report()

    tavazoie_benchmark.perform_goea()
    tavazoie_benchmark.generate_goea_report()
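
In the setup above, the `Spectral` entry receives `data + abs(data.min().min()) + 1` rather than the raw matrix. When the global minimum is zero or negative, this offset moves the smallest entry to exactly 1, so every value becomes strictly positive. The source does not state why, but spectral methods commonly expect positive input. The arithmetic can be sketched with plain Python lists; `shift_to_positive` is a hypothetical helper for illustration, not part of biclustlib.

```python
def shift_to_positive(matrix):
    """Add abs(global minimum) + 1 to every entry of a list-of-lists matrix.

    Mirrors the expression `data + abs(data.min().min()) + 1` from the example.
    """
    global_min = min(min(row) for row in matrix)
    offset = abs(global_min) + 1
    return [[value + offset for value in row] for row in matrix]


toy = [[-2.0, 0.5],
       [1.0, -0.5]]
shifted = shift_to_positive(toy)
print(shifted)  # [[1.0, 3.5], [4.0, 2.5]]
```

Note that the offset uses the absolute value of the minimum, so data that is already positive is still shifted upward, by its minimum plus one.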

Download files

Source Distribution

biclustlib-0.0.13.tar.gz (18.7 MB view details)

Uploaded Source

Built Distribution

biclustlib-0.0.13-py3-none-any.whl (19.3 MB view details)

Uploaded Python 3

File details

Details for the file biclustlib-0.0.13.tar.gz.

File metadata

  • Download URL: biclustlib-0.0.13.tar.gz
  • Upload date:
  • Size: 18.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for biclustlib-0.0.13.tar.gz
Algorithm Hash digest
SHA256 ed21dfda5b9f7447f8b14a182b5fae951c8ad8885d5b3131cea467ed94502c90
MD5 6137befc04b5ab8f8e2766747add97fd
BLAKE2b-256 03e28dbf7b5edd44dd1a585524d0244b992a696a44f1fdb59b8e4ad32f9b0173
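
The published digests can be used to check that a downloaded file is intact. Below is a minimal sketch with the standard-library `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the commented usage is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the archive sits in the current directory:
# matches_published_hash(
#     "biclustlib-0.0.13.tar.gz",
#     "ed21dfda5b9f7447f8b14a182b5fae951c8ad8885d5b3131cea467ed94502c90",
# )
```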

File details

Details for the file biclustlib-0.0.13-py3-none-any.whl.

File metadata

  • Download URL: biclustlib-0.0.13-py3-none-any.whl
  • Upload date:
  • Size: 19.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for biclustlib-0.0.13-py3-none-any.whl
Algorithm Hash digest
SHA256 862795ccf30338336b9f6e95c205ef20e0381f1c3c592977eed6be11e3c79dc6
MD5 fb8b9899f1dbe1792683c11f46973bbc
BLAKE2b-256 12e27d77584bf22587a8127807549619bf4f99e3687a78ebc2a5be2745e3e37c
