A biclustering library with datasets, evaluation measures and a benchmarking framework

Project description

biclustlib

This package is an extension of the biclustlib Python library by Victor Alexandre Padilha.
Reading the original repository first is highly recommended.
The goal of this package is to provide a unified biclustering framework for research on gene expression data and for comparing biclustering algorithms and evaluation measures.

Distributed under the GPLv3 license.

Installation

pip install biclustlib
You must also install R and the following R packages:

Benchmarking example

import multiprocessing

import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

from biclustlib.algorithms import *
from biclustlib.algorithms.wrappers import *
from biclustlib.benchmark import GeneExpressionBenchmark, Algorithm
from biclustlib.benchmark.data import load_tavazoie, load_prelic


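def discretize_data(raw_data: pd.DataFrame, n_bins: int = 2) -> pd.DataFrame:
    """Bin each column with 1-D k-means; returns booleans for 2 bins, integer levels otherwise."""
    binned = KBinsDiscretizer(n_bins=n_bins, encode='ordinal', strategy='kmeans').fit_transform(raw_data)
    # Only the row index is carried over; column labels are reset to 0..n-1.
    return pd.DataFrame(binned, index=raw_data.index).astype(int if n_bins > 2 else bool)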


if __name__ == '__main__':
    # Worker pool passed to the benchmark run below.
    pool = multiprocessing.Pool()

    data = load_tavazoie()
    n_biclusters = 10
    reduction_level = 10
    discretization_level = 30

    # Input variants: multi-level discrete data and binary data.
    data_dis = discretize_data(data, discretization_level)
    data_bin = discretize_data(data)

    # Each entry pairs a display name and an algorithm instance with the data variant it is run on.
    setup = [
        Algorithm('CCA', ChengChurchAlgorithm(n_biclusters), data),
        Algorithm('xMotifs', RConservedGeneExpressionMotifs(n_biclusters), data_dis),
        Algorithm('BiBit', BitPatternBiclusteringAlgorithm(), data_bin),
        Algorithm('Bimax', RBinaryInclusionMaximalBiclusteringAlgorithm(n_biclusters), data_bin),
        Algorithm('LAS', LargeAverageSubmatrices(n_biclusters), data),
        Algorithm('Plaid', RPlaid(n_biclusters), data),
        # Spectral is given a copy shifted so that every value is at least 1.
        Algorithm('Spectral', Spectral(n_clusters=data.shape[1] // 2), data + abs(data.min().min()) + 1),
    ]

    tavazoie_benchmark = GeneExpressionBenchmark(algorithms=setup,
                                                 raw_data=data,
                                                 reduction_level=reduction_level).run(pool)
    tavazoie_benchmark.generate_report()

    # Run the GOEA step and write its report.
    tavazoie_benchmark.perform_goea()
    tavazoie_benchmark.generate_goea_report()

    # Stop accepting work and wait for the workers to exit.
    pool.close()
    pool.join()
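
The preprocessing in the example can be hard to picture on a real expression matrix, so here is a minimal sketch of the same two steps on a toy table. It needs only pandas and scikit-learn, not biclustlib or R. The toy values and the names `toy`, `cond_a`, `cond_b`, `toy_bin`, `toy_dis` and `toy_pos` are made up for illustration; `discretize_data` mirrors the helper from the example.

```python
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer


def discretize_data(raw_data: pd.DataFrame, n_bins: int = 2) -> pd.DataFrame:
    # Mirrors the helper in the benchmarking example: each column is binned
    # independently using 1-D k-means, and only the row index is kept.
    binned = KBinsDiscretizer(n_bins=n_bins, encode='ordinal',
                              strategy='kmeans').fit_transform(raw_data)
    return pd.DataFrame(binned, index=raw_data.index).astype(int if n_bins > 2 else bool)


# Hypothetical toy matrix: 6 genes (rows) x 2 conditions (columns).
toy = pd.DataFrame(
    {'cond_a': [-2.0, -1.9, 0.2, 0.3, 1.9, 2.0],
     'cond_b': [0.0, 0.2, 5.0, 5.2, 10.0, 10.2]},
    index=[f'gene_{i}' for i in range(6)],
)

toy_bin = discretize_data(toy)     # two bins  -> boolean matrix
toy_dis = discretize_data(toy, 3)  # three bins -> integer levels 0, 1, 2

# Same shift as the Spectral entry in the example: add |global minimum| + 1
# so that the smallest value in the table becomes 1.
toy_pos = toy + abs(toy.min().min()) + 1

print(toy_bin)
print(toy_dis)
print(toy_pos)
```

Because the helper builds the result from a bare array, the discretized frames have integer column labels (0, 1) rather than the original condition names, while the gene index is preserved.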
