bioScience: A new Python science library for High-Performance Computing Bioinformatics Analytics

bioScience is a Python library for bioinformatics data analysis built on High-Performance Computing (HPC). It covers loading specialised gene expression datasets (microarrays, RNA-Seq, etc.), pre-processing techniques, and data mining algorithms suited to these types of datasets. bioScience is designed to handle large volumes of biological data, giving users efficient and scalable tools for analysing genomic and transcriptomic data on parallel architectures, including clusters composed of CPUs and GPUs.

bioScience features:

  • Unified APIs, detailed documentation, and interactive examples available to the community.

  • Complete coverage for generating biological results from gene co-expression datasets.

  • Optimized models that generate results in the shortest possible time.

  • Optimization for High-Performance Computing (HPC) and Big Data ecosystems.


Installation

It is recommended to use pip for installation. Please make sure the latest version is installed, as bioScience is updated frequently:

pip install bioscience            # normal install
pip install --upgrade bioscience  # or update if needed
pip install --pre bioscience      # or include pre-release version for new features

Alternatively, you can clone the repository and install from source:

git clone https://github.com/aureliolfdez/bioscience.git
cd bioscience
pip install .

Required Dependencies:

  • Python>=3.11

  • numpy>=2.0.1

  • pandas>=2.2.2

  • scikit-learn>=1.5.1

  • numba>=0.60.0

  • seaborn>=0.13.2

  • matplotlib>=3.9.0

  • setuptools>=75.1.0

  • requests>=2.32.3
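The minimum versions listed above can be checked against the current environment. The sketch below uses only the standard library. The helper names `version_tuple` and `check_dependencies` are illustrative and not part of bioScience, and the version parsing is deliberately simplified: it reads only the leading digits of each dot-separated part, so pre-release suffixes are ignored.

```python
import sys
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "numpy": "2.0.1",
    "pandas": "2.2.2",
    "scikit-learn": "1.5.1",
    "numba": "0.60.0",
    "seaborn": "0.13.2",
    "matplotlib": "3.9.0",
    "setuptools": "75.1.0",
    "requests": "2.32.3",
}


def version_tuple(version):
    """Turn '2.0.1' into (2, 0, 1), keeping the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first part with no leading digits, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def check_dependencies(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    python_ok = sys.version_info >= (3, 11)
    print(f"Python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}")
    for name, (installed, ok) in check_dependencies().items():
        status = "ok" if ok else "missing or too old"
        print(f"{name}: {installed} ({status})")
```

Running the script prints one line per dependency. Since pip resolves these requirements on install, a check like this is mainly useful for diagnosing an existing environment.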


API demo

import bioscience as bs

if __name__ == "__main__":
    # Load the RNA-Seq dataset
    dataset = bs.load(path="datasets/rnaseq.txt", index_gene=0, index_lengths=1, naFilter=True, head=0)

    # RNA-Seq preprocessing: TPM normalization
    bs.tpm(dataset)

    # Binary preprocessing
    bs.binarize(dataset)

    # Data mining phase: run BiBit
    listModels = bs.bibit(dataset, cMnr=2, cMnc=2, mode=3, deviceCount=1, debug=True)

    # Save results
    bs.saveGenes(path="/path/", models=listModels, data=dataset)
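To make the three stages of the demo more concrete, here is a minimal NumPy sketch of what TPM normalization, binarization, and a BiBit-style pattern search compute on toy data. This is not bioScience's implementation. The function names `tpm_sketch`, `binarize_sketch`, and `pattern_search_sketch`, the fixed threshold, and the toy matrices are all assumptions made for illustration. The pattern search is a simplified, sequential version of the idea, and parameters such as `mode` and `deviceCount` are not modelled.

```python
import numpy as np


def tpm_sketch(counts, lengths_bp):
    """Transcripts per million for a (genes x samples) matrix of raw counts."""
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float) / 1000.0
    # Reads per kilobase: correct each gene (row) for its length.
    rpk = counts / lengths_kb[:, np.newaxis]
    # Scale each sample (column) so that its values sum to one million.
    # Note: a sample with no reads at all would divide by zero here.
    return rpk / rpk.sum(axis=0, keepdims=True) * 1e6


def binarize_sketch(values, threshold):
    """1 where a value is above the (hypothetical) threshold, otherwise 0."""
    return (np.asarray(values) > threshold).astype(int)


def pattern_search_sketch(binary, min_rows, min_cols):
    """Simplified BiBit-style search: AND pairs of rows, collect matching rows."""
    n_cols = len(binary[0])
    # Encode every row as an integer bit mask (leftmost column = highest bit).
    masks = [int("".join(str(int(v)) for v in row), 2) for row in binary]
    seen, found = set(), []
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            pattern = masks[i] & masks[j]
            # Skip patterns already explored or with too few columns.
            if pattern in seen or bin(pattern).count("1") < min_cols:
                continue
            seen.add(pattern)
            # A row belongs to the bicluster if it contains the whole pattern.
            rows = [k for k, m in enumerate(masks) if (m & pattern) == pattern]
            if len(rows) >= min_rows:
                cols = [c for c in range(n_cols)
                        if (pattern >> (n_cols - 1 - c)) & 1]
                found.append((rows, cols))
    return found


# Toy data: 3 genes x 2 samples, with gene lengths in base pairs.
counts = np.array([[10, 0], [20, 30], [30, 30]])
lengths = [1000, 2000, 3000]
tpm = tpm_sketch(counts, lengths)
binary = binarize_sketch(tpm, threshold=350_000)

# Separate toy binary matrix (4 genes x 4 samples) for the pattern search.
toy = [[1, 1, 0, 1],
       [1, 1, 0, 0],
       [0, 1, 1, 1],
       [1, 1, 0, 1]]
found = pattern_search_sketch(toy, min_rows=2, min_cols=2)
print(found)
```

Each entry of `found` pairs a list of row (gene) indices with the column (sample) indices on which all of those rows are 1. The `min_rows` and `min_cols` arguments play a role analogous to the `cMnr` and `cMnc` arguments in the demo, although the exact semantics of the library's parameters should be taken from its documentation.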

Citing bioScience:

bioScience is published in SoftwareX. If you use bioScience in a scientific publication, we would appreciate citations to the following paper:

López-Fernández, A., Gómez-Vela, F. A., Gonzalez-Dominguez, J., & Bidare-Divakarachari, P. (2024). bioScience: A new python science library for high-performance computing bioinformatics analytics. SoftwareX, 26, 101666.
