Open Bio-sequence Toolbox for Supervision and Self-Supervision Learning

Project description

OpenBioSeq

News

  • OpenBioSeq v0.1.1 is released. It supports classification and regression tasks on bio-sequence datasets and inherits most features from OpenMixup.
  • OpenBioSeq v0.1.0 is initialized.

Introduction

The main branch works with PyTorch 1.8 (required by some self-supervised methods) or higher (we recommend PyTorch 1.12). You can still use PyTorch 1.6 for most methods.
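
As a rough illustration of the version guidance above, the sketch below maps a PyTorch version string to the support tier described in this README. The function names and tier labels are hypothetical and are not part of OpenBioSeq. Versions below 1.6 are not discussed here, so the sketch only reports them as outside the documented range.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract (major, minor) from strings such as '1.12.1+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def support_level(version: str) -> str:
    """Map a PyTorch version to the tier stated in the README (labels are illustrative)."""
    major_minor = parse_version(version)
    if major_minor >= (1, 8):
        return "all methods"  # 1.8 or higher: main branch, including self-supervised methods
    if major_minor >= (1, 6):
        return "most methods"  # 1.6 and 1.7: usable for most methods
    return "below documented range"  # the README does not cover these versions


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print(torch.__version__, "->", support_level(torch.__version__))
```

The comparison uses only the major and minor components, so local build suffixes such as `+cu113` do not affect the result.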

OpenBioSeq is an open-source supervised and self-supervised bio-sequence representation learning toolbox based on PyTorch. OpenBioSeq supports popular backbones, pre-training methods, and various features.

What does this repo do?

Learning useful bio-sequence representations efficiently facilitates various downstream tasks in biology and chemistry. This repo, named OpenBioSeq, focuses on supervised and self-supervised bio-sequence representation learning.

Major features

This repo will continue to be updated in 2022. Please watch the repository for the latest updates!

Change Log

Please refer to CHANGELOG.md for details and release history.

[2022-06-09] OpenBioSeq v0.1.1 is released.

[2022-05-24] OpenBioSeq v0.1.0 is initialized.

Installation

Quick installation steps for development:

conda create -n openbioseq python=3.8 pytorch=1.12 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate openbioseq
pip install openmim
mim install mmcv-full
git clone https://github.com/Westlake-AI/OpenBioSeq.git
cd OpenBioSeq
python setup.py develop
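
After running the steps above, a quick sanity check can confirm that the expected packages are importable. This is a minimal sketch: `find_missing` is a hypothetical helper, and the import name `openbioseq` is an assumption (the README does not state it). `mmcv` is the import name provided by `mmcv-full`.

```python
import importlib.util


def find_missing(modules):
    """Return the top-level module names in `modules` that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # 'openbioseq' is an assumed import name; adjust it if the package differs.
    expected = ["torch", "torchvision", "mmcv", "openbioseq"]
    missing = find_missing(expected)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found")
```

`find_spec` only locates a module without importing it, so the check stays fast and does not initialize CUDA.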

Please refer to INSTALL.md for detailed installation instructions and dataset preparation.

Get Started

Please see Getting Started for the basic usage of OpenBioSeq (based on OpenMixup and MMSelfSup). For example, you can start multi-GPU training with a given CONFIG_FILE using the following script:

bash tools/dist_train.sh ${CONFIG_FILE} ${GPUS} [optional arguments]
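
If you prefer to launch training from Python, the command above can be assembled as an argument list. This is only a sketch: `build_dist_train_cmd` is a hypothetical helper, `configs/example.py` is a placeholder path, and no optional arguments are shown because the README does not list them.

```python
import shlex


def build_dist_train_cmd(config_file, gpus, extra_args=()):
    """Assemble the dist_train.sh invocation as a list suitable for subprocess.run."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return ["bash", "tools/dist_train.sh", str(config_file), str(gpus), *extra_args]


if __name__ == "__main__":
    # Placeholder config path; substitute a real config from the repository.
    cmd = build_dist_train_cmd("configs/example.py", 4)
    print(shlex.join(cmd))
```

To actually start training, pass the list to `subprocess.run(cmd, check=True)` from the repository root so that `tools/dist_train.sh` resolves.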

Then, please see the tutorials for more technical details (based on MMClassification).

License

This project is released under the Apache 2.0 license.

Acknowledgement

  • OpenBioSeq is an open-source project for supervised and self-supervised methods on bio-sequence datasets created by researchers in CAIRI AI LAB. We encourage researchers interested in bio-sequence research and applications to contribute to OpenBioSeq!
  • This repo is mainly based on OpenMixup, and borrows the architecture design and part of the code from MMSelfSup and MMClassification.

Citation

If you find this project useful in your research, please consider citing it:

@misc{2022openbioseq,
    title={{OpenBioSeq}: Open Toolbox and Benchmark for Bio-sequence Representation Learning},
    author={Li, Siyuan and Liu, Zicheng and Wu, Di and Li, Stan Z.},
    howpublished = {\url{https://github.com/Westlake-AI/openbioseq}},
    year={2022}
}

Contributors

For now, the direct contributors include Siyuan Li (@Lupin1998) and Zicheng Liu (@pone7). We thank the contributors of OpenMixup, MMSelfSup, and MMClassification.

Contact

This repo is currently maintained by Siyuan Li (lisiyuan@westlake.edu.cn) and Zicheng Liu (liuzicheng@westlake.edu.cn).


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

OpenBioSeq-0.1.1.tar.gz (236.6 kB)

Uploaded Source

Built Distribution

OpenBioSeq-0.1.1-py2.py3-none-any.whl (328.1 kB)

Uploaded Python 2 Python 3

File details

Details for the file OpenBioSeq-0.1.1.tar.gz.

File metadata

  • Download URL: OpenBioSeq-0.1.1.tar.gz
  • Upload date:
  • Size: 236.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.13

File hashes

Hashes for OpenBioSeq-0.1.1.tar.gz
Algorithm Hash digest
SHA256 d10a9951d664dc053002cbd027f8a9c8ddb45007aa19a8db3426c732c58caece
MD5 73e2db7d25bf697e0d2662a1aa7c612e
BLAKE2b-256 59b0b136457a3192cf9b7cb140ae07fc84ec1a318ee6ec8279ff18ade320a52c

See more details on using hashes here.

File details

Details for the file OpenBioSeq-0.1.1-py2.py3-none-any.whl.

File metadata

  • Download URL: OpenBioSeq-0.1.1-py2.py3-none-any.whl
  • Upload date:
  • Size: 328.1 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.13

File hashes

Hashes for OpenBioSeq-0.1.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 0a61fda35f8c16cba4f0c669f6b8143d12c9975bc1ea978c6e8e67bac2ecb8be
MD5 bc0fd395d4f1f1f713141e60372539e4
BLAKE2b-256 25932128ae0ebb55ae9887eb30979d85c3c34eceeabf3f3231109957037f7135
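
A downloaded file can be checked against the digests listed above. The sketch below is one way to do this with Python's standard `hashlib`; the helper names are hypothetical, and it assumes that the BLAKE2b-256 value corresponds to BLAKE2b with a 32-byte digest.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Compute SHA256 and BLAKE2b (32-byte digest) of a file, reading it in chunks."""
    sha256 = hashlib.sha256()
    blake2b_256 = hashlib.blake2b(digest_size=32)  # assumed to match "BLAKE2b-256"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
            blake2b_256.update(chunk)
    return {"sha256": sha256.hexdigest(), "blake2b_256": blake2b_256.hexdigest()}


def matches_sha256(path, expected_sha256):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return file_digests(path)["sha256"] == expected_sha256.strip().lower()
```

For example, `matches_sha256("OpenBioSeq-0.1.1.tar.gz", "d10a9951d664dc053002cbd027f8a9c8ddb45007aa19a8db3426c732c58caece")` should return `True` for an intact copy of the source distribution.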

