Open Bio-sequence Toolbox for Supervised and Self-Supervised Learning
OpenBioSeq
News
- OpenBioSeq v0.1.1 is released, which supports classification and regression tasks on bio-sequence datasets. It inherits most features from OpenMixup.
- OpenBioSeq v0.1.0 is initialized.
Introduction
The main branch works with PyTorch 1.8 (required by some self-supervised methods) or higher (we recommend PyTorch 1.12). You can still use PyTorch 1.6 for most methods.
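As a rough illustration of the version guidance above, the following sketch checks a version string against the PyTorch 1.8 minimum. The helper names `parse_version` and `meets_minimum` are hypothetical and are not part of OpenBioSeq.

```python
import re


def parse_version(version):
    """Extract (major, minor) from a version string such as '1.12.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=(1, 8)):
    """Return True if `version` is at least `minimum` (default: PyTorch 1.8)."""
    return parse_version(version) >= minimum


# Usage, assuming PyTorch is installed:
#   import torch
#   meets_minimum(torch.__version__)
```

Comparing integer tuples rather than raw strings avoids the common mistake of ranking "1.12" below "1.8".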
OpenBioSeq is an open-source supervised and self-supervised bio-sequence representation learning toolbox based on PyTorch. OpenBioSeq supports popular backbones, pre-training methods, and various features.
What does this repo do?
Learning useful bio-sequence representations efficiently facilitates various downstream tasks in biological and chemical fields. This repo focuses on supervised and self-supervised bio-sequence representation learning and is named OpenBioSeq.
Major features
This repo will continue to be updated in 2022. Please watch it for the latest updates!
Change Log
Please refer to CHANGELOG.md for details and release history.
[2022-06-09] OpenBioSeq v0.1.1 is released.
[2022-05-24] OpenBioSeq v0.1.0 is initialized.
Installation
Here are quick installation steps for development:
conda create -n openbioseq python=3.8 pytorch=1.12 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate openbioseq
pip install openmim
mim install mmcv-full
git clone https://github.com/Westlake-AI/OpenBioSeq.git
cd OpenBioSeq
python setup.py develop
Please refer to INSTALL.md for detailed installation instructions and dataset preparation.
Get Started
Please see Getting Started for the basic usage of OpenBioSeq (based on OpenMixup and MMSelfSup). As an example, you can start multi-GPU training with a given CONFIG_FILE using the following script:
bash tools/dist_train.sh ${CONFIG_FILE} ${GPUS} [optional arguments]
Then, please see the tutorials for more technical details (based on MMClassification).
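To make the placeholders in the launch command concrete, here is a small sketch that assembles it as an argument list. The function name `build_dist_train_command`, the config path, and the GPU count are all hypothetical; substitute a config file that actually exists in your checkout.

```python
import shlex


def build_dist_train_command(config_file, gpus, extra_args=()):
    """Assemble `bash tools/dist_train.sh ${CONFIG_FILE} ${GPUS} [optional arguments]`."""
    if gpus < 1:
        raise ValueError("gpus must be a positive integer")
    return ["bash", "tools/dist_train.sh", config_file, str(gpus), *extra_args]


# Hypothetical config path and GPU count, for illustration only.
cmd = build_dist_train_command("configs/my_task/my_config.py", 4)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root if you prefer launching training from Python.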
License
This project is released under the Apache 2.0 license.
Acknowledgement
OpenBioSeq is an open-source project for supervised and self-supervised methods on bio-sequence datasets created by researchers in CAIRI AI LAB. We encourage researchers interested in bio-sequence research and applications to contribute to OpenBioSeq!
- This repo is mainly based on OpenMixup, and borrows the architecture design and part of the code from MMSelfSup and MMClassification.
Citation
If you find this project useful in your research, please consider citing:
@misc{2022openbioseq,
title={{OpenBioSeq}: Open Toolbox and Benchmark for Bio-sequence Representation Learning},
author={Li, Siyuan and Liu, Zicheng and Wu, Di and Stan Z. Li},
howpublished = {\url{https://github.com/Westlake-AI/openbioseq}},
year={2022}
}
Contributors
For now, the direct contributors include: Siyuan Li (@Lupin1998) and Zicheng Liu (@pone7). We thank the contributors of OpenMixup, MMSelfSup, and MMClassification.
Contact
This repo is currently maintained by Siyuan Li (lisiyuan@westlake.edu.cn) and Zicheng Liu (liuzicheng@westlake.edu.cn).
Hashes for OpenBioSeq-0.1.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0a61fda35f8c16cba4f0c669f6b8143d12c9975bc1ea978c6e8e67bac2ecb8be
MD5 | bc0fd395d4f1f1f713141e60372539e4
BLAKE2b-256 | 25932128ae0ebb55ae9887eb30979d85c3c34eceeabf3f3231109957037f7135
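If you download the wheel manually, the SHA256 digest listed above can be checked with a short script. This is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the file path is assumed to be wherever you saved the wheel.

```python
import hashlib

# SHA256 digest copied from the hash table above.
EXPECTED_SHA256 = "0a61fda35f8c16cba4f0c669f6b8143d12c9975bc1ea978c6e8e67bac2ecb8be"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()


# Usage, assuming the wheel was saved to the current directory:
#   matches_expected("OpenBioSeq-0.1.1-py2.py3-none-any.whl")
```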