Accelerate Molecular Biology Research with Machine Learning
MultiMolecule
[!TIP] Accelerate Molecular Biology Research with Machine Learning.
MultiMolecule is a one-stop ecosystem for molecular machine learning. It connects datasets, model implementations, reusable dataset and neural-network modules, the DanLing-based runner for training and evaluation, and task-oriented inference pipelines for RNA, DNA, and protein workflows.
Get Started
Install the latest stable release from PyPI:
pip install multimolecule
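To confirm that the install landed in the active environment, one option is to query the package metadata from Python. The sketch below uses only the standard library; `installed_version` is a hypothetical helper introduced for illustration and is not part of MultiMolecule.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Hypothetical check: report whether the distribution is visible here.
found = installed_version("multimolecule")
print(f"multimolecule: {found or 'not installed'}")
```

If this prints `not installed`, the `pip` command probably ran against a different interpreter or virtual environment than the one running the script.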
Run a registered pipeline through the Hugging Face transformers interface:
import multimolecule # registers MultiMolecule models and pipelines
from transformers import pipeline
predictor = pipeline("rna-secondary-structure", model="multimolecule/ernierna-ss")
result = predictor("AUCAGCCUUCGUUCUGUAAACGG")
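When the sequences live in a FASTA file rather than in a string literal, they need to be parsed before they reach the predictor. The following is a minimal plain-Python sketch of that step. `read_fasta` is a hypothetical helper written for this example; it is not the reader shipped in MultiMolecule's `io` entry point, and the two records are made-up sample data.

```python
from typing import Iterator, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            # Sequence lines may wrap, so collect them until the next header.
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


example = """>seq1 example record
AUCAGCCUUCG
UUCUGUAAACGG
>seq2
GGGAAACCC
"""

records = list(read_fasta(example))
for name, sequence in records:
    # Each sequence could then be passed on, e.g. predictor(sequence).
    print(name, len(sequence))
```

Keeping the parser separate from the model call means the same loop works with any callable that accepts a sequence string.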
Load models directly when you need lower-level control:
import multimolecule
model = multimolecule.AutoModelForSequencePrediction.from_pretrained("multimolecule/basset")
tokenizer = multimolecule.AutoTokenizer.from_pretrained("multimolecule/basset")
Install the latest source version when you need unreleased changes:
pip install git+https://github.com/DLS5-Omics/MultiMolecule
Explore
| Entry point | Use it for |
|---|---|
| data | Task-aware datasets, data loading, and multi-task sampling. |
| datasets | Curated biomolecular datasets and task metadata. |
| io | FASTA, DBN, BPSEQ, and bpRNA ST readers and writers. |
| models | Model cards and API references for supported architectures. |
| tokenisers | DNA, RNA, protein, and dot-bracket tokenisers. |
| pipelines | Task-focused inference workflows for supported biological tasks. |
| runner | Training, evaluation, and inference configuration. |
| modules | Reusable neural-network building blocks. |
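For readers new to the dot-bracket (DBN) notation mentioned above: it writes an RNA secondary structure as one symbol per nucleotide, with `.` for an unpaired base and a matching `(` and `)` for each base pair. The sketch below converts such a string into index pairs with a stack. `dot_bracket_to_pairs` is a hypothetical helper for illustration, not MultiMolecule's own implementation, and it assumes plain parentheses only (no pseudoknot brackets such as `[` or `{`).

```python
from typing import List, Tuple


def dot_bracket_to_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return 0-based (i, j) index pairs for matched parentheses."""
    stack = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            # Remember the opening position until its partner appears.
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {index}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


pairs = dot_bracket_to_pairs("((..))")
print(pairs)  # [(0, 5), (1, 4)]
```

Raising on unbalanced input, rather than silently dropping the stray bracket, makes malformed structure strings visible early.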
Community
- Discourse: release announcements, usage questions, model requests, RFCs, and community discussion.
- GitHub Issues: reproducible bugs, API issues, and implementation-tracked feature requests.
- Hugging Face: released checkpoints, datasets, and demo Spaces.
Citation
[!NOTE] The artifacts distributed in this repository are part of the MultiMolecule project. If MultiMolecule supports your research, please cite the MultiMolecule project as follows:
@software{chen_2024_12638419,
author = {Chen, Zhiyuan and Zhu, Sophia Y.},
title = {MultiMolecule},
doi = {10.5281/zenodo.12638419},
publisher = {Zenodo},
url = {https://doi.org/10.5281/zenodo.12638419},
year = 2024,
month = may,
day = 4
}
License
We believe openness is the Foundation of Research.
MultiMolecule is licensed under the GNU Affero General Public License.
For additional terms and clarifications, please refer to our License FAQ.
Please join us in building an open research community.
SPDX-License-Identifier: AGPL-3.0-or-later