MultiMolecule

[!TIP] Accelerate Molecular Biology Research with Machine Learning.

License: AGPL v3

MultiMolecule is a one-stop ecosystem for molecular machine learning. It connects datasets, model implementations, reusable dataset and neural-network modules, the DanLing-based runner for training and evaluation, and task-oriented inference pipelines for RNA, DNA, and protein workflows.

Get Started

Install the latest stable release from PyPI:

pip install multimolecule

Run a registered pipeline through the Hugging Face transformers interface:

import multimolecule  # registers MultiMolecule models and pipelines
from transformers import pipeline

predictor = pipeline("rna-secondary-structure", model="multimolecule/ernierna-ss")
result = predictor("AUCAGCCUUCGUUCUGUAAACGG")
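
Sequences often arrive in FASTA files rather than as string literals. The following is a minimal, standard-library-only sketch of reading FASTA records so that each sequence can be handed to a pipeline such as the one above. The `read_fasta` helper and the two-record input are hypothetical and exist only for illustration; this is not MultiMolecule's own reader (the `io` entry point listed under Explore covers FASTA).

```python
from io import StringIO
from typing import Iterator, List, Optional, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical input used only for illustration.
example = StringIO(">seq1 demo\nAUCAGCCUUCGU\nUCUGUAAACGG\n>seq2\nGGGAAACCC\n")
records = list(read_fasta(example))
```

Each sequence in `records` is a plain string, so it could be passed to `predictor(...)` in the same way as the literal in the example above.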

Load models directly when you need lower-level control:

import multimolecule

model = multimolecule.AutoModelForSequencePrediction.from_pretrained("multimolecule/basset")
tokenizer = multimolecule.AutoTokenizer.from_pretrained("multimolecule/basset")

Install the latest source version when you need unreleased changes:

pip install git+https://github.com/DLS5-Omics/MultiMolecule

Explore

| Entry point | Use it for |
| --- | --- |
| data | Task-aware datasets, data loading, and multi-task sampling. |
| datasets | Curated biomolecular datasets and task metadata. |
| io | FASTA, DBN, BPSEQ, and bpRNA ST readers and writers. |
| models | Model cards and API references for supported architectures. |
| tokenisers | DNA, RNA, protein, and dot-bracket tokenisers. |
| pipelines | Task-focused inference workflows for supported biological tasks. |
| runner | Training, evaluation, and inference configuration. |
| modules | Reusable neural-network building blocks. |
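
To make the structure formats named above concrete, here is a minimal, standard-library-only sketch of how a dot-bracket (DBN) string relates to BPSEQ rows. The helper names `dot_bracket_to_pairs` and `to_bpseq` are hypothetical and are not part of MultiMolecule's API. The sketch assumes plain `(`, `)`, and `.` symbols only, so pseudoknot brackets such as `[` and `]` are rejected.

```python
from typing import List, Tuple


def dot_bracket_to_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return sorted 0-based (i, j) base pairs encoded by a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)  # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))  # pair with the most recent '('
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def to_bpseq(sequence: str, structure: str) -> List[str]:
    """Render BPSEQ rows: 1-based index, base, partner index (0 if unpaired)."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    partner = [0] * len(sequence)
    for i, j in dot_bracket_to_pairs(structure):
        partner[i] = j + 1
        partner[j] = i + 1
    return [f"{k} {base} {partner[k - 1]}" for k, base in enumerate(sequence, start=1)]


# Hypothetical six-nucleotide hairpin used only for illustration.
pairs = dot_bracket_to_pairs("((..))")
rows = to_bpseq("GCAAGC", "((..))")
```

For this toy hairpin, `pairs` is `[(0, 5), (1, 4)]` and the first BPSEQ row is `1 G 6`. A stack is used because a closing bracket always pairs with the most recent unmatched opening bracket.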

Community

  • Discourse: release announcements, usage questions, model requests, RFCs, and community discussion.
  • GitHub Issues: reproducible bugs, API issues, and implementation-tracked feature requests.
  • Hugging Face: released checkpoints, datasets, and demo Spaces.

Citation

[!NOTE] The artifacts distributed in this repository are part of the MultiMolecule project. If MultiMolecule supports your research, please cite the MultiMolecule project as follows:

@software{chen_2024_12638419,
  author    = {Chen, Zhiyuan and Zhu, Sophia Y.},
  title     = {MultiMolecule},
  doi       = {10.5281/zenodo.12638419},
  publisher = {Zenodo},
  url       = {https://doi.org/10.5281/zenodo.12638419},
  year      = 2024,
  month     = may,
  day       = 4
}

License

We believe openness is the foundation of research.

MultiMolecule is licensed under the GNU Affero General Public License.

For additional terms and clarifications, please refer to our License FAQ.

Please join us in building an open research community.

SPDX-License-Identifier: AGPL-3.0-or-later
