CluMo: Clustering-based Motif Discovery from Deep Learning Models

CluMo: Clustering-based Motif Discovery

CluMo is a Python package for discovering DNA motifs from deep learning models using clustering approaches. It uses feature attribution techniques to identify important regions in DNA sequences and clusters them to find common motifs.
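The idea described above (pick out the most important region of each sequence, then group similar regions) can be illustrated with a minimal, dependency-free sketch. It does not reproduce CluMo's internals: the attribution scores are made-up numbers standing in for what an attribution library such as Captum would return, the helper names `top_window`, `hamming` and `greedy_cluster` are hypothetical, and a simple greedy Hamming-distance grouping is swapped in for whatever clustering algorithm the package actually uses.

```python
from typing import List, Sequence, Tuple


def top_window(seq: str, scores: Sequence[float], k: int) -> Tuple[int, str]:
    """Return (start, k-mer) of the length-k window with the largest summed |score|."""
    if len(seq) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    if not 0 < k <= len(seq):
        raise ValueError("window length must be between 1 and len(seq)")
    abs_scores = [abs(s) for s in scores]
    window_sum = sum(abs_scores[:k])
    best_start, best_sum = 0, window_sum
    # Slide the window one position at a time, updating the running sum.
    for start in range(1, len(seq) - k + 1):
        window_sum += abs_scores[start + k - 1] - abs_scores[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, seq[best_start:best_start + k]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def greedy_cluster(kmers: Sequence[str], max_dist: int) -> List[List[str]]:
    """Assign each k-mer to the first cluster whose seed is within max_dist mismatches."""
    clusters: List[List[str]] = []
    for kmer in kmers:
        for cluster in clusters:
            if hamming(kmer, cluster[0]) <= max_dist:
                cluster.append(kmer)
                break
        else:
            clusters.append([kmer])  # no close cluster found: start a new one
    return clusters


# Toy data: hypothetical per-base attribution scores for three sequences.
examples = [
    ("ATCGATCG", [0.0, 0.1, 0.9, 0.8, 0.7, 0.1, 0.0, 0.0]),
    ("GCTAGCTA", [0.9, 0.8, 0.7, 0.0, 0.0, 0.0, 0.1, 0.0]),
    ("TTCGTAAA", [0.0, 0.0, 0.8, 0.9, 0.6, 0.0, 0.0, 0.0]),
]
kmers = [top_window(seq, scores, k=3)[1] for seq, scores in examples]
clusters = greedy_cluster(kmers, max_dist=1)
print(clusters)  # [['CGA', 'CGT'], ['GCT']]
```

Each cluster is a candidate motif: its members can be stacked into a position frequency matrix and drawn as a sequence logo.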

Installation

pip install clumo

Quick Start

import torch
from clumo import CluMo
import pandas as pd

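# Load your trained PyTorch model (TheModelClass, args, kwargs and PATH are placeholders)
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH, weights_only=True))
model.eval()  # put dropout and batch-norm layers into inference mode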

# Initialize CluMo with the model
clumo = CluMo(model=model, output_dir="results")

# Prepare your sequence data; the "efficiency" column can hold any other target sequence property
df = pd.DataFrame({
    "sequence": ["ATCGATCG", "GCTAGCTA", ...],
    "label": [1, 0, ...],
    "efficiency": [0.8, 0.2, ...]
})
motifs = clumo.analyze_from_dataframe(df, seq_col="sequence", label_col="label", eff_col="efficiency")

Features

  • Discover DNA motifs using deep learning feature attribution
  • Clustering approach to identify common motifs
  • Statistical significance testing for motif enrichment
  • Visualization of motifs as sequence logos
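To make the enrichment-testing feature concrete, here is a small sketch of one common way to test whether a motif is over-represented in positively labelled sequences: a one-sided Fisher's exact test on a 2x2 contingency table. This is an assumption about the kind of test involved, not a description of the test CluMo runs. The function names `count_table` and `enrichment_p_value` and the toy sequences are hypothetical, and the hypergeometric tail is computed directly with the standard library (in practice `scipy.stats.fisher_exact` is the usual choice).

```python
from math import comb
from typing import Sequence, Tuple


def count_table(seqs: Sequence[str], labels: Sequence[int], motif: str) -> Tuple[int, int, int, int]:
    """Count (pos with motif, pos without, neg with motif, neg without)."""
    a = b = c = d = 0
    for seq, label in zip(seqs, labels):
        has_motif = motif in seq
        if label == 1:
            a, b = a + has_motif, b + (not has_motif)
        else:
            c, d = c + has_motif, d + (not has_motif)
    return a, b, c, d


def enrichment_p_value(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test: P(at least `a` positives carry the motif by chance)."""
    n_total = a + b + c + d
    n_pos = a + b      # positively labelled sequences
    n_motif = a + c    # sequences containing the motif
    upper = min(n_pos, n_motif)
    # Sum the hypergeometric probabilities of tables at least as extreme as the observed one.
    tail = sum(
        comb(n_motif, x) * comb(n_total - n_motif, n_pos - x)
        for x in range(a, upper + 1)
    )
    return tail / comb(n_total, n_pos)


# Toy data: four positive and four negative sequences, testing the motif "CGA".
seqs = ["ATCGATCG", "TTCGAAAA", "CGATTTTT", "GGGGGGGG",
        "GCTAGCTA", "ACGATTTT", "TTTTTTTT", "AAAACCCC"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

table = count_table(seqs, labels, "CGA")
p_value = enrichment_p_value(*table)
print(table, round(p_value, 4))  # (3, 1, 1, 3) 0.2429
```

With only eight sequences the enrichment is not significant; real analyses would use many more sequences and correct for testing multiple motifs.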

Requirements

  • Python 3.9
  • PyTorch
  • Captum
  • NumPy
  • Pandas
  • Matplotlib
  • logomaker
  • scikit-learn
  • SciPy

Citation

Please use the following to cite our work:

@article{gimpel2024deep,
  title={Deep learning uncovers sequence-specific amplification bias in multi-template PCR},
  author={Gimpel, Andreas L and Fan, Bowen and Chen, Dexiong and W{\"o}lfle, Laetitia OD and Horn, Max and Meng-Papaxanthos, Laetitia and Antkowiak, Philipp L and Stark, Wendelin J and Christen, Beat and Borgwardt, Karsten and others},
  journal={bioRxiv},
  pages={2024--09},
  year={2024},
  publisher={Cold Spring Harbor Laboratory}
}

Download files

Source Distribution

clumo-0.1.0.tar.gz (6.7 kB)

Uploaded Source

Built Distribution

clumo-0.1.0-py3-none-any.whl (6.8 kB)

Uploaded Python 3

File details

Details for the file clumo-0.1.0.tar.gz.

File metadata

  • Download URL: clumo-0.1.0.tar.gz
  • Upload date:
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.7

File hashes

Hashes for clumo-0.1.0.tar.gz
Algorithm Hash digest
SHA256 5aabd1e75ba6b0f283b432574413454cda265aa1ac773921d6166d7912bd2d6c
MD5 e12cf8d5dc91e17e945194f9961a883a
BLAKE2b-256 680bc33b9b189b027665649b95ec9deeb907231b2dc78efcdcb7dbb79c9a74b5
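
A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below assumes the archive was saved in the current directory under its original name; `sha256_of` and `matches_published` are hypothetical helper names, not part of CluMo or pip.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """True if the file's SHA256 digest equals the published hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# SHA256 digest listed above for the source distribution.
EXPECTED_SHA256 = "5aabd1e75ba6b0f283b432574413454cda265aa1ac773921d6166d7912bd2d6c"
archive = Path("clumo-0.1.0.tar.gz")  # assumed download location

if archive.exists():
    print("digest OK" if matches_published(archive, EXPECTED_SHA256) else "digest MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

pip can enforce the same check at install time when the requirement is pinned with a `--hash=sha256:...` option in a requirements file.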

File details

Details for the file clumo-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: clumo-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.7

File hashes

Hashes for clumo-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 eb522b8bc097b611192c5846296d0b3b2c47f4b494573f25e8cb9f62cce8fd6d
MD5 2d479cef5aa419b135ba1c8c2f459a61
BLAKE2b-256 94d68a8a88a86762e156b0a516caacb8c0214de852e7cdd7e03f420deb91f083
