ChromBPNet Pytorch
- PyTorch implementation of ChromBPNet.
- Please refer to the original code and the paper, "ChromBPNet: Bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants", by Anusri Pampari*, Anna Shcherbina*, Anshul Kundaje (*authors contributed equally).
- This repo also refers to bpnet-lite and uses tangermeme for interpretation, two very useful repos by Jacob Schreiber.
Reproduce Official ChromBPNet performance
Pearson correlation on counts prediction of peaks
official chrombpnet (left) vs pytorch chrombpnet (right)
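As a rough illustration of the metric named in this comparison, the sketch below computes a Pearson correlation between observed and predicted per-peak counts. The count values are made up for illustration, and comparing on the log(1 + count) scale is an assumption rather than something stated here; the evaluation code in the package may differ.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-peak total counts (observed vs. predicted), not real results.
observed = [120, 45, 300, 80, 15]
predicted = [110, 50, 280, 95, 20]

# Assumption: compare on the log(1 + count) scale.
log_obs = [math.log1p(c) for c in observed]
log_pred = [math.log1p(c) for c in predicted]

r = pearson_r(log_obs, log_pred)
print(f"counts Pearson r = {r:.3f}")
```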
Attribution score
A genome browser session compares the profile predictions and attribution scores of the official ChromBPNet and the PyTorch implementation with n_filters = 512 and 128.
Installation
Install from PyPI
pip install chrombpnet-pytorch
Install from source
pip install git+https://github.com/jsxlei/chrombpnet-pytorch.git
QuickStart
Before training
- Download the genome or use your own genome data
- Download the ENCODE K562 ATAC data or use your own ATAC data
Bias-factorized ChromBPNet training
Please refer to data_config to define your own dataset, or pass the input files on the command line.
If your <data_path> contains peaks.bed, negatives.bed, unstranded.bw, and bias_scaled.h5:
chrombpnet train --data_dir <data_path>
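Before launching training with `--data_dir`, it can help to confirm that the directory really holds the four files named above. The following is a minimal sketch of such a check; `missing_inputs` is a hypothetical helper written for this example, not part of the package.

```python
from pathlib import Path

# File names taken from the text above.
REQUIRED_FILES = ("peaks.bed", "negatives.bed", "unstranded.bw", "bias_scaled.h5")


def missing_inputs(data_dir):
    """Return the required file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Check the current directory; replace "." with your own data path.
missing = missing_inputs(".")
if missing:
    print("missing:", ", ".join(missing))
else:
    print("all required inputs found")
```

If anything is reported missing, the explicit-flag form of the training command is the alternative.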
Otherwise, pass each file explicitly:
chrombpnet train --peaks <peak_file> --negatives <negative_file> --bigwig <unstrand.bw> --bias <bias_scaled.h5> --adjust_bias
Predict with a pretrained model in .h5, .cpkt, or .pt format
chrombpnet predict --data_dir <data_path> --checkpoint chrombpnet_wo_bias.h5/best_model.cpkt/chrombpnet_wo_bias.pt -o <output_path>
Interpret by calculating attribution
chrombpnet interpret --data_dir <data_path> --checkpoint <model_cpkt/model_h5> -o <output_path>
Run the full pipeline (training, prediction, and interpretation)
chrombpnet --data_dir <data_path> -o <output_path>
Finetune model
chrombpnet finetune --data_dir <data_path> --checkpoint <model>.h5/pt -o <output_path>
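The commands above can also be driven from a script. The sketch below assembles them as argument lists using only the subcommands and flags shown in this section (`--data_dir`, `--checkpoint`, `-o`); `build_command` and `run` are hypothetical helpers, the paths are placeholders, and the default is a dry run that only prints each command.

```python
import shlex
import subprocess


def build_command(subcommand, data_dir, checkpoint=None, output=None):
    """Assemble an argv list for one of the chrombpnet subcommands shown above."""
    argv = ["chrombpnet", subcommand, "--data_dir", str(data_dir)]
    if checkpoint is not None:
        argv += ["--checkpoint", str(checkpoint)]
    if output is not None:
        argv += ["-o", str(output)]
    return argv


def run(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)


# Hypothetical paths, for illustration only.
run(build_command("train", "data/K562"))
run(build_command("predict", "data/K562", checkpoint="best_model.cpkt", output="out/K562"))
run(build_command("interpret", "data/K562", checkpoint="best_model.cpkt", output="out/K562"))
```

Building a list instead of a shell string avoids quoting problems when paths contain spaces.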
Variant scoring
snp_score \
-l $snps \
-g $ref_fasta \
-pg $ref_fasta_peaks \
-s $chrom_sizes \
-ps $chrom_sizes_peaks \
-m $model \
-p $peaks \
-o $out_prefix \
-t 2 \
-li \
-sc chrombpnet
Input Format
- --bigwig
- --peaks
- --negatives
Output Format
The output directory will be populated as follows for the fold_0 chromosome split:
fold_0\
    checkpoints\
        best_model.cpkt
        last.cpkt
        chrombpnet_nobias.pt (PyTorch model that predicts the bias-corrected accessibility profile)
    train.log
    predict.log
    evaluation\
        eval\
            all_regions.counts_pearsonr.png
            all_regions_jsd.profile_jsd.png
            peaks.counts_pearsonr.png
            peaks_jsd.profile_jsd.png
            regions.csv
            metrics.json
    interpret\
        counts\
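To check a finished run against this listing, a short script can look for the expected files. The sketch below searches by file name anywhere under the fold directory, so it does not depend on the exact nesting; `locate_outputs` is a hypothetical helper, and `fold_0` is assumed to be relative to the current directory.

```python
from pathlib import Path

# File names copied from the listing above; nesting is discovered, not assumed.
EXPECTED = (
    "best_model.cpkt",
    "last.cpkt",
    "chrombpnet_nobias.pt",
    "train.log",
    "predict.log",
    "metrics.json",
    "regions.csv",
)


def locate_outputs(fold_dir):
    """Map each expected file name to its path under fold_dir, or None if absent."""
    root = Path(fold_dir)
    if not root.is_dir():
        return {name: None for name in EXPECTED}
    found = {}
    for name in EXPECTED:
        matches = sorted(root.rglob(name))
        found[name] = matches[0] if matches else None
    return found


for name, path in locate_outputs("fold_0").items():
    print(f"{name:24s} {path if path else 'MISSING'}")
```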
How to Cite
If you're using ChromBPNet in your work, please cite as follows:
@article {Pampari2024.12.25.630221,
author = {Pampari, Anusri and Shcherbina, Anna and Kvon, Evgeny and Kosicki, Michael and Nair, Surag and Kundu, Soumya and Kathiria, Arwa S. and Risca, Viviana I. and Kuningas, Kristiina and Alasoo, Kaur and Greenleaf, William James and Pennacchio, Len A. and Kundaje, Anshul},
title = {ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants},
elocation-id = {2024.12.25.630221},
year = {2024},
doi = {10.1101/2024.12.25.630221},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221},
eprint = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221.full.pdf},
journal = {bioRxiv}
}