D-Sites: Hybrid TFBS predictor (PWM + DNA shape + RF)
D-Sites: Hybrid TFBS Predictor for Bacterial Genomes
A comprehensive computational tool for predicting transcription factor binding sites (TFBS) in bacterial genomes using hybrid PWM, DNA shape features, and Random Forest classification.
🚀 Quick Start
Installation
git clone https://github.com/pankaj357/D-Sites.git
cd D-Sites
pip install -r requirements.txt
Basic Prediction
Minimal Command
python src/D-Sites.py --fasta <genome.fasta> \
--gff <annotation.gff> \
--motif <motif_file> \
--gene <TF_name> \
--genome_accession <accession_id>
Complete Example
python src/D-Sites.py \
--fasta <path_to_genome.fasta> \
--gff <path_to_annotation.gff> \
--motif <path_to_motif_file> \
--gene <TF_NAME> \
--genome_accession <GENOME_ACCESSION> \
--outdir results \
--n_trees 300 \
--neg_ratio 5 \
--prob_cutoff 0.5 \
--pad 10 \
--seed 42 \
--batch 10000 \
--up 300 \
--down 50 \
--auto_cutoff
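When D-Sites is run for several transcription factors or genomes, it can be convenient to assemble the command line from Python rather than by hand. The following is a minimal sketch, not part of D-Sites itself: the helper name `build_dsites_command` and the file names are hypothetical placeholders, and only flags shown in the examples above are used.

```python
import shlex
import sys


def build_dsites_command(required, optional=None, script="src/D-Sites.py"):
    """Assemble an argv list for D-Sites from dictionaries of flag -> value.

    Hypothetical helper: it only formats flags, it does not validate them.
    """
    cmd = [sys.executable, script]
    for flag, value in {**required, **(optional or {})}.items():
        if value is True:
            # Boolean switch such as --auto_cutoff: emit the flag alone.
            cmd.append(f"--{flag}")
        elif value is not None and value is not False:
            cmd.extend([f"--{flag}", str(value)])
    return cmd


cmd = build_dsites_command(
    required={
        "fasta": "genome.fasta",        # placeholder path
        "gff": "annotation.gff",        # placeholder path
        "motif": "motif_file",          # placeholder path
        "gene": "TF_NAME",              # placeholder TF name
        "genome_accession": "GENOME_ACCESSION",  # placeholder accession
    },
    optional={"outdir": "results", "n_trees": 300, "auto_cutoff": True},
)

# Show the command as a single shell-quoted string.
print(shlex.join(cmd))

# To actually run it (requires the cloned repository and real input files):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the list directly to `subprocess.run` avoids shell quoting problems with paths that contain spaces.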
Command Breakdown
Required Arguments
--fasta: Genome FASTA file path
--gff: Genome annotation file (GFF3 format)
--motif: TF motif file (JASPAR or MEME format)
--gene: Transcription factor name
--genome_accession: Genome accession ID
Optional Arguments with Defaults
--outdir results: Output directory
--n_trees 300: Number of Random Forest trees
--neg_ratio 5: Negative:Positive ratio
--prob_cutoff 0.5: Probability cutoff
--pad 10: Window padding around known sites
--seed 42: Random seed
--batch 10000: Batch size for processing
--up 300: Upstream promoter size
--down 50: Downstream promoter size
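The arguments and defaults listed above can be summarised as an `argparse` definition. This is an illustrative sketch only: the actual parser in `src/D-Sites.py` may differ, and the argument types (for example, `int` for `--neg_ratio`) are assumptions inferred from the default values.

```python
import argparse


def make_parser():
    """Sketch of a parser mirroring the documented flags (types are assumed)."""
    p = argparse.ArgumentParser(prog="D-Sites.py")

    # Required arguments
    p.add_argument("--fasta", required=True, help="Genome FASTA file path")
    p.add_argument("--gff", required=True, help="Genome annotation file (GFF3)")
    p.add_argument("--motif", required=True, help="TF motif file (JASPAR or MEME)")
    p.add_argument("--gene", required=True, help="Transcription factor name")
    p.add_argument("--genome_accession", required=True, help="Genome accession ID")

    # Optional arguments with the documented defaults
    p.add_argument("--outdir", default="results", help="Output directory")
    p.add_argument("--n_trees", type=int, default=300, help="Random Forest trees")
    p.add_argument("--neg_ratio", type=int, default=5, help="Negative:Positive ratio")
    p.add_argument("--prob_cutoff", type=float, default=0.5, help="Probability cutoff")
    p.add_argument("--pad", type=int, default=10, help="Padding around known sites")
    p.add_argument("--seed", type=int, default=42, help="Random seed")
    p.add_argument("--batch", type=int, default=10000, help="Batch size")
    p.add_argument("--up", type=int, default=300, help="Upstream promoter size")
    p.add_argument("--down", type=int, default=50, help="Downstream promoter size")

    # Appears as a bare switch in the complete example above
    p.add_argument("--auto_cutoff", action="store_true")
    return p


# Parse only the required flags (placeholder values) to see the defaults.
args = make_parser().parse_args([
    "--fasta", "genome.fasta",
    "--gff", "annotation.gff",
    "--motif", "motif_file",
    "--gene", "TF_NAME",
    "--genome_accession", "GENOME_ACCESSION",
])
print(vars(args))
```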
📈 Performance
D-Sites demonstrates:
- 3-4× higher precision in top predictions
- 3.02-3.42× enrichment in promoter regions
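The README does not state how promoter regions are delimited, but the `--up` and `--down` options suggest a window around each gene start. The sketch below shows one common convention, assuming GFF3-style 1-based inclusive coordinates, a window measured from the strand-aware gene start, and a linear (non-wrapping) genome. The function name `promoter_window` is hypothetical and this is not taken from the D-Sites source.

```python
def promoter_window(start, end, strand, up=300, down=50, genome_length=None):
    """Return 1-based inclusive (lo, hi) coordinates of a promoter window.

    start, end: gene coordinates as in GFF3 (1-based, inclusive, start <= end).
    strand: "+" or "-".
    up, down: bases upstream of, and downstream from, the gene start
              (the downstream part begins at the first base of the gene).
    """
    if strand == "+":
        # Gene start is `start`; upstream lies at lower coordinates.
        lo, hi = start - up, start + down - 1
    elif strand == "-":
        # Gene start is `end`; upstream lies at higher coordinates.
        lo, hi = end - down + 1, end + up
    else:
        raise ValueError(f"unexpected strand: {strand!r}")

    # Clip to the sequence boundaries (no wrap-around for circular genomes).
    lo = max(1, lo)
    if genome_length is not None:
        hi = min(hi, genome_length)
    return lo, hi


print(promoter_window(1000, 2000, "+"))
print(promoter_window(1000, 2000, "-"))
```

With the default values each unclipped window spans 350 bases: 300 upstream plus the first 50 bases of the gene.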
📝 Citation
If you use D-Sites in your research, please cite:
Pankaj et al. (2025). D-Sites: A hybrid machine-learning framework for prediction of transcription factor binding sites in bacterial genomes. Information Sciences (under review), 2024.
📄 License
MIT License - see LICENSE for details.
💬 Contact
For questions and support, contact:
- Pankaj: ft.pank@gmail.com
- Kanaka KK: kkokay07@gmail.com
File details
Details for the file dsites-1.1.1.tar.gz.
File metadata
- Download URL: dsites-1.1.1.tar.gz
- Upload date:
- Size: 132.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87cdab7ce2acc4ff7fc21e819be90e2df77347ae0d2b687cb5603e6e18add361 |
| MD5 | 21cfd48420da463777c33e4fc949183a |
| BLAKE2b-256 | 98dca1f236d068dc1db969feeaa680914fb9bcf49b9458938c3a24db3f6450bc |
File details
Details for the file dsites-1.1.1-py3-none-any.whl.
File metadata
- Download URL: dsites-1.1.1-py3-none-any.whl
- Upload date:
- Size: 131.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 088fee63a5ec57dd9f2dc23bfcd5a7c01348edac524ad523e6ed1b62a50f8123 |
| MD5 | 66d5e4f0b2cbc435f3c9e86c43872e5f |
| BLAKE2b-256 | 4199a83cb0701267273aa057473e0f4c051a0046d37581d1e67970e7b627b126 |