Bootstrap JTK analysis for circadian rhythm detection
BooteJTK
BooteJTK is an implementation of empirical JTK (eJTK) on parametrically bootstrapped resamplings of time series, used for detecting circadian rhythms in genomic data.
Based on BooteJTK by Alan Hutchison et al.; this fork improves Python 3 compatibility and integration with LIMBR.
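To make "parametrically bootstrapped resamplings" concrete, here is a minimal, generic sketch of a parametric bootstrap on a replicated time series. It is not BooteJTK's own code: the function name `parametric_bootstrap`, the choice of a normal distribution, and the use of the raw replicate standard deviation are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def parametric_bootstrap(replicates_by_time, n_boot, seed=None):
    """Draw n_boot synthetic time series from per-timepoint normal fits.

    replicates_by_time: one list of replicate measurements per timepoint.
    Hypothetical helper; the distribution and its parameters are assumptions.
    """
    rng = random.Random(seed)
    params = []
    for reps in replicates_by_time:
        mu = mean(reps)
        # A single replicate has no spread estimate, so fall back to 0.
        sigma = stdev(reps) if len(reps) > 1 else 0.0
        params.append((mu, sigma))
    # Each resampling draws one value per timepoint from its fitted normal.
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(n_boot)]


# Three timepoints with two replicates each, ten resamplings.
series = [[1.0, 1.2], [2.0, 2.4], [3.0, 2.8]]
resamples = parametric_bootstrap(series, n_boot=10, seed=0)
```

Each resampled series would then be scored for rhythmicity separately, and the per-resampling results summarised across the bootstrap.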
References
- Hutchison AL et al. (2016), "BooteJTK: Improved Rhythm Detection via Bootstrapping", bioRxiv.
- Hutchison AL, Maienschein-Cline M, Chiang AH et al. (2015), "Improved statistical methods enable greater sensitivity in rhythm detection for genome-wide data", PLoS Computational Biology 11(3): e1004094. doi:10.1371/journal.pcbi.1004094
Installation
pip install bootjtk
Requires Python 3.8 or later. All dependencies (numpy, scipy, pandas, matplotlib, statsmodels) are installed automatically.
Quick start
Run 10 bootstrap resamplings on data with 2 replicates per timepoint:
bootejtk-calcp -f example/TestInput4.txt -x MYPREFIX -r 2 -z 10
The -p (period), -s (phase), and -a (asymmetry) reference-file arguments default to the standard 24 h files bundled with the package, so they can be omitted for typical circadian analyses.
Usage
bootejtk-calcp — full pipeline
This is the main entry point. It runs BooteJTK bootstrapping followed by p-value calculation.
bootejtk-calcp -f <input_file> -x <prefix> -r <replicates> -z <bootstraps> [options]
| Option | Description | Default |
|---|---|---|
| `-f` / `--filename` | Input data file (tab-delimited, header row starting with # or ID) | required |
| `-x` / `--prefix` | Output file prefix | required |
| `-r` / `--reps` | Number of replicates per timepoint | 1 |
| `-z` / `--size` | Number of bootstrap resamplings | 500 |
| `-j` / `--workers` | Worker processes (0 = all CPUs) | 1 |
| `-w` / `--waveform` | Reference waveform shape (see below) | cosine |
| `-p` / `--period` | Period reference file | bundled 24 h file |
| `-s` / `--phase` | Phase reference file | bundled 0–22 h by 2 file |
| `-a` / `--width` | Asymmetry reference file | bundled 2–22 h by 2 file |
Run bootejtk-calcp --help to see all options and current defaults.
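When driving the pipeline from a Python script, the command line can be assembled as an argument list and handed to `subprocess`. The sketch below uses only the flags listed in the table above; the helper name `build_calcp_command` is hypothetical and not part of the package.

```python
import shlex


def build_calcp_command(filename, prefix, reps=1, size=500, workers=1, waveform="cosine"):
    """Build an argument list for bootejtk-calcp (hypothetical helper).

    Keyword defaults mirror the defaults in the options table.
    """
    return [
        "bootejtk-calcp",
        "-f", filename,
        "-x", prefix,
        "-r", str(reps),
        "-z", str(size),
        "-j", str(workers),
        "-w", waveform,
    ]


cmd = build_calcp_command("example/TestInput4.txt", "MYPREFIX", reps=2, size=10)
command_line = shlex.join(cmd)
print(command_line)
# To actually run it (requires bootjtk to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when file names contain spaces.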
bootejtk — core analysis only
Runs the BooteJTK analysis step without the CalcP p-value fitting step. Useful if you want to run CalcP separately or with custom settings.
bootejtk -f <input_file> -x <prefix> -r <replicates> -z <bootstraps> [options]
Waveform shapes
| Value | Shape |
|---|---|
| `cosine` (default) | Smooth sinusoidal peak |
| `trough` | Triangular trough |
| `impulse` | Narrow spike |
| `step` | Rectangular step |
Parallel processing
Use -j to speed up large datasets by distributing genes across CPUs:
bootejtk-calcp -f example/TestInput4.txt -x MYPREFIX -r 2 -z 50 -j 8
| `-j` value | Behaviour |
|---|---|
| 1 (default) | Sequential, single process |
| N > 1 | Use N worker processes |
| 0 | Use all available CPUs |
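The mapping from a `-j` value to an actual worker count can be sketched as follows. This illustrates the documented semantics only; `resolve_workers` is a hypothetical name, and the rejection of negative values is an assumption rather than documented behaviour.

```python
import os


def resolve_workers(j):
    """Translate a -j value into a concrete number of worker processes."""
    if j < 0:
        # Assumption: negative values are treated as invalid.
        raise ValueError("-j must be 0 or a positive integer")
    if j == 0:
        # os.cpu_count() can return None on unusual platforms; fall back to 1.
        return os.cpu_count() or 1
    return j


workers_default = resolve_workers(1)   # sequential
workers_eight = resolve_workers(8)     # eight worker processes
workers_all = resolve_workers(0)       # one per available CPU
```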
Input format
Tab-delimited text file. The header row must start with # or ID; subsequent columns are zeitgeber time labels (ZT0, ZT2, …). Each data row begins with a gene/feature identifier.
# ZT0 ZT2 ZT4 ZT6 ...
gene1 1.23 2.45 3.10 2.88 ...
gene2 5.01 4.87 3.92 4.10 ...
Time labels can use decimal values (e.g. ZT14.7) and do not need to be evenly spaced.
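A quick way to sanity-check an input file before a long run is to parse the header and rows yourself. The sketch below follows the format rules stated above (tab-delimited, header starting with # or ID, time labels such as ZT14.7, with CT accepted as in the FAQ); the helper names are hypothetical, it assumes every data value is numeric, and it is not the parser BooteJTK itself uses.

```python
import io


def parse_time_label(label):
    """Convert a label such as 'ZT14.7' or 'CT3' to a float number of hours."""
    label = label.strip()
    if not label.upper().startswith(("ZT", "CT")):
        raise ValueError(f"unexpected time label: {label!r}")
    return float(label[2:])


def read_timeseries(handle):
    """Return (times, rows) from a tab-delimited BooteJTK-style input."""
    header = handle.readline().rstrip("\n").split("\t")
    if not header[0].startswith(("#", "ID")):
        raise ValueError("header row must start with # or ID")
    times = [parse_time_label(x) for x in header[1:]]
    rows = {}
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        # Assumption: all measurements parse as floats.
        rows[fields[0]] = [float(v) for v in fields[1:]]
    return times, rows


sample = "#\tZT0\tZT2\tZT14.7\ngene1\t1.23\t2.45\t3.10\n"
times, rows = read_timeseries(io.StringIO(sample))
```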
Output files
Running bootejtk-calcp produces five output files, all prefixed with the value passed to -x:
| File | Contents |
|---|---|
| `*_GammaP.txt` | BooteJTK output with Gamma-fitted p-values |
| `*.txt` | Main BooteJTK output (best-matching waveform per gene, feeds into CalcP) |
| `*_order_probs.pkl` | Pickle: per-gene [means, stds, ns] and rank-order bootstrap frequencies |
| `*_order_probs_vars.pkl` | Pickle: per-gene tau and phase probability distributions |
| `*_NULL1000.txt` | Randomly generated null time series used to fit the null tau distribution |
Running the example command on an already-existing output directory appends _1 to the output filenames.
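Because the full output filenames are only described by pattern here, a small helper that groups files by their suffix can be handy in downstream scripts. This is a sketch under assumptions: `classify_outputs` and its category names are hypothetical, and it relies only on the prefix and suffix conventions listed in the table.

```python
from pathlib import Path

# Suffixes from the output table; any other .txt file is treated as the main output.
KNOWN_SUFFIXES = {
    "_GammaP.txt": "gamma_p",
    "_NULL1000.txt": "null_series",
    "_order_probs_vars.pkl": "order_probs_vars",
    "_order_probs.pkl": "order_probs",
}


def classify_outputs(directory, prefix):
    """Group files starting with `prefix` by the kind of output they hold."""
    found = {}
    for path in sorted(Path(directory).glob(prefix + "*")):
        for suffix, kind in KNOWN_SUFFIXES.items():
            if path.name.endswith(suffix):
                found.setdefault(kind, []).append(path)
                break
        else:
            # No specific suffix matched: a plain .txt is the main output.
            if path.suffix == ".txt":
                found.setdefault("main", []).append(path)
    return found
```

For example, `classify_outputs(".", "MYPREFIX")["gamma_p"]` would list the Gamma-fitted p-value files from a run started with `-x MYPREFIX`, if any exist.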
FAQ
Can I use non-integer or uneven time intervals (e.g. ZT14.7)?
Yes. The label just needs to start with ZT or CT; decimal values are read correctly.
Does BooteJTK handle uneven sampling intervals?
Yes. All timepoints in the header are used as given.
Why does BooteJTK report phases like 14.4 that don't match my sampling intervals?
BooteJTK runs bootstrap resamplings and reports the mean phase across those resamplings. For example, if 8 of 10 resamplings give phase 14 and 2 give phase 16, the reported mean phase is 14.4.
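The arithmetic in this answer can be reproduced directly. The snippet uses a plain arithmetic mean, as the example describes; whether BooteJTK treats phases near the 0/24 h wrap-around differently is not stated here, so that case is left out.

```python
from statistics import mean

# Best-matching phase from each of 10 bootstrap resamplings (values from the FAQ example).
phases = [14] * 8 + [16] * 2

mean_phase = mean(phases)
print(mean_phase)  # 14.4
```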
Do the phase/asymmetry search intervals need to match the sampling intervals?
No. You can sample every hour but only search for phases every two hours, for example.
Development
git clone https://github.com/aleccrowell/BooteJTK-c
cd BooteJTK-c
pip install poetry
poetry install
poetry run pytest tests/ -v
License
Released under the MIT License. See LICENSE for details.