
Bootstrap JTK analysis for circadian rhythm detection


BooteJTK


BooteJTK is an implementation of empirical JTK (eJTK) on parametrically bootstrapped resamplings of time series, used for detecting circadian rhythms in genomic data.

Based on BooteJTK by Alan Hutchison et al.; this fork improves Python 3 compatibility and integration with LIMBR.

References

  • Hutchison AL et al. (2016), "BooteJTK: Improved Rhythm Detection via Bootstrapping", bioRxiv.
  • Hutchison AL, Maienschein-Cline M, Chiang AH et al. (2015), "Improved statistical methods enable greater sensitivity in rhythm detection for genome-wide data", PLoS Computational Biology 11(3): e1004094. doi:10.1371/journal.pcbi.1004094

Installation

pip install bootjtk

Requires Python 3.8 or later. All dependencies (numpy, scipy, pandas, matplotlib, statsmodels) are installed automatically.

Quick start

Run 10 bootstrap resamplings on data with 2 replicates per timepoint:

bootejtk-calcp -f example/TestInput4.txt -x MYPREFIX -r 2 -z 10

The -p (period), -s (phases), and -a (asymmetries) ref-file arguments default to the standard 24 h files bundled with the package, so they can be omitted for typical circadian analyses.

Usage

bootejtk-calcp — full pipeline

This is the main entry point. It runs BooteJTK bootstrapping followed by p-value calculation.

bootejtk-calcp -f <input_file> -x <prefix> -r <replicates> -z <bootstraps> [options]

Option            Description                                                          Default
-f / --filename   Input data file (tab-delimited, header row starting with # or ID)    required
-x / --prefix     Output file prefix                                                   required
-r / --reps       Number of replicates per timepoint                                   1
-z / --size       Number of bootstrap resamplings                                      500
-j / --workers    Worker processes (0 = all CPUs)                                      1
-w / --waveform   Reference waveform shape (see below)                                 cosine
-p / --period     Period reference file                                                bundled 24 h file
-s / --phase      Phase reference file                                                 bundled 0–22 h by 2 file
-a / --width      Asymmetry reference file                                             bundled 2–22 h by 2 file

Run bootejtk-calcp --help to see all options and current defaults.
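If you drive the pipeline from a Python script, the command line can be assembled from the options in the table above. The sketch below is only an illustration: build_calcp_command is a hypothetical helper, not part of the package, and its defaults simply mirror the documented ones.

```python
import shlex


def build_calcp_command(filename, prefix, reps=1, size=500, workers=1,
                        waveform="cosine"):
    """Return an argv list for bootejtk-calcp (hypothetical helper).

    Only flags documented in the options table are used; the period,
    phase and asymmetry reference files are left at their bundled defaults.
    """
    return [
        "bootejtk-calcp",
        "-f", filename,
        "-x", prefix,
        "-r", str(reps),
        "-z", str(size),
        "-j", str(workers),
        "-w", waveform,
    ]


cmd = build_calcp_command("example/TestInput4.txt", "MYPREFIX", reps=2, size=10)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.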

bootejtk — core analysis only

Runs the BooteJTK analysis step without the CalcP p-value fitting step. Useful if you want to run CalcP separately or with custom settings.

bootejtk -f <input_file> -x <prefix> -r <replicates> -z <bootstraps> [options]
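As a rough picture of what "parametric bootstrapping" means in this step, the sketch below draws synthetic time series from a per-timepoint mean and standard deviation. It assumes Gaussian resampling and uses a hypothetical function name and made-up numbers; it is not the package's actual code, which may estimate variances differently.

```python
import random
import statistics


def parametric_bootstrap(replicates_by_time, n_boot, seed=0):
    """Draw n_boot synthetic series, one value per timepoint.

    Assumption: each timepoint is resampled from Normal(mean, std) of its
    replicates. A timepoint with a single replicate gets std = 0.
    """
    rng = random.Random(seed)
    params = [
        (statistics.mean(v), statistics.stdev(v) if len(v) > 1 else 0.0)
        for v in replicates_by_time
    ]
    return [[rng.gauss(m, s) for m, s in params] for _ in range(n_boot)]


# Made-up data: three timepoints, the last with only one replicate.
example = [[1.0, 1.2], [2.0, 2.4], [3.0]]
boots = parametric_bootstrap(example, n_boot=10)
print(len(boots), len(boots[0]))
```

Each resampled series would then be scored against the reference waveforms, and the results summarised across resamplings.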

Waveform shapes

Value              Shape
cosine (default)   Smooth sinusoidal peak
trough             Triangular trough
impulse            Narrow spike
step               Rectangular step

Parallel processing

Use -j to speed up large datasets by distributing genes across CPUs:

bootejtk-calcp -f example/TestInput4.txt -x MYPREFIX -r 2 -z 50 -j 8

-j value       Behaviour
1 (default)    Sequential, single process
N > 1          Use N worker processes
0              Use all available CPUs
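The -j rules above amount to a small mapping from the flag value to a process count. The sketch below restates them in Python; resolve_workers is a hypothetical name used for illustration and may not match how the package implements it.

```python
import os


def resolve_workers(j):
    """Map a -j value to a worker count (illustrative, hypothetical helper).

    0 means "all available CPUs", 1 means sequential, N > 1 means N workers.
    """
    if j < 0:
        raise ValueError("-j must be 0 or a positive integer")
    if j == 0:
        # os.cpu_count() can return None, so fall back to one process.
        return os.cpu_count() or 1
    return j


print(resolve_workers(1), resolve_workers(8), resolve_workers(0))
```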

Input format

Tab-delimited text file. The header row must start with # or ID; subsequent columns are zeitgeber time labels (ZT0, ZT2, …). Each data row begins with a gene/feature identifier.

#	ZT0	ZT2	ZT4	ZT6	...
gene1	1.23	2.45	3.10	2.88	...
gene2	5.01	4.87	3.92	4.10	...

Time labels can use decimal values (e.g. ZT14.7) and do not need to be evenly spaced.
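A file in this layout can be produced with the standard library alone. The following sketch writes a header starting with # and one row per gene; make_input_table is a hypothetical helper, and the gene values are synthetic numbers generated only for the example.

```python
import csv
import io
import math


def make_input_table(times, genes):
    """Return tab-delimited text: a '#' header of ZT labels, then one row per gene."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # The :g format keeps ZT0 as "ZT0" and ZT14.7 as "ZT14.7".
    writer.writerow(["#"] + [f"ZT{t:g}" for t in times])
    for name, values in genes.items():
        writer.writerow([name] + [f"{v:.2f}" for v in values])
    return buf.getvalue()


# Unevenly spaced timepoints, including a decimal one.
times = [0, 2, 4, 6, 14.7]
# Synthetic 24 h cosine, for illustration only.
genes = {"gene1": [10 + 2 * math.cos(2 * math.pi * (t - 6) / 24) for t in times]}
text = make_input_table(times, genes)
print(text)
# To save it: open("my_input.txt", "w").write(text)
```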

Output files

Running bootejtk-calcp produces five output files, all prefixed with the value passed to -x:

File                      Contents
*_GammaP.txt              BooteJTK output with Gamma-fitted p-values
*.txt                     Main BooteJTK output (best-matching waveform per gene, feeds into CalcP)
*_order_probs.pkl         Pickle: per-gene [means, stds, ns] and rank-order bootstrap frequencies
*_order_probs_vars.pkl    Pickle: per-gene tau and phase probability distributions
*_NULL1000.txt            Randomly generated null time series used to fit the null tau distribution

If the example command is run again where its output files already exist, _1 is appended to the new output filenames.
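To check what a run produced, the files can be grouped by the suffixes listed in the table. This is a sketch under one assumption: that every output name starts with the prefix and ends with one of those suffixes. The text in between is not specified here, and classify_outputs is a hypothetical helper.

```python
from pathlib import Path

# Specific suffixes come first so the generic ".txt" only catches the main output.
SUFFIXES = [
    ("_GammaP.txt", "gamma p-values"),
    ("_NULL1000.txt", "null time series"),
    ("_order_probs_vars.pkl", "tau and phase distributions"),
    ("_order_probs.pkl", "order probabilities"),
    (".txt", "main output"),
]


def classify_outputs(directory, prefix):
    """Group files named <prefix>* in directory by their documented suffix."""
    found = {}
    for path in sorted(Path(directory).glob(prefix + "*")):
        for suffix, label in SUFFIXES:
            if path.name.endswith(suffix):
                found.setdefault(label, []).append(path.name)
                break
    return found


print(classify_outputs(".", "MYPREFIX"))
```

The .pkl files can then be opened with pickle.load; only unpickle files you generated yourself.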

FAQ

Can I use non-integer or uneven time intervals (e.g. ZT14.7)? Yes. The label just needs to start with ZT or CT; decimal values are read correctly.

Does BooteJTK handle uneven sampling intervals? Yes. All timepoints in the header are used as given.

Why does BooteJTK report phases like 14.4 that don't match my sampling intervals? BooteJTK runs bootstrap resamplings and reports the mean phase across those resamplings. For example, if 8 of 10 resamplings give phase 14 and 2 give phase 16, the reported mean phase is 14.4.
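The arithmetic in that example can be checked in a few lines. This assumes a plain arithmetic mean over the resamplings, as the example describes.

```python
from statistics import mean

# 8 resamplings with phase 14 and 2 with phase 16, as in the example above.
phases = [14] * 8 + [16] * 2
mean_phase = mean(phases)
print(mean_phase)  # 14.4
```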

Do the phase/asymmetry search intervals need to match the sampling intervals? No. You can sample every hour but only search for phases every two hours, for example.

Development

git clone https://github.com/aleccrowell/BooteJTK-c
cd BooteJTK-c
pip install poetry
poetry install
poetry run pytest tests/ -v

License

Released under the MIT License. See LICENSE for details.
