
Map and process ChIP-seq, RNA-seq, ATAC-seq, MNase-seq, HiC and shotgun reads


tinyMapper

A minimalist yet versatile workflow to process ChIP-seq (with or without input/spikein), RNA-seq, MNase-seq, ATAC-seq, Hi-C and shotgun sequencing data. Hi-C mode delegates to hicstuff and cooler.

tinyMapper supports both paired-end and single-end reads, with two exceptions: Hi-C mode and ChIP spikein calibration require paired-end data. For single-end MNase, fragment-size filtering is skipped and only a standard CPM track is produced.

Note: tinyMapper is a Python package that orchestrates external CLI tools (bowtie2, STAR, samtools, deeptools, macs3, hicstuff). It does not re-implement alignment or peak-calling.

DISCLAIMER:

  • This is by no means the "best" or "only" way to process sequencing data. Feedback and suggestions are welcome.
  • This workflow does NOT include QC / validation. Run fastqc on raw reads at a minimum.

Installation

tinyMapper is a Python package. The recommended install creates a micromamba environment that bundles the Python package together with all bioinformatics tools (bowtie2, STAR, samtools, deeptools, macs3, hicstuff, cooler, bedtools).

Recommended — full install via micromamba

Requires micromamba.

micromamba env create -n tinymapper -f https://raw.githubusercontent.com/js2264/tinyMapper/refs/heads/master/env/conda-lock.yml -y
micromamba activate tinymapper
tinymapper --help

Alternative — Python package only

If all bioinformatics tools are already available in your environment:

uv venv
uv pip install git+https://github.com/js2264/tinyMapper.git
tinymapper --help
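
Before relying on the package-only install, it can help to confirm that the external executables are actually on PATH. The sketch below is not part of tinyMapper: `missing_tools` is a hypothetical helper, and the executable names are assumptions based on the tools listed above (deeptools is checked through its `bamCoverage` command).

```python
import shutil

# Executables assumed to correspond to the tools listed in this README.
REQUIRED_TOOLS = ["bowtie2", "STAR", "samtools", "bamCoverage",
                  "macs3", "hicstuff", "cooler", "bedtools"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

If the list is not empty, the micromamba install above is probably the simpler route.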

Invocation

After activating the environment, there are two equivalent ways to call tinyMapper:

Command                          Description
tinymapper --mode ChIP ...       Primary Python CLI (recommended)
tinyMapper.sh --mode ChIP ...    Legacy bash wrapper; forwards all arguments verbatim to tinymapper

Both accept exactly the same flags. tinyMapper.sh is kept for compatibility with existing Slurm scripts and autotinymapper.


Usage

 Usage: tinymapper [OPTIONS]

 tinyMapper — map and process sequencing reads.
 Modes:
   ChIP    — ChIP-seq (bowtie2 → samtools → bamCoverage → macs3)
   RNA     — RNA-seq  (STAR → samtools → bamCoverage × 3)
   ATAC    — ATAC-seq (bowtie2 → samtools → bamCoverage → macs3)
   MNase   — MNase-seq (bowtie2 → samtools → size filter → 3 tracks)
   HiC     — Hi-C     (hicstuff pipeline → cooler → mcool)
   shotgun — Shotgun  (bowtie2 single-end → samtools → bamCoverage)


 Examples:
   tinymapper -m ChIP -s ~/HB44 -g ~/genomes/R64-1-1/R64-1-1 -o ~/results
   tinymapper -m RNA  -s ~/AB4  -g ~/genomes/W303/W303 -o ~/results
   tinymapper -m HiC  -s ~/CH266 -g ~/genomes/W303/W303 --binning 1000

╭─ Required ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --mode    -m  [chip|rna|atac|mnase|hic|shotgun]  Mapping mode (ChIP, MNase, ATAC, RNA, HiC, shotgun). [required]                                            │
│ *  --sample  -s  TEXT                               Path prefix to sample FASTQ files.  For ~/reads/JS001_R{1,2}.fq.gz use --sample ~/reads/JS001 [required]   │
│ *  --genome  -g  TEXT                               Path prefix to reference genome.  For ~/genome/W303/W303.fa use --genome ~/genome/W303/W303 [required]     │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Core optional ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --output       -o  PATH     Directory to store results. [default: results]                                                                                     │
│ --input        -i  TEXT     (ChIP) Path prefix to input/control sample.                                                                                        │
│ --calibration  -c  TEXT     (ChIP) Path prefix to spikein/calibration genome.                                                                                  │
│ --threads      -t  INTEGER  Number of CPU threads. [default: 8]                                                                                                │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Alignment / filtering ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --alignment   -a   TEXT  Extra options passed to bowtie2 (use single quotes). [default: --maxins 1000]                                                         │
│ --filter      -f   TEXT  Filtering options for samtools view (use single quotes). [default: -f 0x001 -f 0x002 -F 0x004 -F 0x008 -q 10]                         │
│ --blacklist   -bl  TEXT  BED file of blacklist regions for bamCoverage.                                                                                        │
│ --gsize       -gs  TEXT  Effective genome size for macs3 peak calling. [default: 13000000]                                                                     │
│ --duplicates  -d         Keep duplicate reads (default: remove duplicates).                                                                                    │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ HiC ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --hicstuff     -hic  TEXT  Extra arguments passed to hicstuff pipeline. [default: --mapping iterative --duplicates --filter --plot --no-cleanup]               │
│ --restriction  -re   TEXT  Restriction enzyme(s) for HiC (e.g. DpnII,HinfI). [default: HpaII,HinfI]                                                            │
│ --binning      -b    TEXT  Minimum bin resolution for HiC matrix (bp); comma-separated for multi-res. [default: 500]                                           │
│ --balance      -ba   TEXT  Balancing options for cooler zoomify. [default: --cis-only --min-nnz 3 --mad-max 7]                                                 │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ MNase ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --MNaseSizes  -M  TEXT  Min,Max fragment size for MNase track. [default: 130,200]                                                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Output ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --keepIntermediate  -k  Keep intermediate SAM / unmapped FASTQ files.                                                                                          │
│ --dry-run               Log commands without executing them.                                                                                                   │
│ --help              -h  Show this message and exit.                                                                                                            │
│ --version           -v  Show the version and exit.                                                                                                             │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
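
To make the default `--filter` value easier to read, here is a small sketch of what `-f 0x001 -f 0x002 -F 0x004 -F 0x008 -q 10` selects. It is illustrative only: tinyMapper hands the string to samtools rather than evaluating flags itself, and the sketch assumes that repeated `-f` / `-F` options accumulate. The function name is hypothetical.

```python
# SAM FLAG bits used by the default --filter string.
PAIRED = 0x001         # read is paired
PROPER_PAIR = 0x002    # read mapped in a proper pair
UNMAPPED = 0x004       # read itself is unmapped
MATE_UNMAPPED = 0x008  # mate is unmapped
MIN_MAPQ = 10          # from -q 10


def passes_default_filter(flag, mapq):
    """Mimic samtools view -f 0x001 -f 0x002 -F 0x004 -F 0x008 -q 10."""
    required = PAIRED | PROPER_PAIR      # -f: all of these bits must be set
    excluded = UNMAPPED | MATE_UNMAPPED  # -F: none of these bits may be set
    return (
        (flag & required) == required
        and (flag & excluded) == 0
        and mapq >= MIN_MAPQ
    )


# FLAG 99 = paired + proper pair + mate reverse + first in pair
print(passes_default_filter(99, 42))
# FLAG 73 = paired + mate unmapped + first in pair
print(passes_default_filter(73, 42))
# Same FLAG as the first call, but MAPQ below the threshold
print(passes_default_filter(99, 5))
```

Only the first call passes; the second fails on the flag bits and the third on mapping quality.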

FASTQ files are detected automatically from the sample prefix. tinyMapper tries paired-end patterns first, then falls back to single-end:

Paired-end patterns (both R1 and R2 must exist):

  • <SAMPLE>_R1.fq.gz / <SAMPLE>_R2.fq.gz (preferred)
  • <SAMPLE>_R1.fastq.gz / <SAMPLE>_R2.fastq.gz
  • <SAMPLE>_nxq_R1.fq.gz / <SAMPLE>_nxq_R2.fq.gz
  • <SAMPLE>.end1.fq.gz / <SAMPLE>.end2.fq.gz
  • <SAMPLE>.end1.gz / <SAMPLE>.end2.gz
  • Illumina <SAMPLE>_S##_R1_*.gz / <SAMPLE>_S##_R2_*.gz

Single-end fallback (R2 not found — only R1 required):

  • <SAMPLE>_R1.fq.gz
  • <SAMPLE>_R1.fastq.gz
  • <SAMPLE>_nxq_R1.fq.gz
  • <SAMPLE>.fq.gz
  • <SAMPLE>.fastq.gz
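
The lookup order described above can be sketched as follows. This is a minimal illustration, not tinyMapper's actual code: `find_fastq` is a hypothetical helper, and the Illumina `_S##_R1_*.gz` glob pattern is left out for brevity.

```python
import os
import tempfile
from pathlib import Path

# Suffix pairs tried in order for paired-end data (Illumina pattern omitted).
PAIRED_SUFFIXES = [
    ("_R1.fq.gz", "_R2.fq.gz"),
    ("_R1.fastq.gz", "_R2.fastq.gz"),
    ("_nxq_R1.fq.gz", "_nxq_R2.fq.gz"),
    (".end1.fq.gz", ".end2.fq.gz"),
    (".end1.gz", ".end2.gz"),
]
# Suffixes tried in order when no complete pair is found.
SINGLE_SUFFIXES = ["_R1.fq.gz", "_R1.fastq.gz", "_nxq_R1.fq.gz",
                   ".fq.gz", ".fastq.gz"]


def find_fastq(prefix):
    """Return (r1, r2) for paired-end data or (r1, None) for single-end data."""
    prefix = os.path.expanduser(str(prefix))
    for suffix1, suffix2 in PAIRED_SUFFIXES:
        r1, r2 = Path(prefix + suffix1), Path(prefix + suffix2)
        if r1.exists() and r2.exists():
            return r1, r2
    for suffix in SINGLE_SUFFIXES:
        r1 = Path(prefix + suffix)
        if r1.exists():
            return r1, None
    raise FileNotFoundError(f"no FASTQ files found for prefix {prefix!r}")


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = str(Path(tmp) / "JS001")
    Path(sample + "_R1.fq.gz").touch()
    print(find_fastq(sample)[1])       # only R1 exists: single-end
    Path(sample + "_R2.fq.gz").touch()
    print(find_fastq(sample)[1].name)  # both mates exist: paired-end
```
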

Mode      SE support   Notes
ChIP      Yes          input control supported; spikein calibration requires PE
RNA       Yes          forward/reverse strand tracks still produced
ATAC      Yes          peaks called with --format BAM instead of BAMPE
MNase     Yes          fragment-size filter and nucleosome tracks skipped; CPM track only
shotgun   Yes          always single-end (R1+R2 concatenated as -U if both present)
HiC       No           paired-end required
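
The table above can be turned into an early sanity check in a wrapper script. The following is a sketch under stated assumptions: `check_layout` and `macs3_format` are hypothetical helpers, not tinyMapper functions, and the dictionary simply transcribes the table.

```python
# Single-end support per mode, transcribed from the table above.
SE_SUPPORTED = {
    "chip": True, "rna": True, "atac": True,
    "mnase": True, "shotgun": True, "hic": False,
}


def check_layout(mode, paired, calibration=False):
    """Raise ValueError for combinations the table marks as unsupported."""
    key = mode.lower()
    if key not in SE_SUPPORTED:
        raise ValueError(f"unknown mode: {mode}")
    if not paired and not SE_SUPPORTED[key]:
        raise ValueError(f"{mode} requires paired-end reads")
    if not paired and key == "chip" and calibration:
        raise ValueError("spikein calibration requires paired-end reads")


def macs3_format(paired):
    """macs3 --format value for ATAC peak calling, per the table."""
    return "BAMPE" if paired else "BAM"


check_layout("ATAC", paired=False)  # accepted: ATAC supports single-end
print(macs3_format(paired=False))
```
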

Examples

ChIP-seq

# Sample only (no input, no calibration)
tinymapper -m ChIP \
    -s ~/reads/JS001 \
    -g ~/genomes/R64-1-1/R64-1-1 \
    -o ~/results

# With input control
tinymapper -m ChIP \
    --sample ~/reads/JS001_IP \
    --input  ~/reads/JS001_input \
    --genome ~/genomes/R64-1-1/R64-1-1 \
    --output ~/results

# With input and spikein calibration
tinymapper -m ChIP \
    --sample      ~/reads/JS001_IP \
    --input       ~/reads/JS001_input \
    --genome      ~/genomes/R64-1-1/R64-1-1 \
    --calibration ~/genomes/Cglabrata/Cglabrata \
    --output      ~/results

RNA-seq

tinymapper -m RNA -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

MNase-seq

tinymapper -m MNase -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results \
    --MNaseSizes 70,250
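
As an illustration of what a `Min,Max` value means, the sketch below parses the string and applies it to fragment lengths (the absolute value of the SAM TLEN field). It is not tinyMapper's implementation; the helper names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
def parse_size_range(value):
    """Parse a 'min,max' string such as '130,200' into two integers."""
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'min,max', got {value!r}")
    low, high = int(parts[0]), int(parts[1])
    if low > high:
        raise ValueError(f"min ({low}) is larger than max ({high})")
    return low, high


def keep_fragment(template_length, size_range):
    """True if |TLEN| lies within the range (bounds assumed inclusive)."""
    low, high = size_range
    return low <= abs(template_length) <= high


default_range = parse_size_range("130,200")
# TLEN is signed in SAM records, so the reverse mate carries a negative value.
print([tlen for tlen in (95, 147, -160, 310) if keep_fragment(tlen, default_range)])
```
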

ATAC-seq

tinymapper -m ATAC -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

Hi-C

tinymapper -m HiC \
    -s ~/reads/JS001 \
    -g ~/genomes/W303/W303 \
    -o ~/results \
    --binning 1000,2000,8000 \
    --restriction 'DpnII,HinfI'
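
When passing several resolutions to `--binning`, it may be worth checking the list before submitting a long job. The sketch below assumes that the smallest value acts as the base bin size and that every coarser resolution must be an integer multiple of it, which is how coarsening in cooler works to my understanding. `parse_binning` is a hypothetical helper, not part of tinyMapper.

```python
def parse_binning(value):
    """Parse a comma-separated --binning value into sorted resolutions (bp)."""
    resolutions = sorted({int(item) for item in value.split(",")})
    base = resolutions[0]
    if base <= 0:
        raise ValueError("resolutions must be positive")
    # Assumption: coarser bins are built by merging whole base-size bins.
    not_multiples = [r for r in resolutions if r % base != 0]
    if not_multiples:
        raise ValueError(f"not multiples of {base}: {not_multiples}")
    return resolutions


print(parse_binning("1000,2000,8000"))
```
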

Shotgun

tinymapper -m shotgun -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

Output layout

Results are written under --output with the following structure:

<output>/
  bam/genome/          filtered BAM files (genome)
  bam/spikein/         filtered BAM files (spikein, ChIP only)
  tracks/              BigWig coverage tracks (CPM, calibrated, fwd/rev for RNA)
  peaks/               MACS3 peak files (ChIP, ATAC)
  pairs/               contact pairs (Hi-C only)
  matrices/            .cool matrices (Hi-C only)
  logs/                per-run log and command files
  tmp/                 temporary files (removed on success unless --keepIntermediate)

Files follow the naming convention <sample>^<operation>^<hash>.<ext> where <hash> is a 6-character alphanumeric string unique to each run.
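
For downstream scripts, the naming convention can be split back into its parts with a regular expression. This is a minimal sketch: `parse_output_name` is a hypothetical helper, the example file name is made up, and the pattern assumes that the sample name itself contains no `^`.

```python
import re

# <sample>^<operation>^<hash>.<ext>, where the hash is 6 alphanumeric characters.
NAME_RE = re.compile(
    r"(?P<sample>[^^]+)\^(?P<operation>.+)\^(?P<hash>[A-Za-z0-9]{6})\.(?P<ext>.+)"
)


def parse_output_name(filename):
    """Split an output file name into sample, operation, hash and extension."""
    match = NAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"name does not follow the convention: {filename!r}")
    return match.groupdict()


# Hypothetical file name, used only to exercise the parser.
print(parse_output_name("JS001^filtered^A1B2C3.bam"))
```

Grouping files by the `hash` field is one way to collect everything produced by a single run.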


Running on a Slurm cluster (e.g. Maestro)

Activate the environment and submit with sbatch:

micromamba activate tinymapper

# Generic
sbatch --mem 40G -c 10 --wrap \
    "tinymapper --mode ChIP --sample <SAMPLE> --genome <GENOME> --output <OUTPUT> --threads 8"

# ChIP examples
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m ChIP -s ~/reads/JS001_IP -g ~/genomes/S288c/S288c --threads 8"
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m ChIP -s ~/reads/JS001_IP -i ~/reads/JS001_input -g ~/genomes/S288c/S288c --threads 8"

# RNA
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m RNA -s ~/reads/JS001 -g ~/genomes/S288c/S288c --threads 8"

# Hi-C
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m HiC -s ~/reads/JS001 -g ~/genomes/S288c/S288c --threads 8"

Existing scripts that call the legacy command (e.g. Slurm scripts generated by autotinymapper) can keep using tinyMapper.sh, which accepts the same arguments:

sbatch --mem 40G -c 10 --wrap \
    "tinyMapper.sh -m ChIP -s ~/reads/JS001_IP -g ~/genomes/S288c/S288c --threads 8"

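When many samples need to be submitted, the sbatch lines above can be generated from Python. The sketch below only builds and prints the command; `sbatch_argv` is a hypothetical helper, and the memory and CPU defaults are copied from the examples above.

```python
import os
import shlex


def sbatch_argv(tinymapper_args, mem="40G", cpus=10):
    """Build an sbatch argument list that wraps one tinymapper call."""
    # Expand '~' here: shlex quoting would stop the shell from expanding it.
    expanded = [os.path.expanduser(arg) for arg in tinymapper_args]
    wrapped = shlex.join(["tinymapper", *expanded])
    return ["sbatch", "--mem", mem, "-c", str(cpus), "--wrap", wrapped]


argv = sbatch_argv(["-m", "ChIP", "-s", "~/reads/JS001_IP",
                    "-g", "~/genomes/S288c/S288c", "--threads", "8"])
# Inspect the command first; pass argv to subprocess.run to actually submit.
print(shlex.join(argv))
```

Building the list with `shlex.join` keeps the wrapped command as a single, correctly quoted `--wrap` argument.
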
Development cycle

  • Regenerate uv.lock and env/conda-lock.yml after any dependency change:
uv lock
uv run \
    conda-lock lock \
        --update \
        --micromamba \
        --file env/tinymapper.yaml \
        --platform linux-64 \
        --lockfile env/conda-lock.yml
conda-lock install --name <ENV_NAME> env/conda-lock.yml

Acknowledgments

  • A. Cournac, A. Bignaud & F. Girard for tests.
  • H. Bordelet for sharing her mapping scripts and configuration.
  • L. Meneu for suggesting documentation improvements and reporting bugs.
