
Map and process ChIP-seq, RNA-seq, ATAC-seq, MNase-seq, HiC and shotgun reads

tinyMapper

A minimalist yet versatile workflow to process ChIP-seq (with or without input/spikein), RNA-seq, MNase-seq, ATAC-seq, Hi-C and shotgun sequencing data. Hi-C mode delegates to hicstuff and cooler.

tinyMapper supports both paired-end and single-end reads, with two exceptions: Hi-C mode and ChIP spikein calibration require paired-end data. For single-end MNase data, fragment-size filtering is skipped and only a standard CPM track is produced.

Note: tinyMapper is a Python package that orchestrates external CLI tools (bowtie2, STAR, samtools, deeptools, macs3, hicstuff). It does not re-implement alignment or peak-calling.

DISCLAIMER:

  • This is by no means the "best" or "only" way to process sequencing data. Feedback and suggestions are welcome.
  • This workflow does NOT include QC / validation. Run fastqc on raw reads at a minimum.

Installation

tinyMapper is a Python package. The recommended install creates a micromamba environment that bundles the Python package together with all bioinformatics tools (bowtie2, STAR, samtools, deeptools, macs3, hicstuff, cooler, bedtools).

Recommended — full install via micromamba

Requires micromamba.

git clone https://github.com/js2264/tinyMapper.git
micromamba env create -y -f tinyMapper/env/tinymapper.yaml
micromamba activate tinymapper
tinymapper --help

Alternative — Python package only

If all bioinformatics tools are already available in your environment:

uv venv
uv pip install git+https://github.com/js2264/tinyMapper.git
tinymapper --help
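
When only the Python package is installed, it can help to confirm that the external tools are on PATH before launching a run. The following is a minimal sketch and not part of tinyMapper: the function name `missing_tools` is hypothetical, and the executable names are assumed from the tools listed above (`bamCoverage` being the deeptools command named in the pipeline descriptions).

```python
import shutil

# Executable names assumed from the tools named in this README;
# trim the list to the modes you actually run.
REQUIRED_TOOLS = [
    "bowtie2", "STAR", "samtools", "bamCoverage",
    "macs3", "hicstuff", "cooler", "bedtools",
]

def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```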

Invocation

After activating the environment, there are two equivalent ways to call tinyMapper:

Command                         Description
tinymapper --mode ChIP ...      Primary Python CLI (recommended)
tinyMapper.sh --mode ChIP ...   Legacy bash wrapper; forwards all arguments verbatim to tinymapper

Both accept exactly the same flags. tinyMapper.sh is kept for compatibility with existing Slurm scripts and autotinymapper.


Usage

 Usage: tinymapper [OPTIONS]

 tinyMapper — map and process sequencing reads.
 Modes:
   ChIP    — ChIP-seq (bowtie2 → samtools → bamCoverage → macs3)
   RNA     — RNA-seq  (STAR → samtools → bamCoverage × 3)
   ATAC    — ATAC-seq (bowtie2 → samtools → bamCoverage → macs3)
   MNase   — MNase-seq (bowtie2 → samtools → size filter → 3 tracks)
   HiC     — Hi-C     (hicstuff pipeline → cooler → mcool)
   shotgun — Shotgun  (bowtie2 single-end → samtools → bamCoverage)


 Examples:
   tinymapper -m ChIP -s ~/HB44 -g ~/genomes/R64-1-1/R64-1-1 -o ~/results
   tinymapper -m RNA  -s ~/AB4  -g ~/genomes/W303/W303 -o ~/results
   tinymapper -m HiC  -s ~/CH266 -g ~/genomes/W303/W303 --binning 1000

╭─ Required ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --mode    -m  [chip|rna|atac|mnase|hic|shotgun]  Mapping mode (ChIP, MNase, ATAC, RNA, HiC, shotgun). [required]                                            │
│ *  --sample  -s  TEXT                               Path prefix to sample FASTQ files.  For ~/reads/JS001_R{1,2}.fq.gz use --sample ~/reads/JS001 [required]   │
│ *  --genome  -g  TEXT                               Path prefix to reference genome.  For ~/genome/W303/W303.fa use --genome ~/genome/W303/W303 [required]     │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Core optional ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --output       -o  PATH     Directory to store results. [default: results]                                                                                     │
│ --input        -i  TEXT     (ChIP) Path prefix to input/control sample.                                                                                        │
│ --calibration  -c  TEXT     (ChIP) Path prefix to spikein/calibration genome.                                                                                  │
│ --threads      -t  INTEGER  Number of CPU threads. [default: 8]                                                                                                │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Alignment / filtering ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --alignment   -a   TEXT  Extra options passed to bowtie2 (use single quotes). [default: --maxins 1000]                                                         │
│ --filter      -f   TEXT  Filtering options for samtools view (use single quotes). [default: -f 0x001 -f 0x002 -F 0x004 -F 0x008 -q 10]                         │
│ --blacklist   -bl  TEXT  BED file of blacklist regions for bamCoverage.                                                                                        │
│ --gsize       -gs  TEXT  Effective genome size for macs3 peak calling. [default: 13000000]                                                                     │
│ --duplicates  -d         Keep duplicate reads (default: remove duplicates).                                                                                    │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ HiC ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --hicstuff     -hic  TEXT  Extra arguments passed to hicstuff pipeline. [default: --mapping iterative --duplicates --filter --plot --no-cleanup]               │
│ --restriction  -re   TEXT  Restriction enzyme(s) for HiC (e.g. DpnII,HinfI). [default: HpaII,HinfI]                                                            │
│ --binning      -b    TEXT  Minimum bin resolution for HiC matrix (bp); comma-separated for multi-res. [default: 500]                                           │
│ --balance      -ba   TEXT  Balancing options for cooler zoomify. [default: --cis-only --min-nnz 3 --mad-max 7]                                                 │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ MNase ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --MNaseSizes  -M  TEXT  Min,Max fragment size for MNase track. [default: 130,200]                                                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Output ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --keepIntermediate  -k  Keep intermediate SAM / unmapped FASTQ files.                                                                                          │
│ --dry-run               Log commands without executing them.                                                                                                   │
│ --help              -h  Show this message and exit.                                                                                                            │
│ --version           -v  Show the version and exit.                                                                                                             │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
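
The default --filter value combines SAM flag bits with a mapping-quality threshold. The sketch below illustrates what those options select, following the documented samtools view semantics: `-f` keeps alignments with the given bits set, `-F` drops alignments with the given bits set, and `-q 10` drops alignments with MAPQ below 10. It is not tinyMapper code; the function name `passes_default_filter` is hypothetical, and treating the repeated `-f` / `-F` options as cumulative is an assumption about how samtools combines them.

```python
# Bits named in the default --filter string (values from the SAM specification).
PAIRED = 0x001          # read is paired
PROPER_PAIR = 0x002     # read mapped in a proper pair
UNMAPPED = 0x004        # read itself is unmapped
MATE_UNMAPPED = 0x008   # mate is unmapped

def passes_default_filter(flag, mapq, min_mapq=10):
    """Mimic '-f 0x001 -f 0x002 -F 0x004 -F 0x008 -q 10' for one alignment."""
    required = PAIRED | PROPER_PAIR        # every one of these bits must be set
    excluded = UNMAPPED | MATE_UNMAPPED    # none of these bits may be set
    has_required = (flag & required) == required
    has_excluded = (flag & excluded) != 0
    return has_required and not has_excluded and mapq >= min_mapq

print(passes_default_filter(99, 42))  # proper pair, both mates mapped, high MAPQ
print(passes_default_filter(77, 0))   # both mates unmapped
print(passes_default_filter(99, 5))   # proper pair, MAPQ below threshold
```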

FASTQ files are detected automatically from the sample prefix. tinyMapper tries paired-end patterns first, then falls back to single-end:

Paired-end patterns (both R1 and R2 must exist):

  • <SAMPLE>_R1.fq.gz / <SAMPLE>_R2.fq.gz (preferred)
  • <SAMPLE>_R1.fastq.gz / <SAMPLE>_R2.fastq.gz
  • <SAMPLE>_nxq_R1.fq.gz / <SAMPLE>_nxq_R2.fq.gz
  • <SAMPLE>.end1.fq.gz / <SAMPLE>.end2.fq.gz
  • <SAMPLE>.end1.gz / <SAMPLE>.end2.gz
  • Illumina <SAMPLE>_S##_R1_*.gz / <SAMPLE>_S##_R2_*.gz

Single-end fallback (used when no R2 file is found; a single FASTQ file is enough):

  • <SAMPLE>_R1.fq.gz
  • <SAMPLE>_R1.fastq.gz
  • <SAMPLE>_nxq_R1.fq.gz
  • <SAMPLE>.fq.gz
  • <SAMPLE>.fastq.gz

Mode      SE support   Notes
ChIP      Yes          input control supported; spikein calibration requires PE
RNA       Yes          forward/reverse strand tracks still produced
ATAC      Yes          peaks called with --format BAM instead of BAMPE
MNase     Yes          fragment-size filter and nucleosome tracks skipped; CPM track only
shotgun   Yes          always single-end (R1+R2 concatenated as -U if both present)
HiC       No           paired-end required
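
The detection order described above can be sketched as follows. This is an illustration only, not tinyMapper's actual implementation: the function name `find_fastqs` is hypothetical, and the glob used for the Illumina pattern is an assumption about what `S##` and `*` stand for.

```python
import glob
import os

# (R1 suffix, R2 suffix) pairs, in the order listed above.
PAIRED_SUFFIXES = [
    ("_R1.fq.gz", "_R2.fq.gz"),
    ("_R1.fastq.gz", "_R2.fastq.gz"),
    ("_nxq_R1.fq.gz", "_nxq_R2.fq.gz"),
    (".end1.fq.gz", ".end2.fq.gz"),
    (".end1.gz", ".end2.gz"),
]
SINGLE_SUFFIXES = ["_R1.fq.gz", "_R1.fastq.gz", "_nxq_R1.fq.gz", ".fq.gz", ".fastq.gz"]

def find_fastqs(sample):
    """Return (r1, r2) for paired-end data, (r1, None) for single-end, or None."""
    sample = os.path.expanduser(sample)
    # Paired-end patterns first: both mates must exist.
    for s1, s2 in PAIRED_SUFFIXES:
        if os.path.isfile(sample + s1) and os.path.isfile(sample + s2):
            return sample + s1, sample + s2
    # Illumina-style names, e.g. <SAMPLE>_S12_R1_001.fastq.gz (glob is an assumption).
    r1 = sorted(glob.glob(glob.escape(sample) + "_S[0-9]*_R1_*.gz"))
    r2 = sorted(glob.glob(glob.escape(sample) + "_S[0-9]*_R2_*.gz"))
    if r1 and r2:
        return r1[0], r2[0]
    # Single-end fallback: the first matching file wins.
    for suffix in SINGLE_SUFFIXES:
        if os.path.isfile(sample + suffix):
            return sample + suffix, None
    return None
```

Calling `find_fastqs("~/reads/JS001")` would then return a pair of paths, a single path with `None`, or `None` when nothing matches.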

Examples

ChIP-seq

# Sample only (no input, no calibration)
tinymapper -m ChIP \
    -s ~/reads/JS001 \
    -g ~/genomes/R64-1-1/R64-1-1 \
    -o ~/results

# With input control
tinymapper -m ChIP \
    --sample ~/reads/JS001_IP \
    --input  ~/reads/JS001_input \
    --genome ~/genomes/R64-1-1/R64-1-1 \
    --output ~/results

# With input and spikein calibration
tinymapper -m ChIP \
    --sample      ~/reads/JS001_IP \
    --input       ~/reads/JS001_input \
    --genome      ~/genomes/R64-1-1/R64-1-1 \
    --calibration ~/genomes/Cglabrata/Cglabrata \
    --output      ~/results

RNA-seq

tinymapper -m RNA -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

MNase-seq

tinymapper -m MNase -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results \
    --MNaseSizes 70,250
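
To illustrate what a --MNaseSizes value expresses, the sketch below parses a Min,Max string and tests fragment lengths against it. The function names are hypothetical, and treating both bounds as inclusive is an assumption rather than documented tinyMapper behaviour. Lengths are taken as absolute values because the SAM TLEN field is negative for the rightmost mate of a pair.

```python
def parse_size_range(value):
    """Parse a 'Min,Max' string such as '70,250' into two integers."""
    low, high = (int(part) for part in value.split(","))
    if low > high:
        raise ValueError(f"min ({low}) is larger than max ({high})")
    return low, high

def fragment_in_range(template_length, size_range):
    """True if the fragment length lies within the range (bounds assumed inclusive)."""
    low, high = size_range
    return low <= abs(template_length) <= high

sizes = parse_size_range("70,250")
# Example TLEN values; only fragments of 70 to 250 bp are kept.
kept = [tlen for tlen in (65, 147, -160, 300) if fragment_in_range(tlen, sizes)]
print(kept)
```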

ATAC-seq

tinymapper -m ATAC -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

Hi-C

tinymapper -m HiC \
    -s ~/reads/JS001 \
    -g ~/genomes/W303/W303 \
    -o ~/results \
    --binning 1000,2000,8000 \
    --restriction 'DpnII,HinfI'

Shotgun

tinymapper -m shotgun -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

Output layout

Results are written under --output with the following structure:

<output>/
  bam/genome/          filtered BAM files (genome)
  bam/spikein/         filtered BAM files (spikein, ChIP only)
  tracks/              BigWig coverage tracks (CPM, calibrated, fwd/rev for RNA)
  peaks/               MACS3 peak files (ChIP, ATAC)
  pairs/               contact pairs (Hi-C only)
  matrices/            .cool matrices (Hi-C only)
  logs/                per-run log and command files
  tmp/                 temporary files (removed on success unless --keepIntermediate)

Files follow the naming convention <sample>^<operation>^<hash>.<ext> where <hash> is a 6-character alphanumeric string unique to each run.
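
For downstream scripts that need to group files by sample or by run, the naming convention can be split back into its parts. This is a sketch under assumptions: the function name `parse_output_name` is hypothetical, and the operation label and hash in the example are made-up values, not names tinyMapper is known to produce.

```python
from pathlib import Path

def parse_output_name(path):
    """Split '<sample>^<operation>^<hash>.<ext>' into its four parts."""
    name = Path(path).name
    # Split from the right so a '^' inside the sample name does not break parsing.
    sample, operation, rest = name.rsplit("^", 2)
    # Everything after the first dot is the extension (may contain several dots).
    run_hash, _, ext = rest.partition(".")
    return {"sample": sample, "operation": operation, "hash": run_hash, "ext": ext}

# 'filtered' and 'a1b2c3' are illustrative values only.
print(parse_output_name("results/bam/genome/JS001^filtered^a1b2c3.bam"))
```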


Running on a Slurm cluster (e.g. Maestro)

Activate the environment and submit with sbatch:

micromamba activate tinymapper

# Generic
sbatch --mem 40G -c 10 --wrap \
    "tinymapper --mode ChIP --sample <SAMPLE> --genome <GENOME> --output <OUTPUT> --threads 8"

# ChIP examples
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m ChIP -s ~/reads/JS001_IP -g ~/genomes/S288c/S288c --threads 8"
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m ChIP -s ~/reads/JS001_IP -i ~/reads/JS001_input -g ~/genomes/S288c/S288c --threads 8"

# RNA
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m RNA -s ~/reads/JS001 -g ~/genomes/S288c/S288c --threads 8"

# Hi-C
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m HiC -s ~/reads/JS001 -g ~/genomes/S288c/S288c --threads 8"

Existing scripts that call the legacy tinyMapper.sh command (e.g. autotinymapper Slurm scripts) keep working unchanged, because the wrapper forwards its arguments to tinymapper:

sbatch --mem 40G -c 10 --wrap \
    "tinyMapper.sh -m ChIP -s ~/reads/JS001_IP -g ~/genomes/S288c/S288c --threads 8"

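To submit several samples at once, sbatch lines like the ones above can be generated programmatically. The sketch below only builds and prints the command strings; the function name `sbatch_command` and the sample paths are hypothetical, while the flags are the ones shown in this section.

```python
import shlex

def sbatch_command(mode, sample, genome, output="results", threads=8, mem="40G", cpus=10):
    """Build an sbatch command line wrapping one tinymapper run."""
    inner = shlex.join([
        "tinymapper", "--mode", mode, "--sample", sample,
        "--genome", genome, "--output", output, "--threads", str(threads),
    ])
    # The whole tinymapper call is passed to --wrap as a single quoted argument.
    return shlex.join(["sbatch", "--mem", mem, "-c", str(cpus), "--wrap", inner])

# Hypothetical sample prefixes; paths are relative to the submission directory.
for sample in ["reads/JS001", "reads/JS002"]:
    print(sbatch_command("ChIP", sample, "genomes/S288c/S288c"))
```

`shlex.join` quotes any argument containing `~`, which would prevent shell tilde expansion, so this sketch uses relative paths; absolute paths work equally well.
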
Development cycle

  • Regenerate uv.lock and env/conda-lock.yml after any dependency change, then install the environment from the lock file:
uv lock
uv run \
    conda-lock lock \
        --update \
        --micromamba \
        --file env/tinymapper.yaml \
        --platform linux-64 \
        --lockfile env/conda-lock.yml
conda-lock install --name <ENV_NAME> env/conda-lock.yml

Acknowledgments

  • A. Cournac, A. Bignaud & F. Girard for tests.
  • H. Bordelet for sharing her mapping scripts and configuration.
  • L. Meneu for suggesting documentation improvements and reporting bugs.
