Map and process ChIP-seq, RNA-seq, ATAC-seq, MNase-seq, HiC and shotgun reads

tinyMapper

A minimalist yet versatile workflow to process ChIP-seq (with or without input/spikein), RNA-seq, MNase-seq, ATAC-seq, Hi-C and shotgun sequencing data. Hi-C mode delegates to hicstuff and cooler.

tinyMapper supports both paired-end and single-end reads, with two exceptions: Hi-C mode and spikein calibration (ChIP) require paired-end data. For single-end MNase, fragment-size filtering is skipped and only a standard CPM track is produced.

Note: tinyMapper is a Python package that orchestrates external CLI tools (bowtie2, STAR, samtools, deeptools, macs2, hicstuff, cooler, bedtools). It does not re-implement alignment or peak-calling.

DISCLAIMER:

  • This is by no means the "best" or "only" way to process sequencing data. Feedback and suggestions are welcome.
  • This workflow does NOT include QC / validation. Run fastqc on raw reads at a minimum.

Installation

tinyMapper is a Python package. The recommended install creates a micromamba environment that bundles the Python package together with all bioinformatics tools (bowtie2, STAR, samtools, deeptools, macs2, hicstuff, cooler, bedtools).

Recommended — full install via micromamba

Requires micromamba.

git clone https://github.com/js2264/tinyMapper.git
micromamba env create -y -f tinyMapper/env/tinymapper.yaml
micromamba activate tinymapper
tinymapper --help

Alternative — Python package only

If all bioinformatics tools are already available in your environment:

pip install git+https://github.com/js2264/tinyMapper.git
tinymapper --help
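
Before relying on the pip-only install, it may be worth confirming that the external tools are actually on PATH. The sketch below is not part of tinyMapper: `missing_tools` is a hypothetical helper, and the executable names are assumptions derived from the tool list above (for example, deeptools is assumed to be reachable through its `bamCoverage` command).

```python
import shutil

# Executable names assumed from the tool list above; adjust to your setup.
REQUIRED_TOOLS = [
    "bowtie2", "STAR", "samtools", "bamCoverage",
    "macs2", "hicstuff", "cooler", "bedtools",
]

def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

An empty list means every executable was found; anything else is a hint to fall back on the micromamba install.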

Invocation

After activating the environment, there are two equivalent ways to call tinyMapper:

Command                          Description
tinymapper --mode ChIP ...       Primary Python CLI (recommended)
tinyMapper.sh --mode ChIP ...    Legacy bash wrapper; forwards all arguments verbatim to tinymapper

Both accept exactly the same flags. tinyMapper.sh is kept for compatibility with existing Slurm scripts and autotinymapper.


Usage

 Usage: tinymapper [OPTIONS]

 tinyMapper — map and process sequencing reads.
 Modes:
   ChIP    — ChIP-seq (bowtie2 → samtools → bamCoverage → macs2)
   RNA     — RNA-seq  (STAR → samtools → bamCoverage × 3)
   ATAC    — ATAC-seq (bowtie2 → samtools → bamCoverage → macs2)
   MNase   — MNase-seq (bowtie2 → samtools → size filter → 3 tracks)
   HiC     — Hi-C     (hicstuff pipeline → cooler → mcool)
   shotgun — Shotgun  (bowtie2 single-end → samtools → bamCoverage)


 Examples:
   tinymapper -m ChIP -s ~/HB44 -g ~/genomes/R64-1-1/R64-1-1 -o ~/results
   tinymapper -m RNA  -s ~/AB4  -g ~/genomes/W303/W303 -o ~/results
   tinymapper -m HiC  -s ~/CH266 -g ~/genomes/W303/W303 --binning 1000

╭─ Required ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --mode    -m  [chip|rna|atac|mnase|hic|shotgun]  Mapping mode (ChIP, MNase, ATAC, RNA, HiC, shotgun). [required]                                            │
│ *  --sample  -s  TEXT                               Path prefix to sample FASTQ files.  For ~/reads/JS001_R{1,2}.fq.gz use --sample ~/reads/JS001 [required]   │
│ *  --genome  -g  TEXT                               Path prefix to reference genome.  For ~/genome/W303/W303.fa use --genome ~/genome/W303/W303 [required]     │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Core optional ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --output       -o  PATH     Directory to store results. [default: results]                                                                                     │
│ --input        -i  TEXT     (ChIP) Path prefix to input/control sample.                                                                                        │
│ --calibration  -c  TEXT     (ChIP) Path prefix to spikein/calibration genome.                                                                                  │
│ --threads      -t  INTEGER  Number of CPU threads. [default: 8]                                                                                                │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Alignment / filtering ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --alignment   -a   TEXT  Extra options passed to bowtie2 (use single quotes). [default: --maxins 1000]                                                         │
│ --filter      -f   TEXT  Filtering options for samtools view (use single quotes). [default: -f 0x001 -f 0x002 -F 0x004 -F 0x008 -q 10]                         │
│ --blacklist   -bl  TEXT  BED file of blacklist regions for bamCoverage.                                                                                        │
│ --gsize       -gs  TEXT  Effective genome size for macs2 peak calling. [default: 13000000]                                                                     │
│ --duplicates  -d         Keep duplicate reads (default: remove duplicates).                                                                                    │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ HiC ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --hicstuff     -hic  TEXT  Extra arguments passed to hicstuff pipeline. [default: --mapping iterative --duplicates --filter --plot --no-cleanup]               │
│ --restriction  -re   TEXT  Restriction enzyme(s) for HiC (e.g. DpnII,HinfI). [default: HpaII,HinfI]                                                            │
│ --binning      -b    TEXT  Minimum bin resolution for HiC matrix (bp); comma-separated for multi-res. [default: 500]                                           │
│ --balance      -ba   TEXT  Balancing options for cooler zoomify. [default: --cis-only --min-nnz 3 --mad-max 7]                                                 │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ MNase ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --MNaseSizes  -M  TEXT  Min,Max fragment size for MNase track. [default: 130,200]                                                                              │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Output ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --keepIntermediate  -k  Keep intermediate SAM / unmapped FASTQ files.                                                                                          │
│ --dry-run               Log commands without executing them.                                                                                                   │
│ --help              -h  Show this message and exit.                                                                                                            │
│ --version           -v  Show the version and exit.                                                                                                             │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
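
The default --filter value can be read bit by bit. The sketch below mimics, in plain Python, the selection that samtools view performs with these options: -f keeps reads that have all the listed flag bits set, -F drops reads that have any of the listed bits set, and -q sets a minimum MAPQ. It assumes that repeated -f and -F options accumulate; `passes_default_filter` is a hypothetical name and not part of tinyMapper.

```python
# SAM flag bits referenced by the default --filter value
PAIRED = 0x001         # read is paired
PROPER_PAIR = 0x002    # read is mapped in a proper pair
UNMAPPED = 0x004       # read itself is unmapped
MATE_UNMAPPED = 0x008  # mate is unmapped

def passes_default_filter(flag, mapq, min_mapq=10):
    """Return True if a read with this flag and MAPQ would be kept."""
    required = PAIRED | PROPER_PAIR      # -f 0x001 -f 0x002
    excluded = UNMAPPED | MATE_UNMAPPED  # -F 0x004 -F 0x008
    has_required = (flag & required) == required
    has_excluded = (flag & excluded) != 0
    return has_required and not has_excluded and mapq >= min_mapq  # -q 10
```

For example, a read with flag 99 (paired, proper pair, mate on the reverse strand, first in pair) and MAPQ 42 is kept, whereas the same read with MAPQ 5 is dropped.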

FASTQ files are detected automatically from the sample prefix. tinyMapper tries paired-end patterns first, then falls back to single-end:

Paired-end patterns (both R1 and R2 must exist):

  • <SAMPLE>_R1.fq.gz / <SAMPLE>_R2.fq.gz (preferred)
  • <SAMPLE>_R1.fastq.gz / <SAMPLE>_R2.fastq.gz
  • <SAMPLE>_nxq_R1.fq.gz / <SAMPLE>_nxq_R2.fq.gz
  • <SAMPLE>.end1.fq.gz / <SAMPLE>.end2.fq.gz
  • <SAMPLE>.end1.gz / <SAMPLE>.end2.gz
  • Illumina <SAMPLE>_S##_R1_*.gz / <SAMPLE>_S##_R2_*.gz

Single-end fallback (R2 not found — only R1 required):

  • <SAMPLE>_R1.fq.gz
  • <SAMPLE>_R1.fastq.gz
  • <SAMPLE>_nxq_R1.fq.gz
  • <SAMPLE>.fq.gz
  • <SAMPLE>.fastq.gz
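
The lookup described above can be sketched as follows. This is an illustration under stated assumptions rather than tinyMapper's actual code: `find_fastq` is a hypothetical helper, the patterns are tried in the order in which they are listed, and the Illumina glob pattern is left out for brevity.

```python
from pathlib import Path

# Suffix pairs in the order listed above (Illumina glob pattern omitted).
PAIRED_SUFFIXES = [
    ("_R1.fq.gz", "_R2.fq.gz"),
    ("_R1.fastq.gz", "_R2.fastq.gz"),
    ("_nxq_R1.fq.gz", "_nxq_R2.fq.gz"),
    (".end1.fq.gz", ".end2.fq.gz"),
    (".end1.gz", ".end2.gz"),
]
SINGLE_SUFFIXES = [
    "_R1.fq.gz", "_R1.fastq.gz", "_nxq_R1.fq.gz", ".fq.gz", ".fastq.gz",
]

def find_fastq(prefix):
    """Return (r1, r2) for paired-end, (r1, None) for single-end, else None."""
    prefix = str(Path(prefix).expanduser())
    # Paired-end first: both mates must exist.
    for suffix1, suffix2 in PAIRED_SUFFIXES:
        r1, r2 = Path(prefix + suffix1), Path(prefix + suffix2)
        if r1.exists() and r2.exists():
            return r1, r2
    # Single-end fallback: only one file is required.
    for suffix in SINGLE_SUFFIXES:
        r1 = Path(prefix + suffix)
        if r1.exists():
            return r1, None
    return None
```

Calling `find_fastq("~/reads/JS001")` would return the R1/R2 pair if both files exist, a single file otherwise, and `None` when nothing matches.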

Mode      SE support   Notes
ChIP      Yes          input control supported; spikein calibration requires PE
RNA       Yes          forward/reverse strand tracks still produced
ATAC      Yes          peaks called with --format BAM instead of BAMPE
MNase     Yes          fragment-size filter and nucleosome tracks skipped; CPM track only
shotgun   Yes          always single-end (R1+R2 concatenated as -U if both present)
HiC       No           paired-end required
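
The constraints in this table can be expressed as a small pre-flight check. The sketch below is a hypothetical helper (`check_layout` is not a tinyMapper function); it only encodes the two hard requirements stated above, namely that HiC and spikein calibration need paired-end reads.

```python
KNOWN_MODES = {"chip", "rna", "atac", "mnase", "hic", "shotgun"}

def check_layout(mode, paired, calibration=False):
    """Raise ValueError for mode/layout combinations the table rules out."""
    mode = mode.lower()
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "hic" and not paired:
        raise ValueError("HiC mode requires paired-end reads")
    if mode == "chip" and calibration and not paired:
        raise ValueError("spikein calibration requires paired-end reads")
    return True
```

Running such a check before submitting a job avoids discovering an unsupported combination only after alignment has started.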

Examples

ChIP-seq

# Sample only (no input, no calibration)
tinymapper -m ChIP \
    -s ~/reads/JS001 \
    -g ~/genomes/R64-1-1/R64-1-1 \
    -o ~/results

# With input control
tinymapper -m ChIP \
    --sample ~/reads/JS001_IP \
    --input  ~/reads/JS001_input \
    --genome ~/genomes/R64-1-1/R64-1-1 \
    --output ~/results

# With input and spikein calibration
tinymapper -m ChIP \
    --sample      ~/reads/JS001_IP \
    --input       ~/reads/JS001_input \
    --genome      ~/genomes/R64-1-1/R64-1-1 \
    --calibration ~/genomes/Cglabrata/Cglabrata \
    --output      ~/results

RNA-seq

tinymapper -m RNA -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

MNase-seq

tinymapper -m MNase -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results \
    --MNaseSizes 70,250
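
To make the meaning of --MNaseSizes concrete, here is a minimal sketch of a fragment-size filter. Both function names are hypothetical, and treating the bounds as inclusive is an assumption; tinyMapper's own filtering may differ in detail.

```python
def parse_size_range(spec):
    """Parse a 'min,max' string such as '130,200' into two integers."""
    low, high = (int(part) for part in spec.split(","))
    if low > high:
        raise ValueError(f"min ({low}) exceeds max ({high})")
    return low, high

def keep_fragment(template_length, size_range):
    """True if the fragment length lies within the range (inclusive bounds assumed)."""
    low, high = size_range
    # SAM template length is negative for the downstream mate, hence abs().
    return low <= abs(template_length) <= high
```

With the default 130,200 range, a 147 bp fragment is kept while a 201 bp fragment is discarded.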

ATAC-seq

tinymapper -m ATAC -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

Hi-C

tinymapper -m HiC \
    -s ~/reads/JS001 \
    -g ~/genomes/W303/W303 \
    -o ~/results \
    --binning 1000,2000,8000 \
    --restriction 'DpnII,HinfI'
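
When several resolutions are passed to --binning, the coarser matrices are derived from the finest one, so each value is expected to be an integer multiple of the smallest. The sketch below checks this up front; `parse_binning` is a hypothetical helper, and the multiple-of-base requirement is stated here as an assumption about how cooler zoomify aggregates bins.

```python
def parse_binning(spec):
    """Parse '1000,2000,8000' into a sorted list of resolutions in bp."""
    resolutions = sorted({int(part) for part in spec.split(",")})
    base = resolutions[0]
    # Coarser bins are built by merging base bins, so they must divide evenly.
    bad = [res for res in resolutions if res % base != 0]
    if bad:
        raise ValueError(f"not multiples of base resolution {base}: {bad}")
    return resolutions
```

For the example above, `parse_binning("1000,2000,8000")` yields the three resolutions with 1000 bp as the base, whereas a value such as `1000,2500` is rejected.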

Shotgun

tinymapper -m shotgun -s ~/reads/JS001 -g ~/genomes/W303/W303 -o ~/results

Output layout

Results are written under --output with the following structure:

<output>/
  bam/genome/          filtered BAM files (genome)
  bam/spikein/         filtered BAM files (spikein, ChIP only)
  tracks/              BigWig coverage tracks (CPM, calibrated, fwd/rev for RNA)
  peaks/               MACS2 peak files (ChIP, ATAC)
  pairs/               contact pairs (Hi-C only)
  matrices/            .cool matrices (Hi-C only)
  logs/                per-run log and command files
  tmp/                 temporary files (removed on success unless --keepIntermediate)

Files follow the naming convention <sample>^<operation>^<hash>.<ext> where <hash> is a 6-character alphanumeric string unique to each run.
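
For downstream scripts, this convention can be parsed with a few lines of Python. The sketch below is illustrative only: both functions are hypothetical, the way tinyMapper actually generates its hash is not documented here, and the file name used in the example (`JS001^filtered^a1B2c3.bam`) is made up.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def make_run_hash(length=6):
    """Generate a random alphanumeric string (one possible way to build <hash>)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def parse_output_name(filename):
    """Split '<sample>^<operation>^<hash>.<ext>' into its four fields."""
    sample, operation, rest = filename.split("^")
    # Everything after the first dot is treated as the extension.
    run_hash, _, ext = rest.partition(".")
    return {"sample": sample, "operation": operation, "hash": run_hash, "ext": ext}
```

Grouping files by the `hash` field is a simple way to collect all outputs that belong to the same run.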


Running on a Slurm cluster (e.g. Maestro)

Activate the environment and submit with sbatch:

micromamba activate tinymapper

# Generic
sbatch --mem 40G -c 10 --wrap \
    "tinymapper --mode ChIP --sample <SAMPLE> --genome <GENOME> --output <OUTPUT> --threads 8"

# ChIP examples
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m ChIP -s ~/reads/JS001_IP -g ~/genomes/S288c/S288c --threads 8"
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m ChIP -s ~/reads/JS001_IP -i ~/reads/JS001_input -g ~/genomes/S288c/S288c --threads 8"

# RNA
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m RNA -s ~/reads/JS001 -g ~/genomes/S288c/S288c --threads 8"

# Hi-C
sbatch --mem 40G -c 10 --wrap \
    "tinymapper -m HiC -s ~/reads/JS001 -g ~/genomes/S288c/S288c --threads 8"

tinyMapper.sh preserves the legacy command surface, so existing scripts (e.g. autotinymapper Slurm scripts) keep working unchanged:

sbatch --mem 40G -c 10 --wrap \
    "tinyMapper.sh -m ChIP -s ~/reads/JS001_IP -g ~/genomes/S288c/S288c --threads 8"

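When many samples need to be submitted, the sbatch calls shown in this section can be generated programmatically. The sketch below is a hypothetical helper, not part of tinyMapper; it reuses only the sbatch and tinymapper options shown above. Paths are expanded before quoting because a quoted `~` would not be expanded by the shell.

```python
import os
import shlex

def sbatch_command(mode, sample, genome, output="results",
                   threads=8, cpus=10, mem="40G"):
    """Build an sbatch argument list wrapping a single tinymapper call."""
    inner = [
        "tinymapper",
        "--mode", mode,
        "--sample", os.path.expanduser(sample),
        "--genome", os.path.expanduser(genome),
        "--output", os.path.expanduser(output),
        "--threads", str(threads),
    ]
    # Quote each argument so the wrapped command survives the shell intact.
    wrapped = " ".join(shlex.quote(arg) for arg in inner)
    return ["sbatch", "--mem", mem, "-c", str(cpus), "--wrap", wrapped]
```

The returned list can be passed to `subprocess.run(..., check=True)`, once per sample prefix.
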
Acknowledgments

  • A. Cournac, A. Bignaud & F. Girard for tests.
  • H. Bordelet for sharing her mapping scripts and configuration.
  • L. Meneu for suggesting documentation improvements and reporting bugs.
