Compartmental Refinement for Ultraprecise Stratification in Hi-C — A/B chromatin compartment analysis tool

CRUSH — Compartmental Refinement for Ultraprecise Stratification in Hi-C

CRUSH Logo

Python 3.8+ Platform

CRUSH (Compartmental Refinement for Ultraprecise Stratification in Hi-C) is a command-line tool that identifies fine-scale A/B chromatin compartments from Hi-C contact matrices. It has been used to identify compartments in Hi-C, Micro-C, and single-cell Hi-C data, and it is designed to call compartments at high resolution from substantially lower read depth than other compartment-calling tools require.

Manuscript in preparation — JRowleyLab, PI: Jordan Rowley

How It Works
Installation
Quick Start
Input Files
Output Files
Key Parameters
Test Dataset
Dependencies
Citation
Contact

How It Works

CRUSH workflow diagram

At its core, CRUSH asks a simple question for every genomic bin: does this bin interact more with A-type regions (iA) or B-type regions (iB)?

The algorithm walks from coarse resolutions down to your target resolution, using each level to refine A/B compartment assignments at the next finer level:

Eigenvector initialization — Computes principal components of the Hi-C contact matrix (or accepts a user-supplied eigenvector) to define initial A (iA) and B (iB) states.
CRUSH score calculation — At each resolution, calculates a Genome Interaction (GI) score per bin reflecting how much more it contacts iA regions versus iB regions.
Compartment reclassification — After each resolution pass, A/B bin assignments are updated based on the new scores, then used to seed the next finer resolution.
Resolution walking with midpoint shifting — A rolling-window alignment step adjusts finer-resolution scores against the coarser baseline, removing systematic biases between resolution levels.
Statistical filtering — Applies Benjamini–Hochberg FDR correction and outputs a q-value filtered bedGraph.
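
The Benjamini-Hochberg correction named in the last step is a standard procedure, and a minimal pure-Python sketch of it is given here for orientation. This is not CRUSH's own code (the dependency table lists statsmodels for FDR correction); the function name `bh_qvalues` and the example p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical per-bin p-values, filtered at the default threshold of 0.05.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = bh_qvalues(pvals)
passing = [q <= 0.05 for q in qvals]
print(qvals, passing)
```

Only bins whose q-value is at or below the threshold would be kept in a filtered track, which is why a filtered file can contain far fewer bins than the unfiltered one.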

A compartments → positive CRUSH score (gene-rich, open chromatin, active transcription)
B compartments → negative CRUSH score (gene-poor, closed chromatin, transcriptionally silent)

Unlike eigenvector values, CRUSH scores never need to be flipped: A is always positive and B is always negative.
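
The fixed sign convention can be illustrated with a toy calculation. The sketch below rests on an assumption made only for illustration: it scores one bin as the log2 ratio of its mean contact with iA bins to its mean contact with iB bins. The formula CRUSH actually uses may differ, and the names `toy_score`, `contacts`, `a_bins`, `b_bins`, and `pseudocount` are hypothetical.

```python
import math


def toy_score(contacts, a_bins, b_bins, pseudocount=1.0):
    """Log2 ratio of mean contact with A-state bins vs. B-state bins.

    contacts: contact counts between one bin and every other bin.
    a_bins, b_bins: indices of the bins currently labelled iA and iB.
    """
    mean_a = sum(contacts[i] for i in a_bins) / len(a_bins)
    mean_b = sum(contacts[i] for i in b_bins) / len(b_bins)
    # The pseudocount (an assumption) avoids log(0) and division by zero.
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


# Made-up contacts of one bin with six others; bins 0-2 are iA, 3-5 are iB.
contacts = [9, 7, 8, 2, 1, 3]
score = toy_score(contacts, a_bins=[0, 1, 2], b_bins=[3, 4, 5])
label = "A" if score > 0 else "B"
print(round(score, 3), label)
```

Because the ratio is always taken as A over B, a bin that contacts iA regions more gets a positive value regardless of the input data, which is the property that makes sign flipping unnecessary.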

Installation

pip install CRUSH-hic

We recommend setting up a dedicated conda environment:

conda create -n crush_env python=3.10
conda activate crush_env
conda install -c bioconda bedtools
pip install CRUSH-hic hic-straw cooler numpy scipy pandas statsmodels tqdm

Dependencies

Tool	Purpose	Install
Python ≥ 3.8	Runtime	python.org
bedtools	Genomic intersections	`conda install -c bioconda bedtools`
mawk	Fast text processing	`sudo apt install mawk` / `brew install mawk`
hic-straw	Read `.hic` files	`pip install hic-straw`
cooler	Read `.mcool` files	`pip install cooler`
numpy / scipy / pandas	Numerical computing	`pip install numpy scipy pandas`
statsmodels	FDR correction	`pip install statsmodels`
tqdm	Progress bars	`pip install tqdm`

Verify installation

crush --help

Quick Start

With genome build shortcut (supported builds: `hg19`, `hg38`, `mm10`, `mm9`; res ≥ 500 bp)

crush \
  -i data.hic \
  -gb hg38 \
  -r 10000 \
  -c 8 \
  -o output_prefix_

With manual reference files (any genome, any resolution)

crush \
  -i data.hic \
  -g hg38.sizes \
  -a hg38_genes.bed \
  -b hg38.fa \
  -r 10000 \
  -c 8 \
  -o output_prefix_

Chromosome naming: CRUSH automatically detects and converts chromosome prefix mismatches between your Hi-C file and reference files (e.g., chr1 vs 1). If output is empty or unexpected, verify that your Hi-C file itself uses a consistent naming convention throughout.
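
CRUSH handles the prefix conversion itself, so nothing needs to be done by hand. For readers debugging empty output, the sketch below shows the kind of mismatch involved and a quick check for mixed naming within one file. It is an illustration, not CRUSH's code, and the helper name `match_prefix` is hypothetical.

```python
def match_prefix(name, reference_names):
    """Rewrite a chromosome name to follow the 'chr' convention of reference_names."""
    ref_has_chr = any(n.startswith("chr") for n in reference_names)
    has_chr = name.startswith("chr")
    if ref_has_chr and not has_chr:
        return "chr" + name
    if not ref_has_chr and has_chr:
        return name[len("chr"):]
    return name


# Example: the Hi-C file uses bare names, the reference files use 'chr'.
hic_names = ["1", "2", "X"]
ref_names = ["chr1", "chr2", "chrX"]
converted = [match_prefix(n, ref_names) for n in hic_names]

# A file mixing both conventions is the case worth checking manually.
mixed = len({n.startswith("chr") for n in hic_names}) > 1
print(converted, mixed)
```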

Input Files

Always required

Flag	Description
`-i`	Hi-C file (`.hic` from Juicer or `.mcool` from cooler). Local path or HTTPS URL.
`-r`	Target resolution in base pairs (e.g., `10000` for 10 kb). Must exist in your Hi-C file.

Reference files — choose one of two paths

PATH A — genome build shortcut (res ≥ 500 bp only)

Flag	Description
`-gb`	Genome build shortcut. Supported builds: `hg19`, `hg38`, `mm10`, `mm9`. Auto-downloads chr.sizes, genes.bed, and Bbins.bed from JRowleyLab GitHub. Not available for res < 500 bp because the hosted Bbins.bed was pre-computed at 500 bp — for sub-500 bp analysis supply `-g`, `-a`, and `-b` (FASTA) manually so CRUSH can recompute Bbins at your exact resolution. Explicit `-g`/`-a`/`-b` flags override the auto-download for that specific file.

PATH B — manual reference files (any genome, any resolution)

Flag	Description
`-g`	Chromosome sizes file — two tab-separated columns: `chr_name` and `size` (bp). No header.
`-a`	BED file (≥ 3 columns) for A-compartment initialization. Gene annotations work well. ChIP-seq peaks for an active histone mark (e.g., H3K27ac) also work.
`-b`	Genome FASTA or pre-computed Bbins BED for B-compartment initialization. With FASTA, CRUSH generates Bbins at 500 bp (res ≥ 500 bp) or at the input resolution (res < 500 bp). With BED, the file is used directly as B-compartment seeds.
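
Before a long run it can help to confirm that the `-g` file matches the two-column, tab-separated, header-free layout described above. The following is a minimal sketch of such a check; `read_chrom_sizes` is a hypothetical helper and not part of CRUSH.

```python
import io


def read_chrom_sizes(handle):
    """Parse 'name<TAB>size' lines into a dict, rejecting headers and bad rows."""
    sizes = {}
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 2 or not fields[1].isdigit():
            raise ValueError(
                f"line {line_number}: expected 'name<TAB>size', got {line!r}"
            )
        sizes[fields[0]] = int(fields[1])
    return sizes


# In-memory stand-in for a sizes file (hg19 lengths for chr17-19).
example = io.StringIO("chr17\t81195210\nchr18\t78077248\nchr19\t59128983\n")
sizes = read_chrom_sizes(example)
print(sizes)
```

With a real file, the same function can be called as `read_chrom_sizes(open("hg38.sizes"))`; a header row or space-separated columns would raise a `ValueError` naming the offending line.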

Optional

Flag	Description
`-e`	Pre-computed eigenvector bedGraph (4 columns: chr, start, end, value). Positive = A, Negative = B. Skips automatic eigenvector calculation.

Output Files

CRUSH produces four output files, each prefixed with whatever you supply via -o:

File	Description
`{prefix}CRUSHparamters.txt`	Record of all parameters used. Keep this for reproducibility.
`{prefix}mergedCrush_{res}.bedgraph`	Main output. CRUSH scores for every bin. Positive = A compartment, Negative = B compartment. Unlike eigenvectors, scores never need to be flipped.
`{prefix}mergedqvalue_{res}.bedgraph`	Estimated q-value (BH-corrected) for each bin's score.
`{prefix}mergedCrush_{res}_qfiltered_reprocess.bedgraph`	CRUSH scores filtered to bins passing the q-value threshold. Note: this filter can be overly stringent — excellent results are often obtained from the unfiltered `mergedCrush` file.

All bedGraph files include a UCSC track header for direct loading into genome browsers (IGV, UCSC, WashU).
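
Because of the track header, the output files need one extra step when read programmatically. The sketch below skips header lines and counts A and B bins by score sign. The helper `read_bedgraph` is hypothetical, and the example rows are made-up values rather than real CRUSH output.

```python
import io


def read_bedgraph(handle):
    """Yield (chrom, start, end, score), skipping track/browser/comment lines."""
    for line in handle:
        if line.startswith(("track", "browser", "#")) or not line.strip():
            continue
        chrom, start, end, value = line.split()[:4]
        yield chrom, int(start), int(end), float(value)


# In-memory stand-in for a mergedCrush bedGraph with a UCSC track header.
example = io.StringIO(
    'track type=bedGraph name="example"\n'
    "chr17\t0\t10000\t0.82\n"
    "chr17\t10000\t20000\t-0.35\n"
    "chr17\t20000\t30000\t-1.10\n"
)
bins = list(read_bedgraph(example))
n_a = sum(1 for _, _, _, score in bins if score > 0)  # positive = A
n_b = sum(1 for _, _, _, score in bins if score < 0)  # negative = B
print(n_a, n_b)
```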

While running, CRUSH creates a temporary working directory named CRUSHtmp_[randomnumber] in your current directory. This is removed automatically when the run completes. To keep it (e.g., for debugging), use -C 0. You can also name it yourself with -f.

Key Parameters

Flag	Default	Description
`-c`	`1`	Number of CPU threads. Set to the number of chromosomes or the number of available cores, whichever is smaller.
`-gb`	(none)	Genome build shortcut (`hg19`, `hg38`, `mm10`, `mm9`). Auto-downloads reference files. res ≥ 500 bp only.
`-o`	(none)	Output file prefix.
`-N`	`NONE`	Normalization: `NONE`, `VC`, `VC_SQRT`, `KR`, `SCALE`.
`-m`	`2500000`	Coarsest resolution to start walking from.
`-Z`	`100000`	Resolution for eigenvector calculation (100 kb recommended).
`-w`	`5`	Sliding window size (kb) for score averaging. Set to `1` to disable. Set to `0` for legacy auto-calculation from sequencing depth.
`-q`	`0.05`	Q-value threshold for filtered output. Set to `0` to disable filtering.
`-s`	`0`	Enable boundary smoothing (`1` = on).
`-A`	`0`	Adjust score distribution. Do not use when comparing samples.
`-C`	`1`	Clean up temp files after run (`0` = keep).
`-v`	`0`	Verbose output (`1` = on).

For the complete parameter reference, see the User Manual.
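
The guidance for `-c` (the smaller of the chromosome count and the available cores) is simple to compute in a wrapper script. A minimal sketch follows; `suggested_threads` is a hypothetical helper, not a CRUSH function.

```python
import os


def suggested_threads(n_chromosomes, available_cores=None):
    """Return min(chromosomes, cores), never less than 1."""
    if available_cores is None:
        # os.cpu_count() can return None on some platforms.
        available_cores = os.cpu_count() or 1
    return max(1, min(n_chromosomes, available_cores))


# Three chromosomes on an 8-core machine: more than 3 threads would sit idle.
threads = suggested_threads(n_chromosomes=3, available_cores=8)
print(threads)
```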

Test Dataset

A small test dataset covering chromosomes 17–19 of hg19 is provided in examples/TestData/:

File	Description
`hg19_c17_18_19_1kb.hic.gz`	Hi-C contact file
`hg19_c17_18_19_genes.bed.gz`	Gene annotations for A-state initialization
`hg19_c17_18.fa.gz`	Genome FASTA for GC-based B-state initialization
`hg19_c17_18.fa.fai`	FASTA index
`hg19_c17_18_19.sizes.gz`	Chromosome sizes
`Eigen_100kb_c17_18_19.bedgraph.gz`	Pre-computed eigenvector (optional `-e` input)
`Bbins_hg19_c17_18_19.bed.gz`	Pre-computed B-bins (alternative to FASTA for `-b`)

Run the test

# Decompress
gunzip examples/TestData/*.gz

# Run with FASTA-based B initialization
crush \
  -i examples/TestData/hg19_c17_18_19_1kb.hic \
  -g examples/TestData/hg19_c17_18_19.sizes \
  -a examples/TestData/hg19_c17_18_19_genes.bed \
  -b examples/TestData/hg19_c17_18.fa \
  -r 10000 \
  -c 4 \
  -o test_

Expected output: test_CRUSHparamters.txt, test_mergedCrush_10000.bedgraph, test_mergedqvalue_10000.bedgraph, and test_mergedCrush_10000_qfiltered_reprocess.bedgraph.

Load test_mergedCrush_10000.bedgraph into IGV or the UCSC browser to verify the A/B compartment pattern on chr17–19.
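
A quick scripted check that the run produced its files can precede the visual inspection. The sketch below builds the file names from the patterns in the Output Files table and reports any that are missing from the current directory. `expected_outputs` is a hypothetical helper.

```python
from pathlib import Path


def expected_outputs(prefix, resolution):
    """File names CRUSH is documented to write for a given -o prefix and -r."""
    return [
        # Spelling of 'paramters' copied as-is from the Output Files table.
        f"{prefix}CRUSHparamters.txt",
        f"{prefix}mergedCrush_{resolution}.bedgraph",
        f"{prefix}mergedqvalue_{resolution}.bedgraph",
        f"{prefix}mergedCrush_{resolution}_qfiltered_reprocess.bedgraph",
    ]


names = expected_outputs("test_", 10000)
missing = [name for name in names if not Path(name).exists()]
print("missing:", missing)
```

An empty `missing` list means all four files are present; it says nothing about whether their contents are sensible, which is what the browser check is for.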

Citation

Manuscript in preparation. If you use CRUSH in your research, please check back for the citation or contact us directly.

Contact

JRowleyLab | PI: Jordan Rowley
For questions, bug reports, or feature requests, please open a GitHub Issue.

Release history

1.0.1 (Jun 10, 2026)
1.0.0 (Jun 10, 2026)