A visualisation tool for Hi-C dataset.
Project description
HiCue
HiCue is a command-line tool for extracting, aggregating, and visualising
chromatin interaction data from Hi-C / Micro-C experiments stored in
Cooler (.cool / .mcool) files.
It supports pileup analysis (averaging submatrices centred on a set of genomic positions or regions), overlay of genomic tracks (BigWig), GFF/GTF annotation, and multiple separation strategies (strand, region, chromosome).
Table of contents
- Features
- Requirements
- Installation
- Quick start
- Commands
- Input formats
- Passing Cooler files
- Output structure
- Examples
- Architecture overview
- Known issues & fixes
- License
- Citation
Features
- Extract Hi-C submatrices centred on single genomic loci (BED/GFF) or on pairs of loci (BED2D / loop anchors).
- Compute pileup (aggregate) matrices by median, mean, or sum across all loci in a group.
- Extract and aggregate submatrices of variable genomic size using the
regionscommand, which rescales each submatrix to a common pixel dimension before aggregation. - Overlay BigWig genomic tracks on submatrix figures.
- Automatic P(s) detrending (distance-law normalisation) and optional patch detrending (null-model pileup subtraction).
- Separate results by strand direction, chromosome, or custom genomic regions.
- Multi-resolution support (
.mcoolfiles). - Pass multiple Cooler files at once — directly on the command line, as a comma-separated list, or via a plain-text file listing one path per line.
- Multi-threaded extraction via Python
threading+queue.
Requirements
All dependencies are installed automatically when you install HiCue via pip.
Installation
Option A — pip (recommended)
pip install hicue
It is strongly recommended to install inside a dedicated environment:
python -m venv hicue-env
source hicue-env/bin/activate # Linux / macOS
# hicue-env\Scripts\activate # Windows
pip install hicue
Option B — conda environment
conda create -n hicue python=3.11
conda activate hicue
pip install hicue
Why pip inside conda? Some HiCue dependencies (e.g.
pyBigWig) are not available on the default conda channels for all platforms. Using pip inside a conda environment gives you the isolation of conda with the full package availability of PyPI.
Option C — from source (development)
git clone https://github.com/Mae-4815162342/HiCue.git
cd HiCue
pip install -e .[dev]
Verify
hicue --help
Quick start
# 1 — Install
pip install hicue
# 2 — Pileup centred on loop anchors, 50 kb window, 1 kb resolution
hicue extract results/ anchors.bed experiment.mcool \
--windows 50000 \
--binnings 1000
# 3 — Same analysis on several experiments at once (text-file input)
hicue extract results/ anchors.bed experiments.txt \
--windows 50000 \
--binnings 1000
# 4 — Variable-size region pileup (e.g. TADs)
hicue regions results/ tads.bed experiment.mcool \
--expected_sizes 51 \
--padding 0.5 \
--binnings 5000
Commands
extract
Extract submatrices around fixed-size genomic positions and compute a pileup.
hicue extract OUTPUT_DIR POSITIONS COOL_FILES [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
--pileup / --no-pileup |
flag | on | Compute and display aggregate pileup figures |
--loci / --no-loci |
flag | off | Save individual submatrix figures |
--batch / --no-batch |
flag | off | Save batched submatrix figures (64 per page) |
-w / --windows |
int,… | 30000 | Half-window size(s) in bp |
-b / --binnings |
int,… | 1000 | Bin size(s) in bp (.mcool only) |
-d / --detrending |
choice | none |
none, ps (distance law), or patch (null-model subtraction) |
-m / --method |
choice | median |
Aggregation method: median, mean, or sum |
-f / --flip |
flag | off | Strand-normalise matrices (flip reverse-strand loci) |
-r / --raw |
flag | off | Use raw (unbalanced) contact counts |
--loops |
flag | off | Treat positions as loop anchors (BED2D mode) |
--trans |
flag | off | Include trans-chromosomal contacts in loop mode |
--min_dist |
int | 30000 | Minimum distance in bp between paired loci |
--diag_mask |
int | 0 | Mask the diagonal up to this distance (bp) |
--separate_by |
str,… | — | Separate results by direction, regions, or chroms |
--separation_regions |
path | — | CSV file defining regions for --separate_by regions |
--gff |
path | — | Annotate positions with a GFF/GTF file |
--tracks |
path | — | Annotate positions with a BigWig file |
--center |
choice | start |
Window anchor: start, center, or end of each feature |
--display_sense |
choice | forward |
Axis orientation: forward or reverse |
--display_strand |
flag | off | Overlay transcription-direction arrows on figures |
--cmap_color |
str | seismic |
Matplotlib colormap for pileup figures |
--cmap_limits |
float float | — | Fixed min/max for the pileup colormap |
--indiv_cmap_color |
str | afmhot_r |
Colormap for individual submatrix figures |
--indiv_cmap_limits |
float float | — | Fixed min/max for individual figures |
--format |
str,… | pdf |
Output figure format(s) (e.g. pdf,png) |
--threads |
int | 8 | Number of worker threads |
--nb_pos |
int | 2 | Random positions per locus for patch detrending |
--rand_max_dist |
int | 100000 | Max distance (bp) for random position selection |
--random_jitter |
int | 0 | Allowed jitter (bp) on random pair distances |
--random_path |
str | — | Path prefix to pre-computed random positions |
--circulars |
str,… | none |
Chromosomes to treat as circular |
--ps_all_chrom |
bool | True | Use all cis contacts for P(s) estimation |
--contact_range |
int int int | — | MIN MAX STEP (bp) for distance-based separation |
--overlap |
choice | strict |
Position-interval overlap: strict or flex |
--record_type |
choice | — | GFF record type to select (e.g. gene) |
--track_unit |
str | "" |
Label for the BigWig signal axis |
--save_tmp |
flag | off | Save intermediate CSV files to OUTPUT_DIR |
tracks
Equivalent to extract but derives genomic positions directly from peaks
detected in a BigWig signal file.
hicue tracks OUTPUT_DIR TRACK_FILE COOL_FILES [OPTIONS]
All options from extract are supported. Additional options:
| Option | Type | Default | Description |
|---|---|---|---|
-t / --threshold |
choice+float | — | min VALUE keeps positions above VALUE; max VALUE keeps positions below |
-p / --percentage |
choice+int | — | high N keeps the top N%, low N keeps the bottom N% of positions by signal |
--min_sep |
int | 1000 | Minimum distance in bp between two retained peaks |
--positions |
path | — | Restrict peak selection to positions in this BED/GFF file |
--gff_type |
str | "" |
Feature type to select when --positions is a GFF file |
regions
Extract and aggregate submatrices of variable genomic size, resizing each one to a common pixel dimension before computing the pileup. Designed for features with different lengths (TADs, genes, compartments).
hicue regions OUTPUT_DIR POSITIONS COOL_FILES [OPTIONS]
All options from extract are supported. Additional options:
| Option | Type | Default | Description |
|---|---|---|---|
-p / --padding |
float | 1.0 | Padding ratio added on each side of the region (e.g. 0.5 adds half the region size) |
-s / --min_region_size |
int | 20000 | Minimum region size in bp; smaller regions are skipped |
-e / --expected_sizes |
int,… | 51 | Target pixel dimension(s) for resizing (e.g. 20,51) |
Note: Small regions combined with a large bin size can introduce strong biases in the pileup. Use
--min_region_sizeto filter them out and prefer a bin size that yields at least ~5 bins per region.
Input formats
| Format | Extension(s) | Notes |
|---|---|---|
| BED | .bed |
3-column minimum. Column 4 = name, column 6 = strand (+/-). Strand required for --flip and --separate_by direction. |
| BED2D / BEDPE | .bed2d, .bedpe |
6-column: chrom1, start1, end1, chrom2, start2, end2. Used with --loops. |
| GFF / GTF | .gff, .gtf |
Gene/feature annotations for position extraction or labelling. |
| Cooler (single) | .cool |
Single-resolution matrix. --binnings is ignored. |
| Cooler (multi) | .mcool |
Multi-resolution matrix. Select resolution(s) with --binnings. |
| BigWig | .bw |
Continuous signal track. Primary input for tracks, or annotation overlay in extract/regions. |
| Regions CSV | .csv |
Comma-separated: Id,Chromosome,Start,End. Used with --separate_by regions. Multiple rows with the same Id define a discontinuous interval. |
Passing Cooler files
The COOL_FILES argument is flexible and accepts three forms:
1. A single file directly:
hicue extract results/ anchors.bed experiment.mcool --binnings 1000
2. A comma-separated list of files:
hicue extract results/ anchors.bed control.mcool,treated.mcool --binnings 1000
3. A plain-text file listing one Cooler path per line:
# experiments.txt
/data/project/control.mcool
/data/project/treated.mcool
/data/project/recovery.mcool
hicue extract results/ anchors.bed experiments.txt --binnings 1000
All three forms work identically with extract, tracks, and regions.
Each Cooler file produces its own sub-directory inside OUTPUT_DIR.
Output structure
OUTPUT_DIR/
└── {cool_name}/
└── {sep_id}/
└── binning_{binning}/
├── individual_{window}kb_window/
│ ├── GeneName.pdf ← per-locus submatrix
│ └── …
├── batched_{window}kb_window/
│ ├── batch#1.pdf ← 64-panel batch figure
│ ├── batch#1_references.csv
│ └── …
└── pileup_{window}kb_window.pdf
A timestamped log file (YYYYMMDD_HHMMSS_log.txt) recording the exact command
and all option values is written to OUTPUT_DIR for every run.
Intermediate CSV files (positions, formatted pairs, random positions for patch
detrending) can be saved with --save_tmp and reused in subsequent runs via
--random_path.
Examples
Fixed-window pileup on gene TSS
hicue extract results/tss/ genes.bed experiment.mcool \
--windows 50000 \
--binnings 1000 \
--center start
Strand-aware pileup with individual figures
hicue extract results/tss_stranded/ genes.bed experiment.mcool \
--windows 50000 \
--binnings 1000 \
--center start \
--flip \
--loci
Loop pileup with P(s) detrending, multiple windows
hicue extract results/loops/ loops.bed2d control.mcool treated.mcool \
--loops \
--detrending ps \
--windows 50000,100000 \
--binnings 1000 \
--format pdf,png
Patch detrending — generate then reuse random positions
# First run: compute and save random control positions
hicue extract results/ctrl/ loops.bed2d control.mcool \
--loops \
--detrending patch \
--nb_pos 3 \
--save_tmp
# Second run: reuse the same random positions for a paired comparison
hicue extract results/treated/ loops.bed2d treated.mcool \
--loops \
--detrending patch \
--nb_pos 3 \
--random_path results/ctrl/loops
Pileup from BigWig peaks
# Keep the top 20 % of ChIP-seq signal bins, minimum 5 kb between peaks
hicue tracks results/chip/ H3K27ac.bw experiment.mcool \
--percentage high 20 \
--min_sep 5000 \
--windows 50000 \
--binnings 1000
Variable-size region pileup (TADs)
hicue regions results/tads/ tads.bed experiment.mcool \
--expected_sizes 51 \
--padding 0.5 \
--binnings 5000
Multiple resolutions from a file list
hicue regions results/tads_multi/ tads.bed experiments.txt \
--expected_sizes 20,51 \
--binnings 5000,10000 \
--padding 1.0
Separating results by chromosome
hicue extract results/by_chrom/ genes.bed experiment.mcool \
--windows 50000 \
--binnings 5000 \
--separate_by chroms
Architecture overview
HiCue uses a multi-threaded producer–consumer pipeline:
FileStreamer ──► Parser(s) ──► Annotator(s) ──► positions DataFrame
└──► pairing Queue
pairing Queue ──► Separator(s) ──► PairFormater(s) ──► formated_pairs DataFrame
formated_pairs ──► Extracter ──► SubmatrixFormater(s) ──► Aggregator(s) ──► Pileup
└──► DisplayBatch / Display (async rendering)
Pileup Queue ──► Display (pileup rendering)
All inter-stage communication uses queue.Queue with a "DONE" sentinel.
Display workers (in classes/AsyncDisplays.py) run async matplotlib
functions on a dedicated per-thread event loop.
Known issues & fixes
Blank / corrupted figures with async display workers
Symptom: Output PDF/PNG files are blank, show the wrong data, or the
process crashes with a RuntimeError from the Agg renderer.
Root cause: The original code used asyncio.run() inside each
Display/DisplayBatch thread. asyncio.run() creates and destroys an
event loop on every call. The teardown of one loop races with plt.close()
called at the end of the previous coroutine, corrupting matplotlib's internal
"current figure" singleton.
Fix (applied in classes/AsyncDisplays.py):
Each worker thread now creates one persistent event loop
(asyncio.new_event_loop()) on construction and reuses it for every
rendering call via loop.run_until_complete(coro). The loop is closed in an
overridden join() method.
License
CC BY-NC 4.0 – see LICENSE for details.
Acknowledgments
A. Cournac for project supervision and tests. J. Serizay for primary documentation and test implementation. M. Perrot for provided data.
Citation
If you use HiCue in your research, please cite:
[Citation to be added upon publication]
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file hicue-0.5.0.tar.gz.
File metadata
- Download URL: hicue-0.5.0.tar.gz
- Upload date:
- Size: 277.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
eb165cf6e757da0a54d0702bcca95f43dbf67550199949de4a4e93a9b2107334
|
|
| MD5 |
14843af557500151750c1ca0211a41f2
|
|
| BLAKE2b-256 |
6c39a204f648555096bffbc23041e82e2f1ab6f0f7255bf0d26e449ee579c0c8
|
File details
Details for the file hicue-0.5.0-py3-none-any.whl.
File metadata
- Download URL: hicue-0.5.0-py3-none-any.whl
- Upload date:
- Size: 80.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
747c995d29157bd460a176174a38066c562290030dd60e09e2dfeee0e72785e9
|
|
| MD5 |
70e033ae610e0037637b28bfcb747c1c
|
|
| BLAKE2b-256 |
446c9402baee36384a5486ef4e69c50a7630613157b810e4b208e5c0633acd8c
|