Hi-C analysis, snakemake, sequana, container, reproducibility
Project description
This is the Hi-C pipeline from the Sequana project.
- Overview:
Hi-C pipeline to capture 3D chromatin interactions in a genome
- Input:
Paired FastQ files and a reference genome in FASTA format
- Output:
Cooler contact matrices, Hi-C QC reports, and a MultiQC summary
- Status:
Beta
- Citation:
Cokelaer et al. (2017), 'Sequana': a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, JOSS DOI https://doi.org/10.21105/joss.00352
Installation
If you already have all requirements, install the package with pip:
pip install sequana_hic --upgrade
The pipeline also needs third-party tools (see Requirements below). Using Apptainer images avoids installing them locally.
Usage
Set up the pipeline directory with your input data and reference:
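sequana_hic --input-directory DATAPATH --reference-file genome.fa
To split the FastQ files before alignment (this requires seqkit), select the bwa_split aligner instead:
sequana_hic --input-directory DATAPATH --reference-file genome.fa --aligner-choice bwa_split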
This creates a hic/ directory containing the pipeline and configuration file. Execute the pipeline locally:
cd hic
sh hic.sh
See .sequana/profile/config.yaml to tune Snakemake behaviour (cores, cluster settings, etc.).
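Before setting up the pipeline directory, it can help to confirm that every forward read file in DATAPATH has a mate. The sketch below is only an illustration: the helper name check_pairs is hypothetical, and the file naming it assumes (an _R1_ / _R2_ tag and a .fastq.gz extension) may not match your data or the pipeline's input pattern.

```shell
# Report R1 files in a directory that lack a matching R2 mate.
# Assumption: files are named like sample_R1_001.fastq.gz / sample_R2_001.fastq.gz.
check_pairs() {
    dir=$1
    missing=0
    for r1 in "$dir"/*_R1_*.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        # Swap the read tag in the file name (not in the directory part).
        r2=$(printf '%s\n' "$r1" | sed 's/_R1_\([^/]*\)$/_R2_\1/')
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every R1 file has its R2 mate.
    [ "$missing" -eq 0 ]
}

# Usage: check_pairs DATAPATH
```

The function prints one line per unpaired file and returns a non-zero status if any mate is missing, so it can be chained with the sequana_hic command using &&.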
Usage with Apptainer
With Apptainer, initiate the working directory as follows:
sequana_hic --input-directory DATAPATH --reference-file genome.fa --use-apptainer
Images can be stored in a shared location:
sequana_hic --input-directory DATAPATH --reference-file genome.fa --use-apptainer --apptainer-prefix ~/.sequana/apptainers
then:
cd hic
sh hic.sh
If running Snakemake manually, add apptainer options:
snakemake -s hic.rules --cores 4 --use-apptainer --apptainer-prefix ~/.sequana/apptainers --apptainer-args "-B /home:/home"
By default the home directory is already bound. Additional paths can be set via:
export APPTAINER_BINDPATH="/pasteur"
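When several directories must be visible inside the container, the Apptainer documentation describes this variable as a comma-separated list of bind specifications (src or src:dest). The following is a minimal sketch, not part of the pipeline: bind_existing is a hypothetical helper, and /pasteur and /scratch are example paths only.

```shell
# Join the directories that actually exist into a comma-separated list.
# bind_existing is a hypothetical helper written for this example.
bind_existing() {
    out=""
    for d in "$@"; do
        if [ -d "$d" ]; then
            out="${out:+$out,}$d"
        fi
    done
    printf '%s\n' "$out"
}

# Example paths; replace them with the mount points of your own cluster.
APPTAINER_BINDPATH=$(bind_existing /pasteur /scratch)
export APPTAINER_BINDPATH
```

Skipping directories that do not exist keeps the same setup line usable on machines where only some of the mount points are present.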
Requirements
This pipeline requires the following executables (install via bioconda/conda):
- bwa: short-read aligner (default mapper)
- samtools: BAM/SAM manipulation
- pairtools: processing of Hi-C read pairs
- cooler: storage and analysis of Hi-C contact matrices
- qc3c: Hi-C quality control
- fastqc: raw read quality control
- multiqc: aggregate QC reports
Optional:
- chromap: fast Hi-C aligner (experimental, use --aligner-choice chromap)
- seqkit: split FastQ files (required for --aligner-choice bwa_split)
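If you install the tools yourself rather than using Apptainer, a quick check that they are on your PATH can save a failed run. This is a minimal sketch: check_tools is a hypothetical helper, and the executable names in the usage line are assumed from the list above (in particular, the qc3C executable may be spelled with a capital C depending on the installation).

```shell
# Print each requested executable that is not found on PATH.
# Returns 0 only when all of them are available.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (names assumed from the requirements list):
# check_tools bwa samtools pairtools cooler qc3C fastqc multiqc
```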
Pipeline description
- FastQC: quality control on raw reads
- Reference indexing: BWA index build from the provided FASTA reference
- Alignment: BWA-MEM alignment with Hi-C-specific options (-5SP), producing sorted BAM files
- Pairtools: parse alignments into Hi-C contact pairs, sort, deduplicate, and split
- Cooler: load pairs into a contact matrix and generate a multi-resolution .mcool file
- qc3C: Hi-C library quality assessment (ligation efficiency, distance distribution)
- Visualisation: contact matrix PNG at 5 kb resolution
- MultiQC: aggregated QC report
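To give a feel for what the indexing, alignment, pairtools and cooler steps involve, the sketch below prints a hand-written approximation of the corresponding commands without running them. It is not taken from the pipeline's rules: the helper hic_sketch, the file names, the thread count, the 5000 bp bin size and the exact options are all assumptions, and the QC, pairtools split and visualisation steps are left out.

```shell
# Print (do not run) an approximate command sequence for one sample.
# All file names and options are illustrative assumptions, not the
# pipeline's actual rules.
hic_sketch() {
    sample=$1
    ref=$2
    threads=${3:-4}
    binsize=${4:-5000}
    cat <<EOF
bwa index $ref
samtools faidx $ref
cut -f1,2 $ref.fai > chrom.sizes
bwa mem -5SP -t $threads $ref ${sample}_R1.fastq.gz ${sample}_R2.fastq.gz | pairtools parse -c chrom.sizes -o $sample.pairs.gz
pairtools sort --nproc $threads -o $sample.sorted.pairs.gz $sample.pairs.gz
pairtools dedup -o $sample.dedup.pairs.gz $sample.sorted.pairs.gz
cooler cload pairs -c1 2 -p1 3 -c2 4 -p2 5 chrom.sizes:$binsize $sample.dedup.pairs.gz $sample.cool
cooler zoomify -o $sample.mcool $sample.cool
EOF
}

# Usage: hic_sketch sampleA genome.fa 8
```

Printing the commands rather than executing them makes it possible to review each step, or adapt the options, before running anything by hand. For real analyses the generated hic.sh script remains the intended entry point.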
Changelog
| Version | Description |
|---|---|
| 0.1.0 | Migration to modern sequana_pipetools framework (get_shell/get_run, schema validation, apptainer support, Python 3.10+). |
| 0.0.1 | First release. |
Contribute & Code of Conduct
To contribute to this project, please take a look at the Contributing Guidelines first. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.