HiNT -- Hi-C for copy number variations and translocations detection
HiNT
A computational method for detecting copy number variations and translocations from Hi-C data
Summary
HiNT (Hi-C for copy Number variation and Translocation detection) is a computational method to detect CNVs and translocations from Hi-C data. HiNT has three main components: HiNT-PRE, HiNT-CNV, and HiNT-TL. HiNT-PRE preprocesses Hi-C data and computes the contact matrix, which stores contact frequencies between any two genomic loci. HiNT-CNV and HiNT-TL both start from the Hi-C contact matrix and predict copy number segments and inter-chromosomal translocations, respectively.
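To make the idea of a contact matrix concrete, here is a minimal toy sketch in plain Python. It is not HiNT's implementation: the function name `contact_matrix`, the single toy chromosome, and the fixed bin size are all assumptions made for illustration. It only shows how read-pair positions can be binned into a symmetric matrix of contact counts.

```python
from collections import Counter


def contact_matrix(pairs, bin_size, n_bins):
    """Count contacts between fixed-size bins on one toy chromosome.

    `pairs` is an iterable of (pos1, pos2) read-pair positions.
    This is an illustrative helper, not part of HiNT.
    """
    counts = Counter()
    for pos1, pos2 in pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        # Store each contact once, with the smaller bin index first.
        counts[(min(i, j), max(i, j))] += 1

    # Expand the counts into a dense symmetric matrix.
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for (i, j), c in counts.items():
        matrix[i][j] = c
        matrix[j][i] = c
    return matrix


# Four made-up read pairs on a 300 bp toy chromosome, 100 bp bins.
toy_pairs = [(10, 120), (15, 130), (210, 20), (250, 260)]
matrix = contact_matrix(toy_pairs, bin_size=100, n_bins=3)
for row in matrix:
    print(row)
```

Real Hi-C matrices are genome-wide, sparse, and normalized, which is why HiNT-PRE stores them in cool or hic format rather than as dense lists.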
Overview of HiNT workflow:
Installation
Dependencies
R and R packages
Python and Python packages
- python >= 3.5
- pypairix >= 0.3.0, cooler >= 0.7.4, pairtools >= 0.2.2, numpy, scipy, pandas, sklearn, multiprocessing
Java and related tools (optional: required only if you want to process Hi-C data with Juicer tools)
Perl
Other dependencies
- samtools (1.3.1+)
- BIC-seq2 (0.7.3) ! This is optional: if you don't want to run HiNT-CNV, you don't need this package. No installation is needed: just download BICseq2, unzip it, and give HiNT the path where you stored it.
- bwa (0.7.16+) ! This is optional: required only when your input is fastq
- tabix (0.2.6)
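Before installing HiNT, it can help to check which command-line dependencies are already on your `PATH`. The sketch below is a convenience check, not part of HiNT. The helper name `missing_tools` is hypothetical, and the executable names (`samtools`, `tabix`, `perl`, `Rscript`, `bwa`, `java`) are the usual ones for these tools but are assumptions about your system. It does not check versions.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumed executable names; adjust them if your installation differs.
required = ["samtools", "tabix", "perl", "Rscript"]
optional = ["bwa", "java"]  # bwa: fastq input only; java: Juicer tools only

print("Missing required tools:", missing_tools(required))
print("Missing optional tools:", missing_tools(optional))
```

BIC-seq2 is not included here because it is downloaded and unzipped rather than installed on `PATH`.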
Install HiNT
- Method 1: Install from PyPI using pip.
$ pip install HiNT-Package
- Method 2: Install using conda (highly recommended)
$ conda install hint
- Method 3: Install manually
- Install HiNT dependencies
- Download HiNT
$ git clone https://github.com/parklab/HiNT.git
- Go to the HiNT directory and install it with
$ python setup.py install
*** Type $ hint to test whether HiNT was installed successfully
Download reference files used in HiNT
- Download HiNT references HERE. Only hg19, hg38 and mm10 are currently available. Unzip the archive
$ unzip hg19.zip
- Put the reference files into the HiNT directory
$ mv hg19/* where_you_put_HiNT/HiNT/HiNT/references/
Quick Start
- Download the test datasets from HERE
HiNT-PRE
HiNT pre: preprocessing of Hi-C data. HiNT pre performs alignment, contact matrix creation, and normalization in a single command.
$ hint pre -d /path/to/hic_1.fastq.gz,/path/to/hic_2.fastq.gz -i /path/to/bwaIndex --informat fastq --outformat cooler -g hg19 -n test -o /path/to/outputdir --pairsampath /path/to/pairsamtools
see details and more options
$ hint pre -h
HiNT-CNV
HiNT cnv: prediction of copy number information, as well as segmentation from Hi-C.
$ hint cnv -m contactMatrix.mcool -f cooler -r 50 -g hg19 -n test -o /path/to/outputDir
see details and more options
$ hint cnv -h
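If you want to script the two steps above, one option is to assemble the command lines in Python and pass them to `subprocess`. The sketch below only builds and prints the commands. The helper names `build_pre_cmd` and `build_cnv_cmd` are hypothetical, the flags are copied from the examples in this README, and the assumption that HiNT-PRE writes `<name>.mcool` into its output directory comes from the output listing in this README. Check `hint pre -h` and `hint cnv -h` before relying on it.

```python
import shlex


def build_pre_cmd(fastq1, fastq2, bwa_index, genome, name, outdir, pairsam_path):
    """Assemble the `hint pre` call shown in this README as an argv list."""
    return [
        "hint", "pre",
        "-d", "{},{}".format(fastq1, fastq2),
        "-i", bwa_index,
        "--informat", "fastq",
        "--outformat", "cooler",
        "-g", genome,
        "-n", name,
        "-o", outdir,
        "--pairsampath", pairsam_path,
    ]


def build_cnv_cmd(mcool, resolution, genome, name, outdir):
    """Assemble the `hint cnv` call shown in this README as an argv list."""
    return [
        "hint", "cnv",
        "-m", mcool,
        "-f", "cooler",
        "-r", str(resolution),
        "-g", genome,
        "-n", name,
        "-o", outdir,
    ]


pre_cmd = build_pre_cmd(
    "/path/to/hic_1.fastq.gz", "/path/to/hic_2.fastq.gz",
    "/path/to/bwaIndex", "hg19", "test",
    "/path/to/outputdir", "/path/to/pairsamtools",
)
# Assumption: HiNT-PRE names its matrix <name>.mcool in the output directory.
cnv_cmd = build_cnv_cmd("/path/to/outputdir/test.mcool", 50, "hg19", "test",
                        "/path/to/outputDir")

for cmd in (pre_cmd, cnv_cmd):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually run a step, pass the list to `subprocess.run(cmd, check=True)` so that a failing step stops the script.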
HiNT-TL
HiNT tl: detection of inter-chromosomal translocations and breakpoints from Hi-C inter-chromosomal interaction matrices.
$ hint tl -m /path/to/data_1Mb.cool,/path/to/data_100kb.cool -c chimericReads.pairsam -f cooler -g hg19 -n test -o /path/to/outputDir
see details and more options
$ hint tl -h
Output of HiNT
HiNT-PRE output
In the HiNT-PRE output directory, you will find
- jobname.bam: aligned lossless file in bam format
- jobname_merged_valid.pairs.gz: read pairs in pairs format
- jobname_chimeric.sorted.pairsam.gz: ambiguous chimeric read pairs used for breakpoint detection, in pairsam format
- jobname_valid.sorted.deduped.pairsam.gz: valid read pairs used for Hi-C contact matrix creation, in pairsam format
- jobname.mcool: Hi-C contact matrix in cool format
- jobname.hic: Hi-C contact matrix in hic format
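After a HiNT-PRE run, a quick way to see which of the files listed above were produced is a small existence check. This is a sketch, not a HiNT feature: the helper name `missing_pre_outputs` is hypothetical, and the file suffixes are taken from the listing above. Depending on the `--outformat` you chose, it is plausible that only one of the `.mcool` and `.hic` files is written, so a missing entry is not necessarily an error.

```python
from pathlib import Path

# Suffixes appended to the job name, as listed in this README.
PRE_SUFFIXES = [
    ".bam",
    "_merged_valid.pairs.gz",
    "_chimeric.sorted.pairsam.gz",
    "_valid.sorted.deduped.pairsam.gz",
    ".mcool",
    ".hic",
]


def missing_pre_outputs(outdir, jobname):
    """Return the expected HiNT-PRE file names not present in `outdir`."""
    outdir = Path(outdir)
    return [
        jobname + suffix
        for suffix in PRE_SUFFIXES
        if not (outdir / (jobname + suffix)).exists()
    ]


# Example with the placeholder path and job name used in Quick Start.
print(missing_pre_outputs("/path/to/outputdir", "test"))
```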
HiNT-CNV output
In the HiNT-CNV output directory, you will find
- jobname_GAMPoisson.pdf: the GAM regression result
- segmentation/jobname_bicsq_allchroms.txt: CNV segments with log2 copy ratio and p-values in txt file
- segmentation/jobname_resolution_CNV_segments.png: figure to visualize CNV segments
- segmentation/jobname_bicseq_allchroms.l2r.pdf: figure to visualize log2 copy ratio in each bin (bin size = resolution you set)
- segmentation/other_files: intermediate files used to run BIC-seq
- jobname_dataForRegression/*: data used for regression as well as residuals after removing Hi-C biases
HiNT-TL output
In the HiNT-TL output directory, you will find
- jobname_Translocation_IntegratedBP.txt: the final integrated translocation breakpoint
- jobname_chrompairs_rankProduct.txt: rank product predicted potential translocated chromosome pairs
- otherFolders: intermediate files used to identify the translocation breakpoints