HiNT -- Hi-C for copy number variation and translocation detection

Project description

HiNT

A computational method for detecting copy number variations and translocations from Hi-C data

Summary

HiNT (Hi-C for copy Number variation and Translocation detection) is a computational method to detect CNVs and translocations from Hi-C data. HiNT has three main components: HiNT-PRE, HiNT-CNV, and HiNT-TL. HiNT-PRE preprocesses Hi-C data and computes the contact matrix, which stores contact frequencies between any two genomic loci. HiNT-CNV and HiNT-TL both start from the Hi-C contact matrix and predict copy number segments and inter-chromosomal translocations, respectively.

Overview of HiNT workflow:

Installation

Dependencies

R and R packages

  1. R >= 3.4
  2. mgcv, strucchange, doParallel, Cairo, foreach

Python and Python packages

  1. python >= 3.5
  2. pypairix >= 0.3.0, cooler >= 0.7.4, pairtools >= 0.2.2, numpy, scipy, pandas, sklearn, multiprocessing

Java and related tools (Optional: required only when you want to process Hi-C data with Juicer tools)

  1. Java (version >= 1.7)
  2. Juicer tools (1.8.9 is recommended)

Perl

  1. Perl (version >= 5)

Other dependencies

  1. samtools (1.3.1+)
  2. BIC-seq2 (0.7.3). Optional: needed only if you want to run HiNT-CNV. No installation is required; just download BICseq2, unzip it, and give HiNT the path where you stored it.
  3. bwa (0.7.16+). Optional: required only when your input is in fastq format.
  4. tabix (0.2.6)
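
Before installing HiNT, it can help to see which of the command-line dependencies listed above are already on your PATH. The following is a minimal sketch, not part of HiNT: the function name `missing_tools` is hypothetical, and the executable names passed to it (for example `Rscript` and `python3`) are assumptions that may differ on your system. It does not check version numbers.

```shell
# Print the name of every listed tool that is not found on PATH.
# The executable names are assumptions; adjust them to your system.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call: list the dependencies that still need to be installed.
missing_tools Rscript python3 java perl samtools bwa tabix
```

An empty result means every named executable was found; bwa and java can be dropped from the call if you do not need the optional features they support.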

Install HiNT

  • Method1: Install from PyPI using pip.

    $ pip install HiNT-Packages

  • Method2: Install using conda (highly recommended)

    $ conda install hint

  • Method3: Install manually

    1. Install HiNT dependencies
    2. Download HiNT: $ git clone https://github.com/parklab/HiNT.git
    3. Go to the HiNT directory and install it with $ python setup.py install

*** Type $ hint to test whether HiNT was installed successfully

Download reference files used in HiNT

  1. Download HiNT references HERE. Only hg19, hg38 and mm10 are currently available. Unzip the archive: $ unzip hg19.zip
  2. Put the reference files into the HiNT directory: $ mv hg19/* where_you_put_HiNT/HiNT/HiNT/references/
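
The move step above can be wrapped in a small helper that fails early when either directory is missing. This is only a sketch: the function name `place_references` is hypothetical, it assumes the archive has already been unzipped, and the target path simply follows the `HiNT/HiNT/references/` layout quoted above.

```shell
# Move unzipped reference files into the HiNT references directory.
# Usage: place_references <unzipped_genome_dir> <where_you_put_HiNT>
place_references() {
    genome_dir=$1
    hint_root=$2
    target="$hint_root/HiNT/HiNT/references"

    if [ ! -d "$genome_dir" ]; then
        echo "reference directory not found: $genome_dir" >&2
        return 1
    fi
    if [ ! -d "$target" ]; then
        echo "HiNT references directory not found: $target" >&2
        return 1
    fi
    # Same move as the manual step, with quoting for paths containing spaces.
    mv "$genome_dir"/* "$target"/
}
```

For hg19 this would be called as `place_references hg19 where_you_put_HiNT` after `unzip hg19.zip`.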

Quick Start

  • Download the test datasets from HERE

HiNT-PRE

HiNT pre: Preprocessing Hi-C data. HiNT pre performs alignment, contact matrix creation, and normalization in a single command.

$ hint pre -d /path/to/hic_1.fastq.gz,/path/to/hic_2.fastq.gz -i /path/to/bwaIndex --informat fastq --outformat cooler -g hg19 -n test -o /path/to/outputdir --pairsampath /path/to/pairsamtools

See details and more options:

$ hint pre -h
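
When several samples need preprocessing, the command above can be generated once per sample. The sketch below only prints the commands so they can be inspected before anything is run. The function name, the `<sample>_1.fastq.gz` / `<sample>_2.fastq.gz` naming convention, and the per-sample output directories are assumptions made for illustration; the options themselves are those of the command shown above, with the same placeholder paths.

```shell
# Print one `hint pre` command per sample name, without running anything.
# Assumes (hypothetically) reads stored as <fastq_dir>/<sample>_1.fastq.gz
# and <fastq_dir>/<sample>_2.fastq.gz.
# Usage: print_hint_pre_commands <fastq_dir> <out_dir> <sample>...
print_hint_pre_commands() {
    fastq_dir=$1
    out_dir=$2
    shift 2
    for sample in "$@"; do
        printf 'hint pre -d %s/%s_1.fastq.gz,%s/%s_2.fastq.gz -i /path/to/bwaIndex --informat fastq --outformat cooler -g hg19 -n %s -o %s/%s --pairsampath /path/to/pairsamtools\n' \
            "$fastq_dir" "$sample" "$fastq_dir" "$sample" "$sample" "$out_dir" "$sample"
    done
}

print_hint_pre_commands /path/to/fastq /path/to/outputdir sampleA sampleB
```

Printing first keeps the sketch safe to try; once the commands look right, they can be pasted into a terminal or a job script.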

HiNT-CNV

HiNT cnv: prediction of copy number information, as well as segmentation from Hi-C.

$ hint cnv -m contactMatrix.mcool -f cooler -r 50 -g hg19 -n test -o /path/to/outputDir

See details and more options:

$ hint cnv -h

HiNT-TL

HiNT tl: detection of inter-chromosomal translocations and breakpoints from Hi-C inter-chromosomal interaction matrices.

$ hint tl -m /path/to/data_1Mb.cool,/path/to/data_100kb.cool -c chimericReads.pairsam -f cooler -g hg19 -n test -o /path/to/outputDir

See details and more options:

$ hint tl -h

Output of HiNT

HiNT-PRE output

In the HiNT-PRE output directory, you will find

  1. jobname.bam aligned lossless file in bam format
  2. jobname_merged_valid.pairs.gz read pairs in pairs format
  3. jobname_chimeric.sorted.pairsam.gz ambiguous chimeric read pairs used for breakpoint detection in pairsam format
  4. jobname_valid.sorted.deduped.pairsam.gz valid read pairs used for Hi-C contact matrix creation in pairsam format
  5. jobname.mcool Hi-C contact matrix in cool format
  6. jobname.hic Hi-C contact matrix in hic format
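
After a run, the list above can be turned into a quick check of the output directory. This is a sketch with hypothetical function names, built only from the file names listed above. Which files a given run actually produces may depend on the options used (for example, the `.hic` matrix may be absent when the output format is cooler; that is an assumption), so a reported file is a prompt to look, not proof of failure.

```shell
# Print the HiNT-PRE output file names listed above for a given job name.
expected_pre_outputs() {
    jobname=$1
    for suffix in .bam _merged_valid.pairs.gz _chimeric.sorted.pairsam.gz \
                  _valid.sorted.deduped.pairsam.gz .mcool .hic; do
        printf '%s%s\n' "$jobname" "$suffix"
    done
}

# Print the expected files that are absent from an output directory.
# Usage: missing_pre_outputs <out_dir> <jobname>
missing_pre_outputs() {
    out_dir=$1
    jobname=$2
    expected_pre_outputs "$jobname" | while IFS= read -r name; do
        [ -e "$out_dir/$name" ] || printf '%s\n' "$name"
    done
}
```

For the Quick Start command this would be called as `missing_pre_outputs /path/to/outputdir test`.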

HiNT-CNV output

In the HiNT-CNV output directory, you will find

  1. jobname_GAMPoisson.pdf the GAM regression result
  2. segmentation/jobname_bicsq_allchroms.txt CNV segments with log2 copy ratio and p-values in txt file
  3. segmentation/jobname_resolution_CNV_segments.png figure to visualize CNV segments
  4. segmentation/jobname_bicseq_allchroms.l2r.pdf figure to visualize the log2 copy ratio in each bin (bin size = the resolution you set)
  5. segmentation/other_files intermediate files used to run BIC-seq
  6. jobname_dataForRegression/* data used for regression as well as residuals after removing Hi-C biases

HiNT-TL output

In the HiNT-TL output directory, you will find

  1. jobname_Translocation_IntegratedBP.txt the final integrated translocation breakpoints
  2. jobname_chrompairs_rankProduct.txt potential translocated chromosome pairs predicted by rank product
  3. otherFolders intermediate files used to identify the translocation breakpoints

Download files

Source Distribution

HiNT-Package-2.0.9.tar.gz (47.7 kB view details)

Uploaded Source

Built Distribution

HiNT_Package-2.0.9-py3-none-any.whl (54.7 kB view details)

Uploaded Python 3

File details

Details for the file HiNT-Package-2.0.9.tar.gz.

File metadata

  • Download URL: HiNT-Package-2.0.9.tar.gz
  • Upload date:
  • Size: 47.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.18.4 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for HiNT-Package-2.0.9.tar.gz
Algorithm Hash digest
SHA256 aa7afcecf90d71c74801e47d6f035e1e5fb4d22f1441fba6bd1bf25073f3b0a7
MD5 fced3532b6538877026471d3a429ae42
BLAKE2b-256 1c63e50dae56e677aa1075f5d41c42930e640106a5b45d31ef96bd1c03cd9fb6

File details

Details for the file HiNT_Package-2.0.9-py3-none-any.whl.

File metadata

  • Download URL: HiNT_Package-2.0.9-py3-none-any.whl
  • Upload date:
  • Size: 54.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.18.4 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for HiNT_Package-2.0.9-py3-none-any.whl
Algorithm Hash digest
SHA256 0e8f2e8500cc18651f30b715d017a33cd8b8de831f5bfaee29d9a0226238a584
MD5 66c9fe587e60e26e285762b64df7da72
BLAKE2b-256 cebf890fc73450da7df10f5e368432b4fc9d85667b78a2e6e885d52b19cb2720
