
Extract data quickly from Juicebox .hic files via straw

Project description

Straw

Straw enables programmatic access to .hic files. A .hic file stores the contact matrices from Hi-C experiments together with the normalization and expected vectors, plus metadata in the header.

The main function, straw, takes the normalization, the filename or URL, chromosome 1 (with an optional range), chromosome 2 (with an optional range), whether the desired bins are delimited by fragment or by base pair, and the bin size.

It then reads the header, follows the pointers to the requested matrix and normalization vector, and returns the result as three parallel lists: [x, y, count].
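Three parallel lists can be awkward to look up by position. As a minimal sketch, assuming `result[0]` and `result[1]` hold bin positions and `result[2]` holds counts (the order in which the examples below print them), the hypothetical helper `triplets_to_dict` collects them into a dictionary keyed by bin pair. The helper is not part of straw, and the sample values are made up for illustration.

```python
def triplets_to_dict(result):
    """Map (x, y) -> count from three parallel lists [x, y, count]."""
    xs, ys, counts = result
    if not (len(xs) == len(ys) == len(counts)):
        raise ValueError("x, y and count lists must have the same length")
    return {(x, y): c for x, y, c in zip(xs, ys, counts)}


# Made-up sample in the same shape as a straw result (not real data).
sample = [[0, 0, 1000000], [0, 1000000, 1000000], [12.0, 3.5, 20.0]]
contacts = triplets_to_dict(sample)
print(contacts[(0, 1000000)])
```

A dictionary suits sparse lookups; a dense matrix would need the chromosome length and bin size as well.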

Usage: straw <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr1>[:x1:x2] <chr2>[:y1:y2] <BP/FRAG> <binsize>

See https://github.com/theaidenlab/straw/wiki/Python for more documentation
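The chromosome arguments in the usage line take the form `<chr>[:x1:x2]`. The sketch below builds such a string from its parts; `format_locus` is a hypothetical convenience helper introduced here for illustration, not a function provided by straw, and its range checks are an assumption about what a sensible range looks like.

```python
def format_locus(chrom, start=None, end=None):
    """Build the <chr>[:start:end] argument described in the usage line."""
    if start is None and end is None:
        # Whole chromosome: just the name.
        return str(chrom)
    if start is None or end is None:
        raise ValueError("give both start and end, or neither")
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    return "{0}:{1}:{2}".format(chrom, start, end)


print(format_locus("X"))
print(format_locus(1, 1000000, 7500000))
```

The returned strings can be passed as the chromosome arguments of `straw.straw`.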

Install

Straw uses the requests library to support URLs, so make sure it is installed.

Examples

  • Extract all reads on chromosome X at 1MB resolution with no normalization in local file "HIC001.hic"

    from straw import straw
    result = straw.straw('NONE', 'HIC001.hic', 'X', 'X', 'BP', 1000000)
    # the result is three parallel lists: x, y, counts
    for i in range(len(result[0])):
        print("{0}\t{1}\t{2}".format(result[0][i], result[1][i], result[2][i]))
    
  • Extract all reads from chromosome 4 at 500KB resolution with VC (vanilla coverage) normalization from the combined MAPQ 30 map from Rao and Huntley et al. 2014

    from straw import straw
    result = straw.straw('VC', 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined_30.hic', '4', '4', 'BP', 500000)
    # the result is three parallel lists: x, y, counts
    for i in range(len(result[0])):
        print("{0}\t{1}\t{2}".format(result[0][i], result[1][i], result[2][i]))
    
  • Extract reads between 1MB and 7.5MB on chromosome 1 at 25KB resolution with KR (balanced) normalization and write to a file:

    from straw import straw
    straw.printme("KR", "HIC001.hic", "1:1000000:7500000", "1:1000000:7500000", "BP", 25000, 'out.txt')
    
  • Extract all interchromosomal reads between chromosome 5 and chromosome 12 at 500-fragment resolution with VC (vanilla coverage) normalization:

    from straw import straw
    
    result = straw.straw("VC", "HIC001.hic", "5", "12", "FRAG", 500)
    # the result is three parallel lists: x, y, counts
    for i in range(len(result[0])):
        print("{0}\t{1}\t{2}".format(result[0][i], result[1][i], result[2][i]))
    

See the script straw.py for an example of how to print the results to a file.
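For results already held in memory, a tab-separated file can also be written by hand. This is a minimal sketch, not the code from straw.py: `write_triplets` is a hypothetical helper, the sample values are made up, and the temporary output path is chosen only so the example leaves nothing in the working directory.

```python
import os
import tempfile


def write_triplets(result, path):
    """Write parallel [x, y, count] lists as tab-separated lines.

    Returns the number of rows written.
    """
    xs, ys, counts = result
    rows = 0
    with open(path, "w") as handle:
        for x, y, count in zip(xs, ys, counts):
            handle.write("{0}\t{1}\t{2}\n".format(x, y, count))
            rows += 1
    return rows


# Made-up sample in the same shape as a straw result (not real data).
sample = [[0, 25000], [0, 25000], [7.0, 2.0]]
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
rows = write_triplets(sample, out_path)
print(rows)
```

The output uses the same x, y, count column order as the print loops in the examples above.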

Read header

See the file read_hic_header.py for a Python script that reads the header of a .hic file and outputs the information (including resolutions).
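To give a feel for what reading the header involves, here is a minimal sketch of parsing the first few header fields. It assumes the file starts with a null-terminated magic string `HIC`, a little-endian 32-bit version, a 64-bit master index offset, and a null-terminated genome identifier. That layout is an assumption to check against the .hic format specification, the function names are hypothetical, and the code is not taken from read_hic_header.py. The input is synthetic bytes, not a real file.

```python
import io
import struct


def read_null_terminated(stream):
    """Read bytes up to (not including) a null byte and decode them."""
    chars = bytearray()
    while True:
        byte = stream.read(1)
        if byte == b"":
            raise ValueError("unexpected end of data inside a string")
        if byte == b"\0":
            return chars.decode("utf-8")
        chars.extend(byte)


def read_header_start(stream):
    """Parse the assumed leading header fields of a .hic stream."""
    magic = read_null_terminated(stream)
    if magic != "HIC":
        raise ValueError("not a .hic file: bad magic string")
    (version,) = struct.unpack("<i", stream.read(4))       # 32-bit int
    (master_index,) = struct.unpack("<q", stream.read(8))  # 64-bit int
    genome = read_null_terminated(stream)
    return {"version": version, "master_index": master_index, "genome": genome}


# Synthetic bytes built to match the assumed layout (not a real file).
fake = b"HIC\0" + struct.pack("<i", 8) + struct.pack("<q", 123456) + b"hg19\0"
header = read_header_start(io.BytesIO(fake))
print(header["version"], header["genome"])
```

The same function accepts any binary file object, for example one returned by `open(path, "rb")`.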

Download files


Source Distribution

hic-straw-0.0.6.tar.gz (7.2 kB)

Uploaded Source

Built Distribution

hic_straw-0.0.6-py3-none-any.whl (8.9 kB)

Uploaded Python 3

File details

Details for the file hic-straw-0.0.6.tar.gz.

File metadata

  • Download URL: hic-straw-0.0.6.tar.gz
  • Size: 7.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for hic-straw-0.0.6.tar.gz
Algorithm Hash digest
SHA256 bc79be50b076a3820852cd0e3b0d743df719b78d49510bf43ac63241f03ab81a
MD5 05bb911adabf50a1da778a5c3bd76173
BLAKE2b-256 5812da9e47221a82011797d19832a2dbc8843b0a96c483874f17010b4921c993


File details

Details for the file hic_straw-0.0.6-py3-none-any.whl.

File metadata

  • Download URL: hic_straw-0.0.6-py3-none-any.whl
  • Size: 8.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for hic_straw-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 fd355fddfb43bb7a152125200a96a959bbe2fa3e4f5e1d41d0af8f27ebecf69e
MD5 397f002b6eda54d3a7eeb7af349cb9cb
BLAKE2b-256 a4891fc2a4e6d9b5379ac4ea0add62658d519095d22a8fef6feb32d1f6d04341
