Python library for detecting Frequently Interacting REgions (FIREs) from Hi-C data

Detect Frequently Interacting REgions (FIREs) in Python

This project is a Python port of the R package for detecting frequently interacting regions (FIREs) from Hi-C data. FIREs are described in the paper "A Compendium of Chromatin Contact Maps Reveals Spatially Active Regions in the Human Genome".

Command line usage

Install the package:

python3 -m pip install FIREcaller

Download Hi-C experiment results from the 4D Nucleome database, choosing Contact Matrix (.mcool) as the file type.

Download the mappability file from Yunjiang's website. Ensure that the genomic assembly and the bin size match your Hi-C files. Add the header line chr start end F GC M if it is missing.
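
The header check can be scripted. The following is a minimal sketch, not part of FIREcaller: the helper names add_header_if_missing and fix_mappability_file are hypothetical, and the sketch assumes the file is tab- or space-delimited and small enough to read into memory.

```python
from pathlib import Path

# Header fields named in the step above.
HEADER_FIELDS = ["chr", "start", "end", "F", "GC", "M"]


def add_header_if_missing(text):
    """Return text with the header line prepended unless it is already there."""
    first_line = text.split("\n", 1)[0]
    if first_line.split() == HEADER_FIELDS:
        return text
    # Assumption: reuse the file's own delimiter (tab if present, else a space).
    delimiter = "\t" if "\t" in first_line else " "
    return delimiter.join(HEADER_FIELDS) + "\n" + text


def fix_mappability_file(path):
    """Rewrite the file in place if the header is missing; return True if changed."""
    path = Path(path)
    original = path.read_text()
    fixed = add_header_if_missing(original)
    if fixed != original:
        path.write_text(fixed)
    return fixed != original
```

Calling fix_mappability_file("F_GC_M_Hind3_10Kb_el.GRCh38.txt") leaves a file that already has the header untouched.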

Perform FIRE calling:

FIREcaller \
  --cooler_filenames 4DNFIT5YVTLO.mcool \
  --cooler_filenames 4DNFIJTOIGOI.mcool \
  --mappability_filename F_GC_M_Hind3_10Kb_el.GRCh38.txt \
  --bin_size 10000 \
  --output_filename fires.csv

The output file lists the genomic regions together with their FIRE scores and log p-values for each Hi-C file:

chr start end F GC M 0_count_neig 1_count_neig 0_fire 1_fire 0_logpvalue 1_logpvalue
chr1 1970000 1980000 2000 0.5175 1.0000 5365 2376 0.9515 0.8702 0.5861 0.4317
chr1 2020000 2030000 3000 0.6287 0.9907 4305 2005 0.5806 0.5831 0.1128 0.1144
chr1 2060000 2070000 4000 0.4770 0.9210 4029 2171 0.6880 0.8678 0.1954 0.4277
(...)

The {n}_fire column stores the FIRE score for the n-th cooler file passed via the --cooler_filenames argument.

The {n}_logpvalue column stores the log p-value for the n-th cooler file passed via the --cooler_filenames argument.
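
Because the per-file columns share the {n}_ prefix, they can be grouped by cooler file index before further analysis. The sketch below is illustrative only: columns_by_sample is a hypothetical helper, and the column kinds are taken from the sample header shown above.

```python
import re

# Matches per-file columns such as "0_fire" or "1_logpvalue".
COLUMN_PATTERN = re.compile(r"^(\d+)_(count_neig|fire|logpvalue)$")


def columns_by_sample(columns):
    """Map each cooler file index to a {kind: column name} dictionary."""
    samples = {}
    for name in columns:
        match = COLUMN_PATTERN.match(name)
        if match:
            index, kind = int(match.group(1)), match.group(2)
            samples.setdefault(index, {})[kind] = name
    return samples


# Header copied from the sample output above.
header = ("chr start end F GC M 0_count_neig 1_count_neig "
          "0_fire 1_fire 0_logpvalue 1_logpvalue").split()
samples = columns_by_sample(header)
print(samples[1]["fire"])  # 1_fire
```

Shared columns such as chr, start and end do not match the pattern and are skipped.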

Python usage

Use the FIREcaller.calc_fires() function to perform FIRE calling from your Python program:

calc_fires(mappability_filename, cooler_filenames, bin_size, neighborhood_region, perc_threshold=.25, avg_mappability_threshold=0.9)

mappability_filename : str - Path to mappability file

cooler_filenames : list of str - Paths to Hi-C experiment results in cooler format

bin_size : int - Bin size

neighborhood_region : int - Size of neighbor region

perc_threshold : float - Maximum ratio of "bad" neighbors allowed

avg_mappability_threshold : float - Minimum mappability allowed

The function returns a pandas DataFrame.
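
A call might look like the sketch below. It assumes that calc_fires is reachable as FIREcaller.calc_fires, as the text above suggests, and that the positional argument order matches the signature shown. The wrappers as_filename_list and run_fire_calling are hypothetical helpers. The signature gives no default for neighborhood_region, so the value in the usage comment is an assumption, not a documented setting.

```python
def as_filename_list(cooler_filenames):
    """Return cooler_filenames as a list, wrapping a single path in a list."""
    if isinstance(cooler_filenames, (str, bytes)):
        return [cooler_filenames]
    return list(cooler_filenames)


def run_fire_calling(mappability_filename, cooler_filenames, bin_size,
                     neighborhood_region):
    """Call FIREcaller.calc_fires() with the documented default thresholds."""
    # Imported lazily so as_filename_list() works without the package installed.
    import FIREcaller

    return FIREcaller.calc_fires(
        mappability_filename,
        as_filename_list(cooler_filenames),
        bin_size,
        neighborhood_region,
        perc_threshold=0.25,
        avg_mappability_threshold=0.9,
    )


# Usage (file names from the command line example above; the
# neighborhood_region value of 200000 is an assumption):
# fires = run_fire_calling(
#     "F_GC_M_Hind3_10Kb_el.GRCh38.txt",
#     ["4DNFIT5YVTLO.mcool", "4DNFIJTOIGOI.mcool"],
#     bin_size=10000,
#     neighborhood_region=200000,
# )
```

Wrapping a single path in a list guards against passing a bare string where a list of paths is expected.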

Verification

To compare the results of this project with those of the FIREcaller R package, follow the steps described in the README.md file to produce the FIRE_ANALYSIS_40KB.txt file.

Now download and uncompress the Hippo_Hi-C_inputs_chr1_22.tar.gz and F_GC_M_HindIII_40KB_hg19.txt.gz files.

Convert the Hi-C files into the hippo.mcool cooler file using the scripts/build_hippo_cool.py Python script:

python scripts/build_hippo_cool.py

Run the FIRE calling:

FIREcaller \
  --cooler_filenames hippo.mcool \
  --mappability_filename F_GC_M_HindIII_40KB_hg19.txt \
  --bin_size 40000 \
  --output_filename hippo_fires.csv

Compare the FIRE_ANALYSIS_40KB.txt and hippo_fires.csv files.
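
One way to make the comparison concrete is to match bins by coordinates and report the largest score difference. The sketch below runs on toy, made-up rows rather than real output. The delimiters and the R-side column name score are assumptions, so check both files' headers before adapting it. read_scores and largest_difference are hypothetical helpers.

```python
import csv
import io


def read_scores(handle, score_column, delimiter):
    """Map (chr, start, end) to the value found in score_column."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return {
        (row["chr"], int(row["start"]), int(row["end"])): float(row[score_column])
        for row in reader
    }


def largest_difference(reference, candidate):
    """Largest absolute score difference over bins present in both tables."""
    shared = reference.keys() & candidate.keys()
    return max((abs(reference[key] - candidate[key]) for key in shared),
               default=0.0)


# Toy, made-up rows standing in for the two files.
r_output = io.StringIO(
    "chr\tstart\tend\tscore\n"
    "chr1\t0\t40000\t0.50\n"
    "chr1\t40000\t80000\t1.25\n"
)
py_output = io.StringIO(
    "chr,start,end,0_fire\n"
    "chr1,0,40000,0.50\n"
    "chr1,40000,80000,1.00\n"
)
difference = largest_difference(
    read_scores(r_output, "score", "\t"),
    read_scores(py_output, "0_fire", ","),
)
print(difference)  # 0.25
```

Keying on coordinates rather than row order keeps the comparison valid when the two tools filter out different bins.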

TODO:

  • Add option to remove the MHC regions
  • Calculate the Super FIREs
  • Convert the script into a Python package

Let me know if you need any of the above and I'll be happy to implement it. I also accept PRs and comments. Enjoy.
