Reimplementation of hicrep with added support for sparse matrices and multiple chromosomes.


hicreppy

cmdoret

This is a Python reimplementation of the hicrep algorithm with added support for sparse matrices (in .cool format).

hicrep measures similarity between Hi-C samples by computing a stratum-adjusted correlation coefficient (SCC). In this implementation, the SCC is computed separately for each chromosome, and the reported value is the average of the per-chromosome SCCs, weighted by chromosome length.
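The length-weighted average can be illustrated with a minimal sketch. The function name, the dictionaries and the numbers below are hypothetical and are not taken from hicreppy's code; whether lengths are counted in basepairs or in bins is also an assumption, although it does not change the result as long as all chromosomes use the same unit.

```python
def weighted_mean_scc(scc_by_chrom, chrom_lengths):
    """Average per-chromosome SCCs, weighting each by chromosome length."""
    # Only chromosomes that have an SCC contribute to the total weight.
    total_length = sum(chrom_lengths[chrom] for chrom in scc_by_chrom)
    if total_length == 0:
        raise ValueError("no chromosome with non-zero length")
    weighted_sum = sum(
        scc * chrom_lengths[chrom] for chrom, scc in scc_by_chrom.items()
    )
    return weighted_sum / total_length


# Made-up values, for illustration only.
scc_by_chrom = {"chr1": 0.9, "chr2": 0.8}
chrom_lengths = {"chr1": 300, "chr2": 100}
genome_scc = weighted_mean_scc(scc_by_chrom, chrom_lengths)
```

Because chr1 is three times longer than chr2 in this example, its SCC pulls the genome-wide value closer to 0.9 than a plain mean would.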

hicrep is published at:

HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Tao Yang, Feipeng Zhang, Galip Gurkan Yardimci, Ross C Hardison, William Stafford Noble, Feng Yue, Qunhua Li, 2017, Genome Research, doi: 10.1101/gr.220640.117

The original implementation, in R, can be found at https://github.com/MonkeyLB/hicrep

Installation

You can install the package using pip:

pip install --user hicreppy

Usage

To find the optimal value for smoothing parameter h, you can use the htrain subcommand:


Usage: hicreppy htrain [OPTIONS] COOL1 COOL2

  Find the optimal value for smoothing parameter h. The optimal h-value is
  printed to stdout. Run informations are printed to stderr.

Options:
  -r, --h-max INTEGER     Maximum value of the smoothing parameter h to
                          explore. All consecutive integer values from 0 to
                          this value will be tested.  [default: 10]
  -m, --max-dist INTEGER  Maximum distance at which to compute the SCC, in
                          basepairs.  [default: 100000]
  -b, --blacklist TEXT    Exclude those chromosomes in the analysis. List of
                          comma-separated chromosome names.
  -w, --whitelist TEXT    Only include those chromosomes in the analysis. List
                          of comma-separated chromosome names.
  --help                  Show this message and exit.
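If you want to call htrain from a Python script, one possible approach is to run the command with subprocess and read the h value back from stdout. This is only a sketch: the helper names are hypothetical, only flags listed in the help text above are used, and it assumes that stdout contains nothing but the integer h value.

```python
import subprocess


def htrain_command(cool1, cool2, h_max=10, max_dist=100000):
    """Build the argument list for `hicreppy htrain` (hypothetical helper)."""
    return [
        "hicreppy", "htrain",
        "--h-max", str(h_max),
        "--max-dist", str(max_dist),
        cool1, cool2,
    ]


def find_optimal_h(cool1, cool2, **options):
    """Run htrain and parse the optimal h from stdout.

    Assumes stdout holds only the integer; run information goes to stderr.
    """
    result = subprocess.run(
        htrain_command(cool1, cool2, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return int(result.stdout.strip())
```

With hicreppy installed, `find_optimal_h("a.cool", "b.cool", h_max=5)` would run the same command you would otherwise type in a shell.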

To compute the SCC between two matrices, use the scc subcommand. Pass the optimal h value obtained with htrain using the -v flag:


Usage: hicreppy scc [OPTIONS] COOL1 COOL2

  Compute the stratum-adjusted correlation coefficient for input matrices

Options:
  -v, --h-value INTEGER    Value of the smoothing parameter h to use. Should
                           be an integer value >= 0.  [default: 10]
  -m, --max-dist INTEGER   Maximum distance at which to compute the SCC, in
                           basepairs.  [default: 100000]
  -s, --subsample INTEGER  Subsample contacts from both matrices to target
                           value. Leave to 0 to disable subsampling.
                           [default: 0]
  -b, --blacklist TEXT     Exclude those chromosomes in the analysis. List of
                           comma-separated chromosome names.
  -w, --whitelist TEXT     Only include those chromosomes in the analysis.
                           List of comma-separated chromosome names.
  --help                   Show this message and exit.

When running multiple pairwise comparisons, compute the optimal h value once between two highly similar samples and reuse it for all scc commands.
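As a rough sketch of that workflow, the snippet below builds one scc command per pair of samples, all sharing the same h value. The file names, the helper name and the value h=3 are made up for illustration; in practice h would come from a single htrain run on two highly similar samples.

```python
from itertools import combinations


def scc_commands(cool_files, h_value, max_dist=100000):
    """Build one `hicreppy scc` argument list per unordered pair of files."""
    return [
        [
            "hicreppy", "scc",
            "--h-value", str(h_value),
            "--max-dist", str(max_dist),
            cool_a, cool_b,
        ]
        for cool_a, cool_b in combinations(cool_files, 2)
    ]


# Hypothetical inputs: h=3 stands in for the value printed by htrain.
samples = ["rep1.cool", "rep2.cool", "treated.cool"]
commands = scc_commands(samples, h_value=3)
```

Each element of `commands` can then be passed to `subprocess.run`, or joined with spaces and written to a shell script.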

Contributing

All contributions are welcome. We use the numpy standard for docstrings when documenting functions.

The code formatting standard we use is black, with --line-length=79 to follow PEP8 recommendations. We use pytest with the pytest-doctest and pytest-pylint plugins as our testing framework. Ideally, new functions should have associated unit tests, placed in the tests folder.
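For reference, here is a small example of what a function following these conventions could look like: a numpy-style docstring, an embedded doctest, and lines under 79 characters. The function itself is hypothetical and is not part of hicreppy.

```python
def bins_within_distance(max_dist, binsize):
    """Count the whole bins that fit within a genomic distance.

    Parameters
    ----------
    max_dist : int
        Maximum distance, in basepairs.
    binsize : int
        Matrix resolution, in basepairs per bin.

    Returns
    -------
    int
        Number of whole bins that fit within ``max_dist``.

    Examples
    --------
    >>> bins_within_distance(100000, 10000)
    10
    """
    if binsize <= 0:
        raise ValueError("binsize must be positive")
    return max_dist // binsize
```

The example under "Examples" is collected as a doctest when pytest is run with --doctest-modules.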

To test the code, you can run:

pytest --doctest-modules --pylint --pylint-error-types=EF --pylint-rcfile=.pylintrc hicreppy tests


Release history

0.1.0 (this version): Apr 25, 2022
0.0.6: May 20, 2020
0.0.5: Dec 12, 2019
0.0.3: Dec 3, 2019
0.0.2: Dec 3, 2019
0.0.1: Dec 3, 2019

