cooler

Sparse binary format for genomic interaction matrices

# Cooler

[![Build Status](https://travis-ci.org/mirnylab/cooler.svg?branch=master)](https://travis-ci.org/mirnylab/cooler)
[![Documentation Status](https://readthedocs.org/projects/cooler/badge/?version=latest)](http://cooler.readthedocs.org/en/latest/)
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square)](http://bioconda.github.io/recipes/cooler/README.html)
[![Binder](http://mybinder.org/badge.svg)](https://github.com/mirnylab/cooler-binder)
[![Join the chat at https://gitter.im/mirnylab/cooler](https://badges.gitter.im/mirnylab/cooler.svg)](https://gitter.im/mirnylab/cooler?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![DOI](https://zenodo.org/badge/49553222.svg)](https://zenodo.org/badge/latestdoi/49553222)

## A cool place to store your Hi-C

Cooler is a support library for a **sparse, compressed, binary** persistent storage format, called _cool_, used to store genomic interaction data, such as Hi-C contact matrices.

The _cool_ file format is a reference implementation of a genomic matrix data model using [HDF5](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) as the container format.

The `cooler` package aims to provide the following functionality:

- Build contact matrices at any resolution from a [list of contacts](https://github.com/4dn-dcic/pairix).
- Query a contact matrix.
- Export and visualize the data.
- Perform efficient out-of-core operations, such as aggregation and contact matrix normalization (a.k.a. balancing).
- Provide a clean and well-documented Python API to facilitate working with potentially larger-than-memory data.
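
As a rough illustration of the first item, the sketch below aggregates a list of contacts into sparse pixel counts at a chosen bin size. It is not cooler's implementation: the function `bin_contacts`, the `chrom_offsets` mapping, and the toy contacts are all hypothetical names and data made up for this example.

```python
from collections import Counter


def bin_contacts(contacts, chrom_offsets, binsize):
    """Count contacts per (bin1, bin2) pixel, keeping only the upper triangle.

    contacts: iterable of (chrom1, pos1, chrom2, pos2) tuples.
    chrom_offsets: dict mapping a chromosome name to the genome-wide
        index of its first bin (an assumption of this sketch).
    """
    pixels = Counter()
    for chrom1, pos1, chrom2, pos2 in contacts:
        b1 = chrom_offsets[chrom1] + pos1 // binsize
        b2 = chrom_offsets[chrom2] + pos2 // binsize
        if b1 > b2:
            # Store each symmetric pair once, with bin1 <= bin2.
            b1, b2 = b2, b1
        pixels[(b1, b2)] += 1
    return dict(sorted(pixels.items()))


# Hypothetical toy input: chr1 occupies bins 0..4, chr2 starts at bin 5.
chrom_offsets = {"chr1": 0, "chr2": 5}
contacts = [
    ("chr1", 1200, "chr1", 25000),
    ("chr1", 27500, "chr1", 3000),  # same pixel as the first contact
    ("chr1", 41000, "chr2", 8000),
]
pixels = bin_contacts(contacts, chrom_offsets, binsize=10000)
print(pixels)
```

Only nonzero pixels are kept, so memory use scales with the number of distinct contacts rather than with the square of the number of bins.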

To get started:

- Read the [documentation](http://cooler.readthedocs.org/en/latest/).
- See the Jupyter Notebook [walkthrough](https://github.com/mirnylab/cooler-binder).
- _cool_ files from published Hi-C data sets are available at `ftp://cooler.csail.mit.edu/coolers`.

Related projects:

- Process Hi-C data with [distiller](https://github.com/mirnylab/distiller).
- Downstream analysis with [cooltools](https://github.com/mirnylab/cooltools) (WIP).
- Visualize your Cooler data with [HiGlass](http://higlass.io)!

### Installation

Requirements:

- Python 2.7/3.4+
- libhdf5 and Python packages `numpy`, `scipy`, `pandas`, `h5py`. We highly recommend using the `conda` package manager to install scientific packages like these. To get it, you can either install the full [Anaconda](https://www.continuum.io/downloads) Python distribution or just the standalone [conda](http://conda.pydata.org/miniconda.html) package manager.

Install from PyPI using pip.
```sh
$ pip install cooler
```

If you are using `conda`, you can alternatively install `cooler` from the [bioconda](https://bioconda.github.io/index.html) channel.
```sh
$ conda install -c conda-forge -c bioconda cooler
```

See the [docs](http://cooler.readthedocs.org/en/latest/) for more information.

### Command line interface

The `cooler` package includes command line tools for creating, querying and manipulating _cool_ files.

```bash
$ cooler makebins $CHROMSIZES_FILE $BINSIZE > bins.10kb.bed
$ cooler cload bins.10kb.bed $CONTACTS_FILE out.cool
$ cooler balance -p 10 out.cool
$ cooler dump -b -t pixels --header --join -r chr3:10,000,000-12,000,000 -r2 chr17 out.cool | head
```

```
chrom1 start1 end1 chrom2 start2 end2 count balanced
chr3 10000000 10010000 chr17 0 10000 1 0.810766
chr3 10000000 10010000 chr17 520000 530000 1 1.2055
chr3 10000000 10010000 chr17 640000 650000 1 0.587372
chr3 10000000 10010000 chr17 900000 910000 1 1.02558
chr3 10000000 10010000 chr17 1030000 1040000 1 0.718195
chr3 10000000 10010000 chr17 1320000 1330000 1 0.803212
chr3 10000000 10010000 chr17 1500000 1510000 1 0.925146
chr3 10000000 10010000 chr17 1750000 1760000 1 0.950326
chr3 10000000 10010000 chr17 1800000 1810000 1 0.745982
```
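
To make the first step of the pipeline above more concrete, here is a minimal sketch of fixed-width binning of a chromosome-sizes table in plain Python. It only approximates what `cooler makebins` is understood to produce (BED-like `chrom`, `start`, `end` rows with the last bin of each chromosome truncated); the function name `make_bins` and the toy chromosome sizes are assumptions for illustration.

```python
def make_bins(chromsizes, binsize):
    """Yield (chrom, start, end) for fixed-width bins.

    The last bin of each chromosome is clipped to the chromosome length.
    """
    for chrom, length in chromsizes:
        for start in range(0, length, binsize):
            yield chrom, start, min(start + binsize, length)


# Hypothetical chromosome sizes, not taken from a real assembly.
chromsizes = [("chrA", 25000), ("chrB", 10000)]
bins = list(make_bins(chromsizes, 10000))
for chrom, start, end in bins:
    print("{}\t{}\t{}".format(chrom, start, end))
```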

See also:

- [CLI Reference](http://cooler.readthedocs.io/en/latest/cli.html).
- Jupyter Notebook [walkthrough](https://github.com/mirnylab/cooler-binder/blob/master/cooler_cli.ipynb).

### Python API

The `cooler` library provides a thin wrapper over the excellent [h5py](http://docs.h5py.org/en/latest/) Python interface to HDF5. It supports creation of cooler files and the following types of **range queries** on the data:

- Tabular selections are retrieved as Pandas DataFrames and Series.
- Matrix selections are retrieved as NumPy arrays or SciPy sparse matrices.
- Metadata is retrieved as a json-serializable Python dictionary.
- Range queries can be supplied using either integer bin indexes or genomic coordinate intervals. Note that a query whose coordinate interval is not aligned to multiples of the bin size returns the shortest range of bins that fully contains the half-open interval [start, end).
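
The snapping rule in the last item can be sketched with integer arithmetic. This is an illustration of the rule as stated, not code from the library: `covering_bins` is a hypothetical helper, bin indexes are relative to the start of a single chromosome, and clipping at the chromosome end is ignored.

```python
def covering_bins(start, end, binsize):
    """Return (lo, hi) so that bins lo..hi-1 cover the interval [start, end)."""
    lo = start // binsize        # bin that contains `start`
    hi = -(-end // binsize)      # ceiling division: first bin at or after `end`
    return lo, hi


binsize = 10000
lo, hi = covering_bins(10003000, 10017000, binsize)
# Genomic span actually returned by such a query.
span = (lo * binsize, hi * binsize)
print(lo, hi, span)
```

An interval that is already aligned to bin edges maps onto itself, so aligned queries return exactly the requested range.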

```python
>>> import cooler
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> c = cooler.Cooler('bigDataset.cool')
>>> resolution = c.info['bin-size']
>>> # Fetch the balanced contact matrix for a 5 Mb region of chr5
>>> mat = c.matrix(balance=True).fetch('chr5:10,000,000-15,000,000')
>>> # Plot on a log scale
>>> plt.matshow(np.log10(mat), cmap='YlOrRd')
```

```python
>>> import multiprocessing as mp
>>> import h5py
>>> import cooler
>>> # Balance the matrix in parallel using a pool of 8 worker processes
>>> pool = mp.Pool(8)
>>> f = h5py.File('bigDataset.cool', 'r')
>>> weights, stats = cooler.ice.iterative_correction(f, map=pool.map, ignore_diags=3, min_nnz=10)
```
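
For intuition about what balancing computes, the following is a simplified, in-memory sketch of iterative correction on a small dense symmetric matrix. It is not the algorithm as implemented in `cooler.ice`: it skips bin filtering, diagonal masking, and chunked out-of-core processing, and it assumes every row has a nonzero sum. The function name `balance` and the toy matrix are hypothetical.

```python
def balance(matrix, tol=1e-9, max_iters=1000):
    """Return weights w so that w[i] * matrix[i][j] * w[j] has near-equal row sums.

    Simplified sketch: assumes a symmetric matrix with no all-zero rows.
    """
    n = len(matrix)
    weights = [1.0] * n
    for _ in range(max_iters):
        # Row sums (marginals) of the currently corrected matrix.
        sums = [
            weights[i] * sum(matrix[i][j] * weights[j] for j in range(n))
            for i in range(n)
        ]
        mean = sum(sums) / n
        scaled = [s / mean for s in sums]
        if max(abs(s - 1.0) for s in scaled) < tol:
            break
        # Divide each weight by its normalized marginal.
        weights = [w / s for w, s in zip(weights, scaled)]
    return weights


# Hypothetical toy counts.
counts = [[4, 2, 1], [2, 3, 2], [1, 2, 6]]
weights = balance(counts)
balanced = [
    [weights[i] * counts[i][j] * weights[j] for j in range(3)]
    for i in range(3)
]
row_sums = [sum(row) for row in balanced]
print(row_sums)
```

After convergence all row sums agree to within the tolerance; the common value is arbitrary here, whereas a production implementation would typically rescale it.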

See also:

- [API Reference](http://cooler.readthedocs.io/en/latest/api.html).
- Jupyter Notebook [walkthrough](https://github.com/mirnylab/cooler-binder/blob/master/cooler_api.ipynb).

### Schema

The _cool_ format implements a simple [data model](http://cooler.readthedocs.io/en/latest/datamodel.html) that stores a genomic matrix in a sparse representation. Sparse storage is crucial for developing robust tools, including streaming and [out-of-core](https://en.wikipedia.org/wiki/Out-of-core_algorithm) algorithms, for use on increasingly high-resolution Hi-C data sets.

The data tables in a _cool_ file are stored in a **columnar** representation as HDF5 groups of 1D array datasets of equal length. The contact matrix itself is stored as a single table containing only the **nonzero upper triangle** pixels.
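
The layout described above can be mimicked in plain Python: one list per column, all of equal length, holding only the nonzero upper-triangle pixels. This is a toy sketch rather than the on-disk format, and the column names `bin1_id`, `bin2_id`, and `count` are used here as assumptions; consult the linked data model for the authoritative schema.

```python
def to_pixel_table(dense):
    """Convert a symmetric dense matrix to columnar upper-triangle pixels."""
    table = {"bin1_id": [], "bin2_id": [], "count": []}
    n = len(dense)
    for i in range(n):
        for j in range(i, n):  # upper triangle, diagonal included
            if dense[i][j] != 0:
                table["bin1_id"].append(i)
                table["bin2_id"].append(j)
                table["count"].append(dense[i][j])
    return table


def to_dense(table, n):
    """Rebuild the full symmetric matrix from the pixel table."""
    dense = [[0] * n for _ in range(n)]
    for i, j, c in zip(table["bin1_id"], table["bin2_id"], table["count"]):
        dense[i][j] = c
        dense[j][i] = c
    return dense


# Hypothetical 3-bin symmetric matrix.
dense = [
    [5, 0, 2],
    [0, 0, 1],
    [2, 1, 3],
]
pixels = to_pixel_table(dense)
print(pixels)
```

Because the matrix is symmetric, the lower triangle carries no extra information and can be rebuilt on demand, as `to_dense` does.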

### Contributing

[Pull requests](https://akrabat.com/the-beginners-guide-to-contributing-to-a-github-project/) are welcome. The current requirements for testing are `nose` and `mock`.

For development, clone and install in "editable" (i.e. development) mode with the `-e` option. This way you can also pull changes on the fly.
```sh
$ git clone https://github.com/mirnylab/cooler.git
$ cd cooler
$ pip install -e .
```

### License

BSD (New)

Release history

| Version | Release date |
| --- | --- |
| 0.10.4 | Jul 21, 2025 |
| 0.10.3 | Dec 21, 2024 |
| 0.10.2 | Jun 17, 2024 |
| 0.10.1 | Jun 17, 2024 |
| 0.10.0 | May 21, 2024 |
| 0.9.3 | Sep 11, 2023 |
| 0.9.2 | Jun 1, 2023 |
| 0.9.1 | Jan 23, 2023 |
| 0.9.0 | Jan 19, 2023 |
| 0.8.11 | Apr 1, 2021 |
| 0.8.10 | Sep 25, 2020 |
| 0.8.9 | Jul 18, 2020 |
| 0.8.8 | Jun 24, 2020 |
| 0.8.7 | Jan 13, 2020 |
| 0.8.6.post0 | Aug 13, 2019 |
| 0.8.6 | Aug 13, 2019 |
| 0.8.5 | Apr 8, 2019 |
| 0.8.4 | Apr 5, 2019 |
| 0.8.3 | Feb 11, 2019 |
| 0.8.2 | Jan 20, 2019 |
| 0.8.1 | Jan 3, 2019 |
| 0.8.0 | Dec 31, 2018 |
| 0.7.11 (this version) | Aug 17, 2018 |
| 0.7.10 | May 7, 2018 |
| 0.7.9 | Mar 30, 2018 |
| 0.7.8 | Mar 18, 2018 |
| 0.7.7 | Mar 16, 2018 |
| 0.7.6 | Oct 31, 2017 |
| 0.7.5 | Jul 13, 2017 |
| 0.7.4 | May 25, 2017 |
| 0.7.3 | May 23, 2017 |
| 0.7.2 | May 9, 2017 |
| 0.7.1 | Apr 29, 2017 |
| 0.7.0 | Apr 27, 2017 |
| 0.6.6 | Mar 22, 2017 |
| 0.6.5 | Mar 18, 2017 |
| 0.6.4 | Mar 17, 2017 |
| 0.6.3 | Feb 22, 2017 |
| 0.6.2 | Feb 12, 2017 |
| 0.6.1 | Feb 6, 2017 |
| 0.6.0 | Feb 4, 2017 |
| 0.5.3 | Sep 11, 2016 |
| 0.5.2 | Aug 26, 2016 |
| 0.5.1 | Aug 24, 2016 |
| 0.5.0 | Aug 24, 2016 |
| 0.4.1 | Aug 24, 2016 |
| 0.4.0 | Aug 19, 2016 |
| 0.3.0 | Feb 18, 2016 |
| 0.2.1 | Feb 7, 2016 |
| 0.2 | Jan 18, 2016 |
Download files

| File | Distribution | Size | Upload date | Tags |
| --- | --- | --- | --- | --- |
| cooler-0.7.11.tar.gz | Source | 55.4 MB | Aug 17, 2018 | Source |
| cooler-0.7.11-py2.py3-none-any.whl | Built (wheel) | 90.0 kB | Aug 17, 2018 | Python 2, Python 3 |

Both files were uploaded without Trusted Publishing, via: twine/1.11.0 pkginfo/1.4.2 requests/2.18.4 setuptools/38.4.0 requests-toolbelt/0.8.0 tqdm/4.24.0 CPython/3.6.4

File hashes

| File | Algorithm | Hash digest |
| --- | --- | --- |
| cooler-0.7.11.tar.gz | SHA256 | `1dd61256c43f9704610189447ed01404206103c0e0975fc25c1a1bd56db76522` |
| cooler-0.7.11.tar.gz | MD5 | `8ad6a9a680d62136d52da61d8a692bc3` |
| cooler-0.7.11.tar.gz | BLAKE2b-256 | `be6d67706a0ecb2e39620df1ec4049bb3e79e32093977846ccc25968f129dd60` |
| cooler-0.7.11-py2.py3-none-any.whl | SHA256 | `8c511607f8a4adc415aa50eb9bf3e929e593b7adc31edb2cdbcdcd548b33877a` |
| cooler-0.7.11-py2.py3-none-any.whl | MD5 | `799a777b3d6bb7b1f576d4a41793a232` |
| cooler-0.7.11-py2.py3-none-any.whl | BLAKE2b-256 | `a3df536701ee4b39b5e8fe648d49635a7935242d8506de7badb87434952a80bf` |