
Python package wrapping ENCODE epigenomic data for a number of reference cell lines.

Project description


Python package wrapping ENCODE epigenomic data for several reference cell lines.

How do I install this package?

As usual, just install it using pip:

pip install epigenomic_dataset

Tests Coverage

Since coverage tools sometimes report slightly different results, here are three of them:

Coveralls Coverage, SonarCloud Coverage, Code Climate Coverage

Preprocessed data for cis-regulatory regions

We have already downloaded the data and computed the mean and max for each promoter and enhancer region for the cell lines A549, GM12878, H1, HEK293, HepG2, K562 and MCF7, taking into consideration all the targets listed in the complete table of epigenomes.
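As a rough illustration of what "mean and max for each region" means, here is a minimal sketch in plain Python. It is not the package's implementation: the signal values, the regions and the helper name `summarize_regions` are all hypothetical, and the regions are assumed to use BED-style coordinates (0-based start, exclusive end).

```python
def summarize_regions(signal, regions):
    """Return a (mean, max) pair of `signal` over each half-open [start, end) region."""
    summaries = []
    for start, end in regions:
        window = signal[start:end]
        summaries.append((sum(window) / len(window), max(window)))
    return summaries


# Synthetic per-base signal standing in for one epigenomic track (hypothetical values).
signal = [0.0, 1.0, 3.0, 2.0, 0.0, 4.0, 4.0, 1.0]

# Two hypothetical regions in BED-style coordinates.
regions = [(1, 4), (4, 8)]

region_summaries = summarize_regions(signal, regions)
print(region_summaries)  # [(2.0, 3.0), (2.25, 4.0)]
```

Each region therefore contributes two numbers per track, one for the mean and one for the max.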

The promoters and enhancers considered are taken from FANTOM and can be downloaded using the crr_labels pipeline.

The specific bed files and labels used can be found here for the promoters and here for the enhancers.

Cell line        Promoters epigenomes   Enhancers epigenomes   Features number
A549             Download               Download               54
GM12878          Download               Download               110
H1               Download               Download               43
HEK293           Download               Download               207
HepG2            Download               Download               209
K562             Download               Download               320
MCF7             Download               Download               101
All cell lines   Download               Download               1044
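The feature count reported for all cell lines equals the sum of the per-cell-line counts listed above. The short check below reproduces that arithmetic. Reading this as the combined dataset placing the per-cell-line features side by side is an assumption, not something the table states.

```python
# Feature counts per cell line, copied from the table above.
feature_counts = {
    "A549": 54,
    "GM12878": 110,
    "H1": 43,
    "HEK293": 207,
    "HepG2": 209,
    "K562": 320,
    "MCF7": 101,
}

total_features = sum(feature_counts.values())
print(total_features)  # 1044
```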

Pipeline

The raw data considered come from this query to the ENCODE project.

You can find the complete table of the available epigenomes here. These datasets were selected to have (at the time of writing, 07/02/2020) the fewest known problems, such as low read resolution.

To run the pipeline, suppose you want to extract the epigenomic features for the cell lines HepG2 and H1:

from epigenomic_dataset import build

build(
    bed_path="path/to/my/bed/file.bed",
    cell_lines=["HepG2", "H1"]
)
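The `bed_path` argument points at a BED file describing the regions of interest. The sketch below writes a minimal three-column BED file (chromosome, 0-based start, exclusive end) using only the standard library. The regions and the helper name `write_bed` are hypothetical, and whether `build` expects additional columns or a header row is not stated here, so treat the layout as an assumption to check against the package.

```python
import csv
import tempfile
from pathlib import Path


def write_bed(regions, bed_path):
    """Write (chrom, start, end) tuples as a tab-separated, header-less BED file."""
    with open(bed_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(regions)


# Hypothetical regions; BED uses 0-based starts and exclusive ends.
example_regions = [
    ("chr1", 1000, 1256),
    ("chr2", 5000, 5256),
]

# Write into a temporary directory so the example leaves no files behind in the project.
bed_path = Path(tempfile.mkdtemp()) / "regions.bed"
write_bed(example_regions, bed_path)

bed_text = bed_path.read_text()
print(bed_text)
```

The resulting path can then be passed as `bed_path` in the calls shown in this section.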

If you want to specify where to store the files, use:

from epigenomic_dataset import build

build(
    bed_path="path/to/my/bed/file.bed",
    cell_lines=["HepG2", "H1"],
    path="path/to/my/target"
)

By default, the downloaded bigWig files are not deleted. You can choose to delete the files as follows:

from epigenomic_dataset import build

build(
    bed_path="path/to/my/bed/file.bed",
    cell_lines=["HepG2", "H1"],
    path="path/to/my/target",
    clear_download=True
)
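The three calls above differ only in which optional arguments are passed. One way to keep that in a single place is a small helper that assembles the keyword arguments. The sketch below uses only the parameter names shown above (`bed_path`, `cell_lines`, `path`, `clear_download`). The helper `build_arguments` is hypothetical and not part of the package, and it does not call `build` itself.

```python
def build_arguments(bed_path, cell_lines, path=None, clear_download=False):
    """Collect keyword arguments for `build`, using only the parameters shown above."""
    if not cell_lines:
        raise ValueError("Provide at least one cell line.")
    arguments = {"bed_path": bed_path, "cell_lines": list(cell_lines)}
    # Optional arguments are only included when set, so the defaults of `build` apply otherwise.
    if path is not None:
        arguments["path"] = path
    if clear_download:
        arguments["clear_download"] = True
    return arguments


arguments = build_arguments(
    "path/to/my/bed/file.bed",
    ["HepG2", "H1"],
    path="path/to/my/target",
    clear_download=True,
)
print(arguments)

# With the package installed, the call would then be:
# from epigenomic_dataset import build
# build(**arguments)
```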
