ML model for predicting ChIP-seq peaks in new cell types from ENCODE cell lines

Epitome

Full pipeline for learning TFBS from epigenetic datasets.

[Figure: Epitome diagram]

Epitome uses chromatin accessibility data to predict transcription factor binding sites (TFBS) in a novel cell type of interest. It computes the chromatin similarity between 11 ENCODE cell types and the novel cell type, then uses that similarity to transfer binding information from the known cell types to the novel one.
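To make the idea of similarity-based transfer concrete, here is a toy sketch. It is not Epitome's implementation (the package trains a model, as shown under "Training a Model"); the function names, the agreement-based similarity measure, and the example data are all assumptions chosen for illustration.

```python
# Illustrative sketch only: NOT Epitome's actual similarity computation.
# All names and the similarity measure are hypothetical.
from typing import Dict, List


def agreement(a: List[int], b: List[int]) -> float:
    """Fraction of regions where two binary accessibility vectors agree."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def transfer_binding(
    novel_access: List[int],
    ref_access: Dict[str, List[int]],
    ref_binding: Dict[str, List[int]],
) -> List[float]:
    """Similarity-weighted average of reference binding labels, per region."""
    # Weight each reference cell type by its similarity to the novel one.
    weights = {ct: agreement(novel_access, acc) for ct, acc in ref_access.items()}
    total = sum(weights.values())
    n_regions = len(novel_access)
    if total == 0:
        return [0.0] * n_regions
    return [
        sum(weights[ct] * ref_binding[ct][i] for ct in ref_access) / total
        for i in range(n_regions)
    ]


# Made-up example: two reference cell types, four genomic regions.
novel = [1, 0, 1, 1]
ref_access = {"A": [1, 0, 1, 1], "B": [0, 1, 0, 0]}
ref_binding = {"A": [1, 0, 0, 1], "B": [0, 1, 1, 0]}
scores = transfer_binding(novel, ref_access, ref_binding)
```

In this example, cell type A matches the novel cell type's accessibility exactly and B never does, so the transferred scores follow A's binding labels.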

Requirements:

Setup:

  1. Create and activate a conda environment:
    conda create --name EpitomeEnv python=3.6
    source activate EpitomeEnv
  2. Install Epitome in editable mode from the repository root:
    pip install -e .

Note: Epitome is configured for TensorFlow 1.12 with CUDA 9. If you have a different version of CUDA, update the tensorflow-gpu version accordingly.

To check your CUDA version:

nvcc --version
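The same check can be scripted. The sketch below is not part of Epitome; the helper names are hypothetical, and it assumes that nvcc prints a line containing "release X.Y", which may not hold for every toolkit version.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'X.Y' release number from `nvcc --version` output, if present."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def detect_cuda_release() -> Optional[str]:
    """Return the CUDA release reported by nvcc, or None if nvcc is not on PATH."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    print("Detected CUDA release:", detect_cuda_release())
```

If the detected release is not 9.x, that is the cue to change the tensorflow-gpu version as described in the note above.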

Configuring data

To download and format training data, run the bin/get_deepsea_data.py script. The --output_path argument is required; running the script without it prints the usage message:

python bin/get_deepsea_data.py 
usage: get_deepsea_data.py [-h] --output_path OUTPUT_PATH
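If you prefer to drive the download from Python, a small wrapper along these lines could work. It is a sketch, not part of Epitome: the helper name is hypothetical, and it assumes the script is run from the repository root with only the --output_path flag shown in the usage message.

```python
import subprocess
import sys
from typing import List


def build_download_command(
    output_path: str, script: str = "bin/get_deepsea_data.py"
) -> List[str]:
    """Build the command line for the data script (hypothetical helper)."""
    # Use the current interpreter so the active conda environment is respected.
    return [sys.executable, script, "--output_path", output_path]


def download_data(output_path: str) -> None:
    """Run the data script and raise CalledProcessError if it fails."""
    subprocess.run(build_download_command(output_path), check=True)
```

Calling download_data("data") would then be equivalent to running python bin/get_deepsea_data.py --output_path data from the shell; "data" is only an example path.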

Training a Model

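    from epitome.models import *

    # data, test_celltypes, matrix, assaymap and cellmap must already be
    # defined; they are not constructed in this snippet.
    model = VLP(data,
                test_celltypes,
                matrix,
                assaymap,
                cellmap,
                shuffle_size=2,
                batch_size=64)
    model.train(10000)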

Download files

Source Distribution

epitome-0.0.1a0.tar.gz (27.4 kB)

Built Distribution

epitome-0.0.1a0-py3.7.egg (62.4 kB)
