ML model for predicting ChIP-seq peaks in new cell types from ENCODE cell lines
Epitome
Full pipeline for learning transcription factor binding sites (TFBS) from epigenetic datasets.
Epitome leverages chromatin accessibility data to predict transcription factor binding sites in a novel cell type of interest. It computes the chromatin similarity between 11 ENCODE cell types and the novel cell type, then uses that similarity to transfer binding information from the known cell types to the novel one.
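The similarity-based transfer described above can be pictured with a toy example. This is only an illustrative sketch, not Epitome's actual model: the Jaccard similarity measure, the weighted-average transfer rule, the function names, and the data are all assumptions made for the example.

```python
def jaccard(a, b):
    """Jaccard similarity between two binary accessibility vectors."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def transfer_binding(known_access, known_binding, novel_access):
    """Similarity-weighted average of binding labels from known cell types.

    Hypothetical transfer rule: each known cell type votes for binding in a
    region, weighted by how similar its accessibility is to the novel cell type.
    """
    weights = [jaccard(row, novel_access) for row in known_access]
    total = sum(weights)
    n_regions = len(novel_access)
    if total == 0:
        return [0.0] * n_regions
    return [
        sum(w * labels[i] for w, labels in zip(weights, known_binding)) / total
        for i in range(n_regions)
    ]


# Toy data (made up): 3 known cell types x 4 genomic regions.
known_access = [[1, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 0, 1, 1]]
known_binding = [[1, 0, 0, 0],
                 [1, 0, 1, 0],
                 [0, 0, 0, 1]]
novel_access = [1, 1, 0, 0]

scores = transfer_binding(known_access, known_binding, novel_access)
print(scores)
```

In this toy setup, the known cell type whose accessibility profile matches the novel one contributes most to the predicted binding scores, while a cell type with no shared accessible regions contributes nothing.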
Requirements:
- conda
- python >= 3.6
Setup:
- Create and activate a conda environment:
conda create --name EpitomeEnv python=3.6
source activate EpitomeEnv
- Install Epitome in editable mode:
pip install -e .
Note: Epitome is configured for TensorFlow 1.12 and CUDA 9. If you have a different version of CUDA, update the tensorflow-gpu version accordingly.
To check your CUDA version:
nvcc --version
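If you would rather do this check from Python, for example inside a setup script, the sketch below runs nvcc and extracts the release number. The helper names are hypothetical, and the parsing assumes the usual "release X.Y" wording in nvcc's output.

```python
import re
import shutil
import subprocess


def parse_cuda_release(nvcc_output):
    """Return the CUDA release (e.g. "9.0") found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc --version` if nvcc is on PATH and return the release string, else None."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found or version not recognized")
    else:
        print("CUDA release:", release)
```

If the reported release is not 9.x, adjust the tensorflow-gpu version as described in the note above.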
Configuring data
To download and format training data, run the bin/get_deepsea_data.py script, passing the required --output_path argument:
python bin/get_deepsea_data.py --output_path OUTPUT_PATH
usage: get_deepsea_data.py [-h] --output_path OUTPUT_PATH
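The usage line means that --output_path is mandatory, since it is not shown in square brackets. As an illustration only, and not the script's actual source, a standard argparse parser with one required option produces the same usage string. The help text and the example path "data/" are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser that mirrors the usage line of get_deepsea_data.py."""
    parser = argparse.ArgumentParser(prog="get_deepsea_data.py")
    parser.add_argument(
        "--output_path",
        required=True,  # required options appear without brackets in the usage line
        help="directory to write the formatted data to (assumed meaning)",
    )
    return parser


# Parse an example command line; "data/" is a placeholder path.
args = build_parser().parse_args(["--output_path", "data/"])
print(args.output_path)
print(build_parser().format_usage().strip())
```

Running a parser like this without --output_path exits with an error that names the missing argument.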
Training a Model
from epitome.models import *

# data, test_celltypes, matrix, assaymap, and cellmap must already be defined.
model = VLP(data,
            test_celltypes,
            matrix,
            assaymap,
            cellmap,
            shuffle_size=2,
            batch_size=64)
model.train(10000)