A package for training and interpreting an ensemble of neural networks for chromatin accessibility

deepaccess-package

This package contains the code for training and interpreting an ensemble of convolutional neural networks for multi-task classification. Instructions for downloading and getting started with the current release are available at https://cgs.csail.mit.edu/deepaccess-package/. deepaccess can be installed via pip or bioconda. The DeepAccess model trained on ATAC-seq data from 10 mouse cell types is available as a Zenodo record.

Dependencies

To run DeepAccess on genomic regions (bed file input), you must install bedtools and add it to your PATH. Prebuilt bedtools binaries are distributed by the bedtools project.

After installation, add bedtools to your PATH, either in the terminal for the current session or permanently by adding this line to your ~/.bashrc:

export PATH="/path/to/bedtools:$PATH"
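
To confirm that the PATH change took effect, a quick check along the following lines may help. This is a minimal sketch using only the standard `command -v` shell builtin; the variable name `bedtools_status` is illustrative and not part of DeepAccess.

```shell
# Report whether a bedtools executable can be found on the current PATH.
if command -v bedtools >/dev/null 2>&1; then
    bedtools_status="found at $(command -v bedtools)"
else
    bedtools_status="not found; re-check the export PATH line"
fi
echo "bedtools: $bedtools_status"
```

If bedtools is reported as not found after editing ~/.bashrc, open a new terminal or run `. ~/.bashrc` so the change is picked up.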

Installation

deepaccess is available on the Python Package Index (PyPI) and can be installed with pip:

pip install deepaccess

and via bioconda:

conda install -c bioconda deepaccess

Training

To train a DeepAccess model for a new task:

usage: deepaccess train [-h] -l LABELS [LABELS ...]
                        -out OUT [-ref REFFASTA]
                        [-g GENOME] [-beds BEDFILES [BEDFILES ...]]
                        [-fa FASTA] [-fasta_labels FASTA_LABELS]
                        [-f FRAC_RANDOM] [-nepochs NEPOCHS]
                        [-ho HOLDOUT] [-seed SEED] [-verbose]

optional arguments:
  -h, --help            show this help message and exit
  -l LABELS [LABELS ...], --labels LABELS [LABELS ...]
  -out OUT, --out OUT
  -ref REFFASTA, --refFasta REFFASTA
  -g GENOME, --genome GENOME
                        genome chrom.sizes file
  -beds BEDFILES [BEDFILES ...], --bedfiles BEDFILES [BEDFILES ...]
  -fa FASTA, --fasta FASTA
  -fasta_labels FASTA_LABELS, --fasta_labels FASTA_LABELS
  -f FRAC_RANDOM, --frac_random FRAC_RANDOM
  -nepochs NEPOCHS, --nepochs NEPOCHS
  -ho HOLDOUT, --holdout HOLDOUT
                        chromosome to holdout
  -seed SEED, --seed SEED
  -verbose, --verbose   Print training progress
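
As an illustration, a training run on bed file input might be assembled as in the sketch below. The values are the placeholder examples from the Arguments table (C1.bed, mm10.fa, default/mm10.chrom.sizes, myoutput), not files that come with this README, and the particular combination of optional flags is an assumption. The command is executed only when deepaccess is installed and the first bed file exists; otherwise it is just printed.

```shell
# Build the argument list with `set --` so each value stays a separate word.
# All file names are placeholders taken from the Arguments table.
set -- deepaccess train \
    -l C1 C2 C3 \
    -beds C1.bed C2.bed C3.bed \
    -ref mm10.fa \
    -g default/mm10.chrom.sizes \
    -out myoutput \
    -f 0.1 -nepochs 1 -ho chr19 -seed 2021 -verbose

# Show the command that would be run.
echo "$*"

# Run it only when deepaccess is on PATH and the first input file is present.
if command -v deepaccess >/dev/null 2>&1 && [ -f C1.bed ]; then
    "$@"
fi
```

The labels after -l are given in the same order as the bed files after -beds, one label per file.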

Arguments

Argument | Description | Example
--- | --- | ---
-h, --help | show this help message and exit | NA
-l, --labels | list of labels for each bed file | C1 C2 C3
-out, --out | output folder name | myoutput
-ref, --refFasta | reference fasta; required with bed input | mm10.fa
-g, --genome | genome chromosome sizes; required with bed input | default/mm10.chrom.sizes
-beds, --bedfiles | list of bed files; one of beds or fa input required | C1.bed C2.bed C3.bed
-fa, --fasta | fasta file; one of beds or fa input required | C1C2C3.fa
-fasta_labels, --fasta_labels | text file containing tab-delimited labels (0 or 1) for each fasta line, with one column for each class | C1C2C3.txt
-f, --frac_random | for bed file input, fraction of random outgroup regions to add to training | 0.1
-nepochs, --nepochs | number of training iterations | 1
-ho, --holdout | chromosome name to hold out (only with bed input) | chr19
-verbose, --verbose | print training and evaluation progress | NA
-seed, --seed | set tensorflow seed | 2021
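
For fasta input, the labels file is the less obvious part, so here is a hedged sketch of what a matching pair of files could look like. It assumes one label row per fasta record, in record order, with no header row; the README only states that the file is tab-delimited with one 0/1 column per class, so treat the rest as an assumption. The sequences are toy placeholders and are not meant to be trainable input.

```shell
# Toy fasta with three records (placeholder sequences, not real data).
printf '>seq1\nACGTACGTAC\n>seq2\nTTGACCGTAA\n>seq3\nGGCATCGATC\n' > C1C2C3.fa

# One row per record, one tab-separated 0/1 column per class (C1, C2, C3).
# ASSUMPTION: rows follow record order and the file has no header row.
printf '1\t0\t0\n0\t1\t0\n1\t1\t1\n' > C1C2C3.txt

# Sanity check: the number of label rows should equal the number of records.
n_records=$(grep -c '^>' C1C2C3.fa)
n_rows=$(wc -l < C1C2C3.txt | tr -d ' ')
[ "$n_records" -eq "$n_rows" ] || echo "record/label count mismatch" >&2

# Assemble the fasta-input training command and print it.
# Execute it with "$@" once real input files are in place.
set -- deepaccess train -l C1 C2 C3 \
    -fa C1C2C3.fa -fasta_labels C1C2C3.txt -out myoutput
echo "$*"
```

Per the table above, -ref, -g, and -ho apply to bed input, so they are left out here.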

Interpretation

To run interpretation of a DeepAccess model:

usage: deepaccess interpret [-h] -trainDir TRAINDIR
                            [-fastas FASTAS [FASTAS ...]]
                            [-l LABELS [LABELS ...]]
                            [-c COMPARISONS [COMPARISONS ...]]
                            [-evalMotifs EVALMOTIFS]
                            [-evalPatterns EVALPATTERNS]
                            [-p POSITION] [-saliency]
                            [-subtract] [-bg BACKGROUND] [-vis]
optional arguments:
  -h, --help            show this help message and exit
  -trainDir TRAINDIR, --trainDir TRAINDIR
  -fastas FASTAS [FASTAS ...], --fastas FASTAS [FASTAS ...]
  -l LABELS [LABELS ...], --labels LABELS [LABELS ...]
  -c COMPARISONS [COMPARISONS ...], --comparisons COMPARISONS [COMPARISONS ...]
  -evalMotifs EVALMOTIFS, --evalMotifs EVALMOTIFS
  -evalPatterns EVALPATTERNS, --evalPatterns EVALPATTERNS
  -p POSITION, --position POSITION
  -saliency, --saliency
  -subtract, --subtract
  -bg BACKGROUND, --background BACKGROUND
  -vis, --makeVis

Arguments

Argument | Description | Example
--- | --- | ---
-h, --help | show this help message and exit | NA
-trainDir, --trainDir | directory containing trained DeepAccess model | test/ASCL1vsCTCF
-fastas, --fastas | list of fasta files to evaluate | test/ASCL1vsCTCF/test.fa
-l, --labels | list of labels for each bed file | C1 C2 C3
-c, --comparisons | list of comparisons between different labels | ASCL1vsCTCF ASCL1vsNone (runs differential EPE between ASCL1 and CTCF, and EPE on ASCL1); C1,C2vsC3 (runs differential EPE for (C1 and C2) vs C3)
-evalMotifs, --evalMotifs | PWM or PCM database of DNA sequence motifs | default/HMv11_MOUSE.txt
-evalPatterns, --evalPatterns | fasta file containing DNA sequence patterns | data/ASCL1_space.fa
-bg, --background | fasta file containing background sequences | default/backgrounds.fa
-saliency, --saliency | calculate per-base nucleotide importance | NA
-subtract, --subtract | use subtraction instead of ratio for EPE / DEPE | False
-vis, --makeVis | used with -saliency to plot a visualization of the results | NA
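
A motif-based interpretation run might then look like the sketch below. The paths are the example values from the table above; the labels ASCL1 and CTCF are an assumption about how the model in test/ASCL1vsCTCF was trained, and whether this exact combination of optional flags is required is not stated in this README. The command is executed only when deepaccess is installed and the training directory exists; otherwise it is just printed.

```shell
# Build the argument list with `set --` so each value stays a separate word.
# ASSUMPTION: the model in test/ASCL1vsCTCF uses the labels ASCL1 and CTCF.
set -- deepaccess interpret \
    -trainDir test/ASCL1vsCTCF \
    -fastas test/ASCL1vsCTCF/test.fa \
    -l ASCL1 CTCF \
    -c ASCL1vsCTCF ASCL1vsNone \
    -evalMotifs default/HMv11_MOUSE.txt \
    -bg default/backgrounds.fa

# Show the command that would be run.
echo "$*"

# Run it only when deepaccess is on PATH and the trained model is present.
if command -v deepaccess >/dev/null 2>&1 && [ -d test/ASCL1vsCTCF ]; then
    "$@"
fi
```

Here ASCL1vsCTCF requests differential EPE between the two classes and ASCL1vsNone requests EPE on ASCL1 alone, as described in the -c row of the table.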

Download files

Source Distribution

deepaccess-0.1.3.tar.gz (273.2 kB)

Uploaded Source

Built Distribution

deepaccess-0.1.3-py3-none-any.whl (290.5 kB)

Uploaded Python 3

File details

Details for the file deepaccess-0.1.3.tar.gz.

File metadata

  • Download URL: deepaccess-0.1.3.tar.gz
  • Size: 273.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.9.2

File hashes

Hashes for deepaccess-0.1.3.tar.gz
Algorithm Hash digest
SHA256 f5d4daabc3c193e21301c90e74e62bb0ebb797aaac5f1f06e2c061492dd021d6
MD5 f45c1c62d306ac1d1db205684a1259ed
BLAKE2b-256 95126f12e6e725576e2361df81885c7b4838f35bfc99588b3b7ccaa01bd39424

File details

Details for the file deepaccess-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: deepaccess-0.1.3-py3-none-any.whl
  • Size: 290.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.9.2

File hashes

Hashes for deepaccess-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 2980c61552fa7c39372ff51d657bb9bdf3a7c51490dfe667e5ad31c7be158540
MD5 2a739c29509ddd4753d15c3433d74e2c
BLAKE2b-256 ec2f18e3a3447aa9c9444cdfe457f8f20eb2c4151b969340e426e17f1c241655
