Enhancer hijacking detection from WGS and RNAseq.

pyjacker

pyjacker detects enhancer hijacking events in a cohort profiled with WGS and RNA-seq. It does not require matched normals and can detect enhancer hijacking events that occur in only a single sample. Briefly, it looks for outlier-high, monoallelic expression of a gene in a sample that has a breakpoint close to the gene.
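
To make the "outlier-high expression" idea concrete, here is a minimal sketch of that part only. It is not pyjacker's actual scoring: the leave-one-out z-score on log-transformed TPM, the function name `outlier_scores`, and the example values are all illustrative assumptions.

```python
import math
import statistics

def outlier_scores(tpm_by_sample):
    """Leave-one-out z-score of log2(TPM + 1) per sample (illustrative only)."""
    logged = {s: math.log2(v + 1.0) for s, v in tpm_by_sample.items()}
    scores = {}
    for sample, value in logged.items():
        # Compare each sample against all the other samples in the cohort.
        others = [v for s, v in logged.items() if s != sample]
        mean = statistics.mean(others)
        sd = statistics.stdev(others)  # needs at least two other samples
        # A small floor avoids division by zero when the others are identical.
        scores[sample] = (value - mean) / max(sd, 0.1)
    return scores

# Hypothetical TPM values for one gene across five samples.
tpm = {"S1": 0.5, "S2": 0.8, "S3": 0.3, "S4": 45.0, "S5": 0.6}
scores = outlier_scores(tpm)
top_sample = max(scores, key=scores.get)
print(top_sample, round(scores[top_sample], 1))
```

In this toy cohort only one sample stands out, which mirrors the single-sample setting described above. A real analysis would combine such a score with allelic imbalance and breakpoint proximity.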

Usage

In an environment with python>=3.7:

pip install pyjacker
python pyjacker.py config.yaml

The config file specifies all the parameters, including the paths to the input files (see config_AML.yaml for an example). Alternatively, we provide a Nextflow workflow that generates pyjacker's inputs from BAM files and then runs pyjacker: https://github.com/CompEpigen/wf_WGS.

Inputs

Gene expression table (required)

Rows are genes (Ensembl IDs) and columns are samples. Expression values must be provided in TPM. See data/TPM_ckAML.tsv for an example.
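
Before running the tool, it can help to sanity-check the table. The sketch below is one possible check, not part of pyjacker. It assumes a tab-separated file whose first column holds the gene IDs and whose header row lists the sample names; the helper names and example rows are hypothetical, so compare against data/TPM_ckAML.tsv for the real layout.

```python
import csv
import io
import re

# Ensembl gene IDs such as ENSG00000141510, with an optional version suffix.
ENSEMBL_GENE = re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")

def is_valid_tpm(text):
    """TPM values must be numeric and non-negative."""
    try:
        return float(text) >= 0.0
    except ValueError:
        return False

def check_tpm_table(handle):
    """Return a list of problems found in a genes-by-samples TPM table."""
    reader = csv.reader(handle, delimiter="\t")
    samples = next(reader)[1:]  # assumption: first column is the gene ID
    problems = []
    for row in reader:
        gene, values = row[0], row[1:]
        if not ENSEMBL_GENE.match(gene):
            problems.append(f"{gene}: not an Ensembl gene ID")
        elif len(values) != len(samples):
            problems.append(f"{gene}: expected {len(samples)} values")
        elif not all(is_valid_tpm(v) for v in values):
            problems.append(f"{gene}: negative or non-numeric TPM")
    return problems

# Hypothetical table with one gene symbol and one negative value.
example = "\n".join([
    "gene_id\tS1\tS2",
    "ENSG00000141510\t12.5\t3.0",
    "TP53\t1.0\t2.0",
    "ENSG00000136997\t-1.0\t4.2",
])
problems = check_tpm_table(io.StringIO(example))
print(problems)
```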

Breakpoints (required)

TSV file with the columns sample, chr1, pos1, chr2, pos2. The chr2 and pos2 fields are optional (for example, if you only have copy number data). See data/breakpoints.tsv for an example.
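
As an illustration of this layout, the sketch below reads such a file into simple records. It is not pyjacker's own parser: the `Breakpoint` class and `read_breakpoints` function are hypothetical, and treating an empty or absent chr2/pos2 as "missing" is an assumption about how the optional fields are represented.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

@dataclass
class Breakpoint:
    sample: str
    chr1: str
    pos1: int
    chr2: Optional[str] = None  # partner end, if known
    pos2: Optional[int] = None

def read_breakpoints(handle):
    """Parse a breakpoints TSV; chr2 and pos2 may be empty or absent."""
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        chr2 = row.get("chr2") or None
        pos2 = row.get("pos2") or None
        records.append(Breakpoint(
            sample=row["sample"],
            chr1=row["chr1"],
            pos1=int(row["pos1"]),
            chr2=chr2,
            pos2=int(pos2) if pos2 else None,
        ))
    return records

# Hypothetical rows: a translocation and a copy-number-only breakpoint.
example = "\n".join([
    "sample\tchr1\tpos1\tchr2\tpos2",
    "S1\tchr3\t169000000\tchr7\t92000000",
    "S2\tchr8\t127500000\t\t",
])
breakpoints = read_breakpoints(io.StringIO(example))
print(breakpoints)
```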

TADs (optional)

A BED file of topologically associating domains (TADs) can be provided, in which case only breakpoints in the same TAD as a gene are considered in the search for enhancer hijacking events. If no TAD file is provided, pyjacker will instead look for breakpoints within a fixed distance of the gene (1.5 Mb by default). One TAD file is provided in the data directory; see data/HSPC_TADs.bed for an example.
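
The two selection modes can be sketched as follows. This is a simplified illustration rather than pyjacker's implementation: the function name, the tuple layouts, measuring the window from the gene body, and keeping every TAD that overlaps the gene are all assumptions.

```python
WINDOW = 1_500_000  # fixed distance used when no TADs are given (1.5 Mb)

def breakpoints_near_gene(gene, breakpoints, tads=None, window=WINDOW):
    """Select breakpoint positions that could affect a gene.

    gene: (chrom, start, end); breakpoints: list of (chrom, pos);
    tads: list of (chrom, start, end), or None to use a fixed window.
    """
    chrom, g_start, g_end = gene
    if tads is not None:
        # Regions are the TADs that overlap the gene.
        regions = [(s, e) for c, s, e in tads
                   if c == chrom and s <= g_end and e >= g_start]
    else:
        # Otherwise, one region extending `window` bases on both sides.
        regions = [(g_start - window, g_end + window)]
    return [(c, p) for c, p in breakpoints
            if c == chrom and any(s <= p <= e for s, e in regions)]

# Hypothetical gene, breakpoints, and TAD.
gene = ("chr3", 169_000_000, 169_050_000)
bps = [("chr3", 168_200_000), ("chr3", 171_000_000), ("chr7", 169_020_000)]
tads = [("chr3", 168_900_000, 169_400_000)]

by_window = breakpoints_near_gene(gene, bps)
by_tad = breakpoints_near_gene(gene, bps, tads=tads)
print(by_window, by_tad)
```

In this example the breakpoint 800 kb upstream is kept by the fixed window but excluded once TADs are supplied, because it lies outside the gene's TAD.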

Allelic read counts at SNPs in RNAseq (optional)

These counts are used to detect monoallelic expression. They must be provided as files generated by fast_ase or GATK ASEReadCounter.
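
To show what such counts look like in practice, the sketch below computes a major-allele fraction per SNP. It assumes the tab-separated column names written by GATK ASEReadCounter (contig, position, refCount, altCount); fast_ase output may differ. The `min_depth` threshold and the use of a simple fraction are illustrative assumptions, not pyjacker's statistical test.

```python
import csv
import io

def major_allele_fractions(handle, min_depth=10):
    """Map (contig, position) to the fraction of reads on the major allele."""
    fractions = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        ref, alt = int(row["refCount"]), int(row["altCount"])
        total = ref + alt
        if total < min_depth:
            continue  # too few reads to say anything about allelic balance
        fractions[(row["contig"], int(row["position"]))] = max(ref, alt) / total
    return fractions

# Hypothetical counts: one imbalanced SNP, one balanced, one low-coverage.
example = "\n".join([
    "contig\tposition\trefCount\taltCount",
    "chr3\t100\t48\t2",
    "chr3\t200\t25\t25",
    "chr3\t300\t3\t1",
])
fractions = major_allele_fractions(io.StringIO(example))
print(fractions)
```

A fraction near 1.0 is consistent with monoallelic expression, while a fraction near 0.5 suggests that both alleles are expressed.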

Copy number alterations (optional)

TSV file with the following columns: sample, chr, start, end, cn. See data/CNAs.tsv for an example. If provided, this will be used to:

  • correct gene expression based on copy number (so high expression because of amplification will not be reported)
  • filter out SNPs within deletions from the monoallelic expression detection
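
Both uses can be sketched as below. The exact correction pyjacker applies is not described here, so the linear scaling by copy number, the diploid baseline of 2, and all function names are assumptions made for illustration.

```python
def copy_number_at(cnas, sample, chrom, pos, default=2):
    """Look up the copy number of a position; cnas rows are (sample, chr, start, end, cn)."""
    for s, c, start, end, cn in cnas:
        if s == sample and c == chrom and start <= pos <= end:
            return cn
    return default  # assumption: regions not listed are diploid

def corrected_tpm(tpm, cn, ploidy=2):
    """Scale expression down in proportion to copy-number gain (losses are left as-is)."""
    return tpm * ploidy / max(cn, ploidy)

def snps_outside_deletions(snps, cnas, sample, ploidy=2):
    """Drop SNPs in deleted regions, where only one allele can be observed anyway."""
    return [(c, p) for c, p in snps
            if copy_number_at(cnas, sample, c, p) >= ploidy]

# Hypothetical CNAs for sample S1: an amplification and a deletion.
cnas = [
    ("S1", "chr8", 127_000_000, 129_000_000, 6),
    ("S1", "chr9", 20_000_000, 23_000_000, 1),
]
cn = copy_number_at(cnas, "S1", "chr8", 127_700_000)
corrected = corrected_tpm(90.0, cn)
kept = snps_outside_deletions(
    [("chr9", 21_900_000), ("chr9", 30_000_000)], cnas, "S1")
print(cn, corrected, kept)
```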

Fusion transcripts

Fusion transcripts can also lead to aberrantly high and monoallelic expression of a gene. If a list of fusion transcripts detected from RNA-seq is provided, it will be used to annotate candidate enhancer hijacking events that are actually due to a fusion. See data/fusions.tsv for an example file.

Enhancers

A file of scored enhancers, generated by ROSE. See data/enhancers_myeloid.tsv for an example.

