Associate outliers with rare variation

Project description

A cursory overview of ORE (outlier-RV enrichment) usage is provided here; visit the latest ORE documentation for more details. Confirm the following are installed:

Then, on the command line, install with

pip install ore

Example run

ore --vcf test.vcf.gz \
    --bed test.bed.gz \
    --output ore_results \
    --distribution normal \
    --threshold 2 3 4 \
    --max_outliers_per_id 500 \
    --af_rare 0.05 0.01 1e-3 \
    --tss_dist 5000

Variants and gene expression are specified with --vcf (line 1) and --bed (line 2), respectively. The output prefix is provided with --output (line 3). In this example, the outlier specifications --distribution (line 4), --threshold (line 5), and --max_outliers_per_id (line 6) indicate that outliers are defined using a normal distribution with a z-score more extreme than each of several thresholds (2, 3, and 4), and that samples with more than 500 outliers are excluded. Variant information is specified with --af_rare (line 7) and --tss_dist (line 8): variants are defined as rare using an intra-cohort allele frequency at varying thresholds (≤ 0.05, 0.01, and 0.001), and only variants within 5 kb of the TSS are used.
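To make the outlier definition above concrete, here is a minimal sketch of z-score outlier calling with a per-sample cap. It is not ORE's implementation: the function name call_outliers, the input layout, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def call_outliers(expr_by_gene, threshold=2.0, max_outliers_per_id=500):
    """Return {sample_id: set of genes} where |z-score| exceeds threshold.

    expr_by_gene maps gene -> {sample_id: expression value} (assumed layout).
    Samples with more than max_outliers_per_id outlier genes are excluded.
    """
    outliers = {}
    for gene, values in expr_by_gene.items():
        observed = list(values.values())
        mu = mean(observed)
        sd = pstdev(observed)  # assumption: population SD; ORE may differ
        if sd == 0:
            continue  # no variation, so no outliers for this gene
        for sample, value in values.items():
            if abs((value - mu) / sd) > threshold:
                outliers.setdefault(sample, set()).add(gene)
    # Drop samples carrying more outliers than the cap allows.
    return {s: g for s, g in outliers.items() if len(g) <= max_outliers_per_id}
```

Running the function once per threshold (2, 3, and 4 in the example) would give one outlier set per cut-off.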

Usage (visit the latest ORE documentation for more)

ore [-h] [--version] -v VCF -b BED [-o OUTPUT]
         [--outlier_output OUTLIER_OUTPUT] [--enrich_file ENRICH_FILE]
         [--extrema] [--distribution {normal,rank,custom}]
         [--threshold [THRESHOLD [THRESHOLD ...]]]
         [--max_outliers_per_id MAX_OUTLIERS_PER_ID]
         [--af_rare [AF_RARE [AF_RARE ...]]] [--af_vcf]
         [--intracohort_rare_ac INTRACOHORT_RARE_AC] [--gq GQ] [--dp DP]
         [--aar AAR AAR] [--tss_dist [TSS_DIST [TSS_DIST ...]]] [--upstream]
         [--downstream] [--annovar]
         [--variant_class {intronic,intergenic,exonic,UTR5,UTR3,splicing,upstream,ncRNA,ncRNA_exonic}]
         [--exon_class {nonsynonymous,intergenic,nonframeshift,frameshift,stopgain,stoploss}]
         [--refgene] [--ensgene] [--annovar_dir ANNOVAR_DIR]
         [--humandb_dir HUMANDB_DIR] [--processes PROCESSES] [--clean_run]
Required arguments:
-v VCF, --vcf VCF

Location of VCF file. Must be tabixed!

-b BED, --bed BED

Gene expression file location. Must be tabixed!
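Both inputs must be tabix-indexed, so it can save a failed run to check for the index files first. The sketch below assumes the usual convention that tabix writes the index next to the bgzip-compressed file with a .tbi suffix (for example, after running tabix -p vcf test.vcf.gz or tabix -p bed test.bed.gz). The helper name has_tabix_index is hypothetical and is not part of ORE.

```python
from pathlib import Path


def has_tabix_index(path):
    """Return True if path ends in .gz and a sibling .tbi index file exists.

    Assumption: the index uses the default tabix naming (<file>.gz.tbi).
    This only checks that the files exist, not that the index is valid.
    """
    p = Path(path)
    return p.suffix == ".gz" and Path(str(p) + ".tbi").is_file()
```

A call such as has_tabix_index("test.vcf.gz") could be made for each input before invoking ore.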

Optional file locations:
-o OUTPUT, --output OUTPUT

Output prefix (default is VCF prefix)

--outlier_output OUTLIER_OUTPUT

Outlier filename (default is VCF prefix)

--enrich_file ENRICH_FILE

Output file for enrichment odds ratios and p-values (default is VCF prefix)
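For readers unfamiliar with enrichment statistics, the following sketch shows how an odds ratio and a p-value can be computed from a 2x2 table of outlier status against rare-variant status. The use of Fisher's exact test here is an assumption chosen for illustration; this description does not state which test ORE applies, and the function names are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]].

    a: outliers with a rare variant      b: outliers without
    c: non-outliers with a rare variant  d: non-outliers without
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value (illustrative, not ORE's code)."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum over all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))
```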

Optional outlier arguments:
--extrema

Only the most extreme value is an outlier

--distribution DISTRIBUTION

Outlier distribution. Options: {normal,rank,custom}

--threshold THRESHOLD

Expression threshold for defining outliers. Must be greater than 0 for normal or (0,0.5) non-inclusive with rank. Ignored with custom

--max_outliers_per_id MAX_OUTLIERS_PER_ID

Maximum number of outliers per ID
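The constraints on --threshold described above can be summarized as a small validation routine. This is a sketch of the stated rules only; the function name threshold_is_valid is hypothetical and ORE's own argument checking may behave differently.

```python
def threshold_is_valid(distribution, threshold):
    """Check a threshold against the rules stated for --threshold.

    normal: must be greater than 0
    rank:   must lie strictly between 0 and 0.5
    custom: threshold is ignored, so any value is accepted
    """
    if distribution == "normal":
        return threshold > 0
    if distribution == "rank":
        return 0 < threshold < 0.5
    if distribution == "custom":
        return True
    raise ValueError(f"unknown distribution: {distribution!r}")
```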

Optional variant-related arguments:
--af_rare AF_RARE

AF cut-off below which a variant is considered rare (space separated list e.g., 0.1 0.05)

--af_vcf

Use the VCF AF field to define an allele as rare.

--intracohort_rare_ac INTRACOHORT_RARE_AC

Allele COUNT to be used instead of intra-cohort allele frequency. (still uses af_rare for population level AF cut-off)

--af_min AF_MIN

Lower bound on AF cut-offs for --af_rare, must be same length as --af_rare (e.g., with --af_rare 0.01 0.5 and --af_min 0 0.05 ORE will compare variants within [0,0.01] and [0.05,0.5] to other variants).

--gq GQ

Minimum genotype quality for each variant in each individual

--dp DP

Minimum depth per variant in each individual

--aar AAR

Alternate allelic ratio for heterozygous variants (provide two space-separated numbers between 0 and 1, e.g., 0.2 0.8)

--tss_dist TSS_DIST

Variants within this distance of the TSS are considered

--upstream

Only variants UPstream of TSS

--downstream

Only variants DOWNstream of TSS
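The variant filters above can be illustrated with three small helpers. These are sketches under stated assumptions, not ORE's code: all function names are hypothetical, the allelic ratio is assumed to be alternate reads divided by total reads, bounds are assumed inclusive, and upstream/downstream is assumed to be interpreted relative to gene strand.

```python
def af_bins(af_rare, af_min=None):
    """Pair each --af_rare cut-off with its --af_min lower bound."""
    if af_min is None:
        af_min = [0.0] * len(af_rare)
    if len(af_min) != len(af_rare):
        raise ValueError("af_min must be the same length as af_rare")
    return list(zip(af_min, af_rare))


def passes_aar(ref_depth, alt_depth, aar=(0.2, 0.8)):
    """Keep a heterozygous call whose alternate allelic ratio is in range."""
    total = ref_depth + alt_depth
    if total == 0:
        return False  # no reads, so the ratio is undefined
    # Assumption: ratio = alternate reads / (reference + alternate reads).
    return aar[0] <= alt_depth / total <= aar[1]


def near_tss(var_pos, tss_pos, strand="+", tss_dist=5000,
             upstream=False, downstream=False):
    """Keep a variant within tss_dist of the TSS, optionally one side only."""
    offset = var_pos - tss_pos
    if strand == "-":
        offset = -offset  # assumption: sides are relative to transcription
    if abs(offset) > tss_dist:
        return False
    if upstream and offset > 0:
        return False
    if downstream and offset < 0:
        return False
    return True
```

With the example from the --af_min description, af_bins([0.01, 0.5], [0, 0.05]) pairs the cut-offs into the intervals [0, 0.01] and [0.05, 0.5].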

Optional arguments for using ANNOVAR:
--annovar

Use ANNOVAR to specify allele frequencies and functional class

--variant_class

Only variants in these classes will be considered. Options: {intronic,intergenic,exonic,UTR5,UTR3,splicing,upstream,ncRNA,ncRNA_exonic}

--exon_class

Only variants with these exonic impacts will be considered. Options: {nonsynonymous,intergenic,nonframeshift,frameshift,stopgain,stoploss}

--refgene

Filter on RefGene function.

--ensgene

Filter on ENSEMBL function.

--annovar_dir ANNOVAR_DIR

Directory of the table_annovar.pl script

--humandb_dir HUMANDB_DIR

Directory of ANNOVAR data (refGene, ensGene, and gnomad_genome)

optional arguments:
-h, --help

show this help message and exit

--version

show program’s version number and exit

--processes PROCESSES

Number of CPU processes

--clean_run

Delete temporary files from the previous run

Felix Richter <felix.richter@icahn.mssm.edu>

