AeQTL

eQTL analysis using region-based aggregation of rare variants.

Requirements

  • Python 3.5
  • pip
  • bx_interval_tree (see installation instructions below)
  • git (optional)

Installation

First, install the IntervalTree implementation from bx-python. We strongly recommend the standalone package bx_interval_tree, which is smaller and easier to compile than bx-python.

git clone https://github.com/ccwang002/bx_interval_tree
cd bx_interval_tree
python setup.py install
cd ..

Continue to install AeQTL by choosing one of the options below.

(1) From PyPI

The easiest way to install AeQTL is from PyPI.

pip install aeqtl

(2) From source code

Alternatively, download the source code of AeQTL:

git clone https://github.com/Huang-lab/AeQTL

Then install AeQTL:

cd AeQTL
pip install .

Optional (but recommended)

Add the AeQTL bin directory to the front of your PATH environment variable:

export PATH=/path/to/AeQTL/bin:$PATH

Run

aeqtl -v <vcf file> -b <bed file> -e <expression file> \
      -cn <numerical covariates> -cc <categorical covariates> -s <covariate file> \
      -o <output directory>
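
For scripted runs, the same command can be assembled in Python. This is a minimal sketch, assuming `aeqtl` is on your PATH. The helper name `build_aeqtl_command` and all file and covariate names are hypothetical, and the covariate arguments are passed through unchanged because the expected format for listing several covariates is not described here.

```python
import subprocess


def build_aeqtl_command(vcf, bed, expression, numerical, categorical,
                        covariate_file, output_dir):
    """Assemble the aeqtl argument list using the flags shown above."""
    return [
        "aeqtl",
        "-v", vcf,
        "-b", bed,
        "-e", expression,
        # Covariate values are passed verbatim (their format is an assumption).
        "-cn", numerical,
        "-cc", categorical,
        "-s", covariate_file,
        "-o", output_dir,
    ]


# Hypothetical file and covariate names, for illustration only.
cmd = build_aeqtl_command("demo.vcf", "demo.bed", "expression.tsv",
                          "age", "sex", "covariates.tsv", "results")
print(" ".join(cmd))

# To actually run the analysis (requires AeQTL to be installed):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.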

Input data format

Note: demo input files in a compatible format can be found in the "demo" folder.

VCF file

A standard multi-sample VCF file with the file extension .vcf (or .vcf.gz). Sample IDs in the VCF file, expression file, and covariate file must match exactly.
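
Because sample IDs must match exactly across the three inputs, it can help to compare them before running AeQTL. The sketch below is not part of AeQTL; the function names are hypothetical. It reads sample IDs from the `#CHROM` header line of a VCF (columns after the nine fixed VCF fields) and reports IDs that are not shared by all three files.

```python
import gzip


def vcf_sample_ids(path):
    """Return sample IDs from the #CHROM header line of a .vcf or .vcf.gz file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The first nine columns are fixed VCF fields; samples follow.
                return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found in " + path)


def report_mismatches(vcf_ids, expr_ids, cov_ids):
    """For each input, list the sample IDs that are not present in all three."""
    sets = {
        "vcf": set(vcf_ids),
        "expression": set(expr_ids),
        "covariate": set(cov_ids),
    }
    shared = set.intersection(*sets.values())
    return {name: sorted(ids - shared) for name, ids in sets.items()}


# Made-up IDs: "s2" in the covariate file differs in case from "S2".
mismatches = report_mismatches(["S1", "S2"], ["S1", "S2"], ["S1", "s2"])
print(mismatches)
```

An empty list for every input means the three sets of IDs agree. In practice the expression IDs would come from the header row of the expression file and the covariate IDs from its "sample_id" column.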

BED file

A tab-separated BED file with at least four columns and no header. Each row should follow this format:

<chromosome>	<start>	<end>	<region_name>	<tested_genes>

An example row:

chr17	41197693	41197821	BRCA1	BRCA1;SLC25A39;HEXIM2

The first four columns are required. The optional fifth column (tested_genes) is a list of genes separated by ";". If it is not provided, AeQTL tests each region against every gene in the expression file by default.
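
The row layout described above can be illustrated with a small parser. This is a sketch for checking your own BED files, not AeQTL's internal code; `parse_bed_line` is a hypothetical helper, and returning `None` for a missing gene list is a convention chosen here.

```python
def parse_bed_line(line):
    """Split one BED row into (chromosome, start, end, region_name, tested_genes)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        raise ValueError("expected at least four tab-separated columns")
    chrom, start, end, region = fields[:4]
    # An absent or empty fifth column is returned as None (no gene list given).
    genes = fields[4].split(";") if len(fields) > 4 and fields[4] else None
    return chrom, int(start), int(end), region, genes


row = "chr17\t41197693\t41197821\tBRCA1\tBRCA1;SLC25A39;HEXIM2"
parsed = parse_bed_line(row)
print(parsed)
```

A four-column row such as `"chr17\t41197693\t41197821\tBRCA1"` yields `None` for the gene list, which corresponds to the default of testing the region against every gene.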

Expression file

A tab-separated, matrix-format .tsv file of gene expression values from RNA-seq. The first row (header) of the file should follow:

gene_id	<sample_id_1>	<sample_id_2>	<sample_id_3>	...

and the first column of the file should follow:

gene_id
<gene_1>
<gene_2>
...
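
To make the matrix layout concrete, the following sketch reads such a file into a nested dictionary using only the standard library. It is an illustration, not AeQTL's own reader; `read_expression` is a hypothetical name, and the example values are made up.

```python
import csv
import io


def read_expression(handle):
    """Read an expression matrix into {gene_id: {sample_id: value}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[0] != "gene_id":
        raise ValueError("first header field should be 'gene_id'")
    samples = header[1:]
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row[1:]]
        if len(values) != len(samples):
            raise ValueError("row for %s has %d values, expected %d"
                             % (row[0], len(values), len(samples)))
        matrix[row[0]] = dict(zip(samples, values))
    return matrix


# Made-up example data in the expected layout.
demo = io.StringIO("gene_id\tS1\tS2\nBRCA1\t5.2\t4.8\nHEXIM2\t1.0\t0.7\n")
expr = read_expression(demo)
print(expr["BRCA1"]["S2"])
```

For a real file, pass an open file handle, for example `open("expression.tsv", newline="")`.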

Covariate file

A tab-separated .tsv file whose column names correspond to covariates. A column of sample IDs named "sample_id" is required. The covariate names passed to AeQTL must exactly match the corresponding column names in this file; the file may also contain other, unused columns. For a categorical covariate, make sure each category is spelled the same way throughout the file (for example, avoid having both "FEMALE" and "female" in the same column).
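
Inconsistent category spellings are easy to miss by eye. The sketch below flags values in a categorical column that differ only by letter case or surrounding whitespace. It is a pre-check you could run yourself, not a feature of AeQTL; `find_case_variants` and the example columns are hypothetical.

```python
import csv
import io
from collections import defaultdict


def find_case_variants(handle, column):
    """Return groups of values in `column` that differ only by case or outer whitespace."""
    groups = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        value = row[column]
        groups[value.strip().lower()].add(value)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Made-up covariate table with an inconsistent "sex" column.
demo = io.StringIO(
    "sample_id\tsex\tage\n"
    "S1\tFEMALE\t54\n"
    "S2\tfemale\t61\n"
    "S3\tmale\t47\n"
)
variants = find_case_variants(demo, "sex")
print(variants)
```

An empty result means no two values in the column collapse to the same lowercase form.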

Output data format

A tab-separated .tsv file of summary statistics, with values reported to at most 5 digits after the decimal point. Each row reports one eQTL test between a region and a gene. The file contains the following fields:

  • region
  • gene
  • coef_intercept
  • coef_genotype
  • coef_<covariate> (for each covariate)
  • pvalue_intercept
  • pvalue_genotype
  • pvalue_<covariate> (for each covariate)
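
As one way to work with these fields, the sketch below reads a results table and keeps the tests whose genotype p-value falls below a threshold, sorted by p-value. The function name, the threshold of 0.05, and the example rows are assumptions made for illustration; no multiple-testing correction is applied.

```python
import csv
import io


def significant_tests(handle, alpha=0.05):
    """Return (region, gene, coef_genotype, pvalue_genotype) for rows with pvalue_genotype < alpha."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        p = float(row["pvalue_genotype"])
        if p < alpha:
            hits.append((row["region"], row["gene"],
                         float(row["coef_genotype"]), p))
    # Smallest p-values first.
    return sorted(hits, key=lambda hit: hit[3])


# Made-up results using the field names listed above (no covariate columns).
demo = io.StringIO(
    "region\tgene\tcoef_intercept\tcoef_genotype\tpvalue_intercept\tpvalue_genotype\n"
    "BRCA1\tBRCA1\t4.10000\t-0.85000\t0.00001\t0.00320\n"
    "BRCA1\tHEXIM2\t2.30000\t0.02000\t0.00004\t0.81000\n"
)
hits = significant_tests(demo)
print(hits)
```

When many region and gene pairs are tested, a raw threshold like this will admit false positives, so treat it as a first filter only.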
