gene4mVCF
Introduction
gene4mVCF is a Python package that extracts variant entries for a specific gene or a list of genes from a VCF (Variant Call Format) file. It uses the command-line tools bcftools and tabix, together with the Python libraries pysam, pandas, pybedtools, tqdm, and gffutils, to parse the file and extract variants efficiently.
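The idea behind gene-based extraction can be illustrated with a minimal sketch: look up a gene's coordinates in an annotation table, then keep the VCF records that fall inside them. This is not the package's implementation (which relies on the tools listed above and is not shown in this document); it uses only the standard library, and the BED layout (gene name in the fourth column), the function names, and the toy data are assumptions made for illustration.

```python
import io


def load_gene_intervals(bed_text, gene):
    """Collect (chrom, start, end) for BED rows whose 4th column equals `gene`."""
    intervals = []
    for line in io.StringIO(bed_text):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[3] != gene:
            continue
        # BED coordinates are 0-based and half-open: [start, end)
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals


def filter_vcf_records(vcf_text, intervals):
    """Return VCF data lines whose POS lies inside any of the intervals."""
    hits = []
    for line in io.StringIO(vcf_text):
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        chrom, pos = line.split("\t", 2)[:2]
        pos0 = int(pos) - 1  # VCF POS is 1-based; convert to 0-based
        if any(c == chrom and s <= pos0 < e for c, s, e in intervals):
            hits.append(line.rstrip("\n"))
    return hits


# Toy data (hypothetical genes and variants)
BED = "chr7\t100\t200\tGENE_A\nchr7\t500\t600\tGENE_B\n"
VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr7\t150\t.\tA\tG\t.\tPASS\t.\n"
    "chr7\t550\t.\tC\tT\t.\tPASS\t.\n"
)

hits = filter_vcf_records(VCF, load_gene_intervals(BED, "GENE_A"))
print(hits)
```

A linear scan like this is only practical for small files; indexed access through tabix is what makes region queries fast on a bgzip-compressed VCF.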
Installation
You can install gene4mVCF via pip:
$ pip install gene4mVCF
After installation, please download the four required annotation files and place them inside the /gene4mVCF folder. Each BED file name below is paired with its UCSC source, which is a gzipped GTF file:
'hg19.ensGene.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.ensGene.gtf.gz
'hg19.ncbiRefSeq.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.ncbiRefSeq.gtf.gz
'hg38.ensGene.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.ensGene.gtf.gz
'hg38.ncbiRefSeq.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.ncbiRefSeq.gtf.gz
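The four URLs follow one pattern, so fetching them can be scripted. The sketch below is one possible way to do it with the standard library; the helper names (`gtf_url`, `download_all`) and the destination directory are hypothetical. It only downloads the gzipped GTF files, and how they become the `.bed` files named above is not described here, so that step is left out.

```python
import urllib.request
from pathlib import Path

UCSC_BASE = "https://hgdownload.soe.ucsc.edu/goldenPath"

# The four build/track combinations listed in the installation notes
ANNOTATIONS = [
    (build, track)
    for build in ("hg19", "hg38")
    for track in ("ensGene", "ncbiRefSeq")
]


def gtf_url(build, track):
    """Build the UCSC download URL for one annotation file."""
    return f"{UCSC_BASE}/{build}/bigZips/genes/{build}.{track}.gtf.gz"


def download_all(dest_dir):
    """Download every annotation file into dest_dir, skipping existing ones."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for build, track in ANNOTATIONS:
        url = gtf_url(build, track)
        target = dest / url.rsplit("/", 1)[-1]
        if not target.exists():
            urllib.request.urlretrieve(url, str(target))
        saved.append(target)
    return saved


# Example (requires network access; the directory name is an assumption):
# download_all("gene4mVCF")
```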
Usage
Extract variant entries for a specific gene or list of genes from a VCF file.
usage: $ gene4mVCF [-h] -i INPUT -g GENE [-r REFERENCE] [-f FEATURE]
Required arguments:
-i INPUT, --input INPUT: bgzip-compressed VCF file
-g GENE, --genes GENE: a gene name, an Ensembl gene ID, or the path to a gene list file with a .txt extension
Optional arguments:
-h, --help: show this help message and exit
-r REFERENCE, --ref REFERENCE: 37 or 38, matching the reference genome used to create the VCF file
-f FEATURE, --feature FEATURE: feature type(s) to filter for (e.g., exon, CDS, 5UTR, 3UTR, transcript)
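For readers who want to see how an interface like this maps onto code, here is a minimal `argparse` sketch that mirrors the documented options. It is an illustration only, not the package's actual parser: restricting `--ref` to the two listed values, leaving it without a default, and treating `--feature` as a single string are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented gene4mVCF options (sketch)."""
    parser = argparse.ArgumentParser(
        prog="gene4mVCF",
        description="Extract variant entries for a specific gene or "
        "list of genes from a VCF file.",
    )
    parser.add_argument("-i", "--input", required=True,
                        help="bgzip-compressed VCF file")
    parser.add_argument("-g", "--genes", required=True,
                        help="gene name, Ensembl gene ID, or path to a .txt gene list")
    # Assumption: only the two documented values are accepted, no default
    parser.add_argument("-r", "--ref", choices=["37", "38"],
                        help="reference genome used to create the VCF file")
    # Assumption: the feature filter is passed as one string
    parser.add_argument("-f", "--feature",
                        help="feature type(s) to filter for, e.g. exon or CDS")
    return parser


args = build_parser().parse_args(["-i", "input.vcf.gz", "-g", "EGFR", "-r", "38"])
print(args.input, args.genes, args.ref, args.feature)
```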
Examples
Extract variants for a single gene using gene name:
$ gene4mVCF -i input.vcf.gz -g EGFR
Extract variants for a single gene using an Ensembl gene ID:
$ gene4mVCF -i input.vcf.gz -g ENSG00000168878
Extract variants for multiple genes listed in a file:
$ gene4mVCF -i input.vcf.gz -g acmg-genes.txt
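The three examples pass a gene name, an Ensembl gene ID, and a gene list file through the same `-g` option. The sketch below shows one way such a value could be told apart. It is not the package's code: the helper names are hypothetical, and the one-identifier-per-line layout of the list file is an assumption, since the file format is not specified here.

```python
import os
import re
import tempfile
from pathlib import Path

# Human Ensembl gene IDs: "ENSG" plus 11 digits, optionally a ".version" suffix
ENSEMBL_GENE_ID = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def is_ensembl_id(gene):
    """Return True if the string looks like a human Ensembl gene ID."""
    return bool(ENSEMBL_GENE_ID.match(gene))


def resolve_genes(value):
    """Expand a -g value into a list of gene identifiers.

    Assumption: a value ending in .txt is a file with one identifier per line.
    """
    if value.endswith(".txt"):
        lines = Path(value).read_text().splitlines()
        return [ln.strip() for ln in lines if ln.strip()]
    return [value]


# Demonstration with a temporary gene list file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "acmg-genes.txt")
    Path(path).write_text("EGFR\nENSG00000168878\n\n")
    genes = resolve_genes(path)

print(genes)
print([is_ensembl_id(g) for g in genes])
```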
For more options and details, refer to the help message.
Example: extraction of variants for the ACTB gene.
[Figure: input command]
[Figure: output file in tab-separated format]
Support
For any issues or inquiries, please open an issue on the GitHub repository: https://github.com/vkulaganathan/gene4mVCF
Current version: 1.1.5