Description of your package

gene4mVCF

Introduction

gene4mVCF is a Python package that extracts variant entries for a specific gene or a list of genes from a VCF (Variant Call Format) file. It relies on the command-line tools bcftools and tabix and on the Python libraries pysam, pandas, pybedtools, tqdm, and gffutils to parse and extract variants efficiently.

Installation

You can install gene4mVCF via pip:

$ pip install gene4mVCF

After installation, download the four required BED files and place them inside the /gene4mVCF folder. Each expected file name is listed with its UCSC source, which is a gzipped GTF file:

'hg19.ensGene.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.ensGene.gtf.gz

'hg19.ncbiRefSeq.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.ncbiRefSeq.gtf.gz

'hg38.ensGene.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.ensGene.gtf.gz

'hg38.ncbiRefSeq.bed' --> https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.ncbiRefSeq.gtf.gz
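
The download step above could be scripted roughly as follows. This is a minimal sketch, not part of gene4mVCF: the names `ANNOTATION_SOURCES` and `download_annotations` are hypothetical, and the files are saved under their original `.gtf.gz` names. Any conversion or renaming to the `.bed` names listed above is not covered here.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Expected file name (as listed above) -> UCSC source URL (as listed above).
ANNOTATION_SOURCES = {
    "hg19.ensGene.bed": "https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.ensGene.gtf.gz",
    "hg19.ncbiRefSeq.bed": "https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.ncbiRefSeq.gtf.gz",
    "hg38.ensGene.bed": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.ensGene.gtf.gz",
    "hg38.ncbiRefSeq.bed": "https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.ncbiRefSeq.gtf.gz",
}


def download_annotations(dest_dir, fetch=urlretrieve):
    """Download each source file into dest_dir, skipping files already present.

    Hypothetical helper: files keep the name they have on the UCSC server.
    `fetch` takes (url, filename) and can be swapped out for testing.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in ANNOTATION_SOURCES.values():
        target = dest / Path(urlparse(url).path).name
        if not target.exists():
            fetch(url, str(target))
        saved.append(target)
    return saved
```

Calling `download_annotations("gene4mVCF")` would fetch the four files into a local `gene4mVCF` folder; adjust the destination to wherever the package expects them on your system.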

Usage

Extract variant entries for a specific gene or list of genes from a VCF file.
usage: $ gene4mVCF [-h] -i INPUT -g GENE [-r REFERENCE] [-f FEATURE]

Required arguments:

-i INPUT, --input INPUT, is a bgzip-compressed VCF file

-g GENE, --genes GENE, is either a gene name, an Ensembl gene ID, or the path to a gene list file with a .txt extension
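
Because the input must be bgzip-compressed rather than plain gzip, a quick check before running the tool can save a failed run. The sketch below is not part of gene4mVCF; `looks_bgzipped` is a hypothetical helper that inspects the first gzip header for the `BC` extra subfield that the BGZF format defines.

```python
import struct


def looks_bgzipped(path):
    """Return True if the file starts with a BGZF (bgzip) block header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes, deflate method, and the FEXTRA flag (0x04) must be set.
        if header[0:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)
    # Walk the extra subfields looking for SI1='B', SI2='C' with a 2-byte payload.
    offset = 0
    while offset + 4 <= len(extra):
        si1, si2, slen = struct.unpack("<BBH", extra[offset:offset + 4])
        if (si1, si2, slen) == (66, 67, 2):
            return True
        offset += 4 + slen
    return False
```

A file written by plain gzip lacks the `BC` subfield, so the function returns False for it.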


Optional arguments:

-h, --help show this help message and exit

-r REFERENCE, --ref REFERENCE, is 37 or 38, based on the reference genome used for creating the VCF file

-f FEATURE, --feature FEATURE, is feature type(s) to filter for (e.g., exon, CDS, 5UTR, 3UTR, transcript)
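
When calling the tool from a Python pipeline, the arguments listed above can be assembled programmatically. This is a minimal sketch under the assumption that `gene4mVCF` is on the PATH; `build_command` is a hypothetical helper and uses only the flags documented here.

```python
import shlex


def build_command(vcf_path, genes, reference=None, feature=None):
    """Assemble a gene4mVCF argument list from the documented options."""
    command = ["gene4mVCF", "-i", str(vcf_path), "-g", str(genes)]
    if reference is not None:
        # The README states that the reference is 37 or 38.
        if str(reference) not in {"37", "38"}:
            raise ValueError("reference must be 37 or 38")
        command += ["-r", str(reference)]
    if feature is not None:
        command += ["-f", str(feature)]
    return command


# Show the shell-equivalent command line for one combination of options.
print(shlex.join(build_command("input.vcf.gz", "EGFR", reference=38, feature="exon")))
```

The returned list can be passed to `subprocess.run(command, check=True)` to execute the tool.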

Examples

Extract variants for a single gene using gene name: $ gene4mVCF -i input.vcf.gz -g EGFR

Extract variants for a single gene using an Ensembl gene ID: $ gene4mVCF -i input.vcf.gz -g ENSG00000168878

Extract variants for multiple genes listed in a file: $ gene4mVCF -i input.vcf.gz -g acmg-genes.txt
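
A gene list file such as the one in the last example could be generated as sketched below. The layout of the file is an assumption: this sketch writes one gene identifier per line, which the README does not confirm. `write_gene_list` is a hypothetical helper, not part of gene4mVCF.

```python
from pathlib import Path


def write_gene_list(genes, path):
    """Write gene identifiers to a .txt file, one per line (assumed layout)."""
    path = Path(path)
    if path.suffix != ".txt":
        # The README requires a .txt extension for gene list files.
        raise ValueError("the gene list file needs a .txt extension")
    # Strip whitespace, drop empty entries, and remove duplicates in order.
    cleaned = (gene.strip() for gene in genes)
    unique = list(dict.fromkeys(gene for gene in cleaned if gene))
    path.write_text("\n".join(unique) + "\n")
    return path
```

For example, `write_gene_list(["EGFR", "ACTB"], "my-genes.txt")` produces a file that can be passed to `-g`.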

For more options and details, refer to the help message.

Example showing extraction of variants for the ACTB gene

[Image: input command]

[Image: output file in tab-separated format]
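
Since the output is tab-separated, it can be loaded with the standard library for further processing. This is a minimal sketch: `read_tsv` is a hypothetical helper, and the output file name and column layout are not specified in this README, so the rows are returned as plain lists without assuming a header.

```python
import csv


def read_tsv(path):
    """Read a tab-separated file into a list of rows (each a list of strings)."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))
```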

Support

For any issues or inquiries, please open an issue on the GitHub repository: https://github.com/vkulaganathan/gene4mVCF

Current version: 1.1.5