Python (version <=3.12) package for parsing genomics and transcriptomics VCF data.

Maintainers

everestial vuown

Project description

vcfparser

PyPI version

Python (version >=3.6) package for parsing genomics and transcriptomics VCF data.

Free software: MIT license
Documentation: https://vcfparser.readthedocs.io.

Features

No external dependencies other than Python (version >=3.6).
Minimalistic by design.
Provides a broad set of features to API users.
Cython compilation is supported to optimize performance.

Installation

Method A:

VCFsimplify (https://github.com/everestial/VCF-Simplify) uses the vcfparser API, so the package is readily available if VCFsimplify is already installed.

This method is preferred only when developing or optimizing VcfSimplify together with vcfparser.

Navigate to the VCFsimplify directory, start the Python interpreter, and import the 'vcfparser' package.

    C:\Users\>cd VCF-Simplify
    C:\Users\VCF-Simplify>dir
      Volume in drive C is StorageDrive
      Volume Serial Number is .........

      Directory of C:\Users\VCF-Simplify

      07/12/2020  10:14 AM    <DIR>          .
      07/12/2020  10:14 AM    <DIR>          ..
      07/12/2020  08:55 AM    <DIR>          .github
      ............................
      ............................
      07/12/2020  10:42 AM    <DIR>          vcfparser
      07/12/2020  08:55 AM             1,494 VcfSimplify.py
              11 File(s)     20,873,992 bytes
              13 Dir(s)  241,211,793,408 bytes free

    C:\Users\VCF-Simplify>python
    Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 22:39:24) [MSC v.1916 (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from vcfparser import VcfParser
    >>>

Method B (preferred method):

Pip is the preferred way to install vcfparser when developing custom Python scripts or apps against its API.

    $ pip install vcfparser
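
After installing, it can be useful to confirm which version is visible to the interpreter. The following is a minimal sketch using the standard library's `importlib.metadata` (available from Python 3.8, so newer than the minimum version listed above); the helper name `installed_version` is hypothetical and not part of vcfparser.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed vcfparser version, or None when the package is missing.
print(installed_version("vcfparser"))
```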

Method C:

For an offline install, or to build from the source code, follow the "advanced install" section of the documentation.

Cythonize (optional but helpful)

The installed "vcfparser" package can be cythonized to optimize performance. Cythonizing the package can increase the speed of the parser by about x.x - y.y (?) times.

TODO: Bhuwan - add required cython method in here

Usage

    from vcfparser import VcfParser
    vcf_obj = VcfParser('input_test.vcf')

Get metadata information from the vcf file

    metainfo = vcf_obj.parse_metadata()
    metainfo.fileformat
    # Output: 'VCFv4.2'

    metainfo.filters
    # Output: [{'ID': 'LowQual', 'Description': 'Low quality'}, {'ID': 'my_indel_filter', 'Description': 'QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0'}, {'ID': 'my_snp_filter', 'Description': 'QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0'}]

    metainfo.alt_
    # Output: [{'ID': 'NON_REF', 'Description': 'Represents any possible alternative allele at this location'}]

    metainfo.sample_names
    # Output: ['ms01e', 'ms02g', 'ms03g', 'ms04h', 'MA611', 'MA605', 'MA622']

    metainfo.record_keys
    # Output: ['CHROM', 'POS', 'ID', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO', 'FORMAT', 'ms01e', 'ms02g', 'ms03g', 'ms04h', 'MA611', 'MA605', 'MA622']
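
The metadata values shown above are plain Python lists and dicts, so they can be reshaped with ordinary code. Here is a minimal sketch, assuming `metainfo.filters` has the list-of-dicts shape printed above. The helper name `index_by_id` is hypothetical and not part of vcfparser, and the literal list stands in for the parsed value so the snippet runs on its own.

```python
def index_by_id(entries):
    """Map each metadata entry's 'ID' to its 'Description' (hypothetical helper)."""
    return {entry["ID"]: entry.get("Description", "") for entry in entries}


# Stand-in for metainfo.filters, copied from the output shown above.
filters = [
    {"ID": "LowQual", "Description": "Low quality"},
    {"ID": "my_indel_filter",
     "Description": "QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0"},
]

filter_descriptions = index_by_id(filters)
print(filter_descriptions["LowQual"])
```

The same helper should work for `metainfo.alt_`, since its entries carry the same 'ID' and 'Description' keys in the output above.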

Get Records from the vcf file

    records = vcf_obj.parse_records()
    # Note: Records are returned as a generator.

    first_record = next(records)
    first_record.CHROM
    # Output: '2'

    first_record.POS
    # Output: '15881018'

    first_record.REF
    # Output: 'G'

    first_record.ALT
    # Output: 'A,C'

    first_record.QUAL
    # Output: '5082.45'

    first_record.FILTER
    # Output: ['PASS']

    first_record.get_mapped_samples()
    # Output: {'ms01e': {'GT': './.', 'PI': '.', 'GQ': '.', 'PG': './.', 'PM': '.', 'PW': './.', 'AD': '0,0', 'PL': '0,0,0,.,.,.', 'DP': '0', 'PB': '.', 'PC': '.'},
    #           'ms02g': {'GT': './.', 'PI': '.', 'GQ': '.', 'PG': './.', 'PM': '.', 'PW': './.', 'AD': '0,0', 'PL': '0,0,0,.,.,.', 'DP': '0', 'PB': '.', 'PC': '.'},
    #           'ms03g': {'GT': './.', 'PI': '.', 'GQ': '.', 'PG': './.', 'PM': '.', 'PW': './.', 'AD': '0,0', 'PL': '0,0,0,.,.,.', 'DP': '0', 'PB': '.', 'PC': '.'},
    #           'ms04h': {'GT': '1/1', 'PI': '.', 'GQ': '6', 'PG': '1/1', 'PM': '.', 'PW': '1/1', 'AD': '0,2', 'PL': '49,6,0,.,.,.', 'DP': '2', 'PB': '.', 'PC': '.'},
    #           'MA611': {'GT': '0/0', 'PI': '.', 'GQ': '78', 'PG': '0/0', 'PM': '.', 'PW': '0/0', 'AD': '29,0,0', 'PL': '0,78,1170,78,1170,1170', 'DP': '29', 'PB': '.', 'PC': '.'},
    #           'MA605': {'GT': '0/0', 'PI': '.', 'GQ': '9', 'PG': '0/0', 'PM': '.', 'PW': '0/0', 'AD': '3,0,0', 'PL': '0,9,112,9,112,112', 'DP': '3', 'PB': '.', 'PC': '.'},
    #           'MA622': {'GT': '0/0', 'PI': '.', 'GQ': '99', 'PG': '0/0', 'PM': '.', 'PW': '0/0', 'AD': '40,0,0', 'PL': '0,105,1575,105,1575,1575', 'DP': '40', 'PB': '.', 'PC': '.\n'}}

TODO: Bhuwan (priority - high) The very last example "first_record.get_mapped_samples()" is returning the value of the last sample/key with "\n". i.e: 'PC': '.\n' Please fix that issue - strip('\n') in the line before parsing.
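
Until the trailing newline noted above is handled inside the parser, callers can clean the returned mapping themselves. This is a minimal workaround sketch, assuming the mapping has the sample -> tag -> string shape shown in the output above. The helper name `strip_sample_values` is hypothetical, and the small dict stands in for the value returned by `get_mapped_samples()`.

```python
def strip_sample_values(mapped_samples):
    """Return a copy of the mapping with trailing newlines removed from every value."""
    return {
        sample: {tag: value.rstrip("\n") for tag, value in fields.items()}
        for sample, fields in mapped_samples.items()
    }


# Stand-in for first_record.get_mapped_samples(), trimmed to a few tags.
mapped = {
    "MA605": {"GT": "0/0", "DP": "3", "PC": "."},
    "MA622": {"GT": "0/0", "DP": "40", "PC": ".\n"},
}

cleaned = strip_sample_values(mapped)
print(repr(cleaned["MA622"]["PC"]))
```

The helper builds new dicts rather than editing in place, so the original mapping is left untouched.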

Alternatively, we can loop over the records with a for-loop. Because records is a generator, the loop continues from wherever any earlier next() call left off:

    for record in records:
        chrom = record.CHROM
        pos = record.POS
        id_ = record.ID          # trailing underscore avoids shadowing the built-in id()
        ref = record.REF
        alt = record.ALT
        qual = record.QUAL
        filter_ = record.FILTER  # likewise avoids shadowing the built-in filter()
        format_ = record.format_
        infos = record.get_info_dict()
        mapped_sample = record.get_mapped_samples()
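
In the outputs shown earlier in this section, the record fields come back as strings ('15881018', '5082.45', 'A,C'), so numeric comparisons need an explicit conversion. The sketch below assumes that string representation. The helper name `to_typed` is hypothetical, and a `SimpleNamespace` stands in for a record yielded by `parse_records()` so the snippet runs without an input file.

```python
from types import SimpleNamespace


def to_typed(record):
    """Convert the string fields of a record into Python types (hypothetical helper)."""
    return {
        "chrom": record.CHROM,
        "pos": int(record.POS),
        # QUAL may be '.' (missing) in a VCF file; represent that as None.
        "qual": None if record.QUAL == "." else float(record.QUAL),
        # Multiple alternate alleles are comma-separated, e.g. 'A,C'.
        "alts": record.ALT.split(","),
        "passed": record.FILTER == ["PASS"],
    }


# Stand-in record built from the values shown in the example output.
record = SimpleNamespace(
    CHROM="2", POS="15881018", REF="G", ALT="A,C", QUAL="5082.45", FILTER=["PASS"]
)

typed = to_typed(record)
print(typed["pos"], typed["alts"], typed["passed"])
```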

For more specific use cases, see the tutorials in the documentation:
For a tutorial on metadata, see the Metadata Tutorial.
For a tutorial on the record parser, see the Record Parser Tutorial.

Release history

0.2.2: Sep 11, 2024 (this version)
0.2.1: Dec 4, 2019
0.1.15: Nov 27, 2019
0.1.13: Nov 26, 2019
0.1.12: Nov 26, 2019
0.1.11: Nov 26, 2019
0.1.9: Sep 23, 2019
0.1.8: Sep 23, 2019
0.1.7: Sep 22, 2019
0.1.6: Sep 21, 2019

Download files

Source Distribution

vcfparser-0.2.2.tar.gz (19.6 kB)

Uploaded Sep 11, 2024 Source

Built Distribution

vcfparser-0.2.2-py3-none-any.whl (17.8 kB)

Uploaded Sep 11, 2024 Python 3

File details

Details for the file vcfparser-0.2.2.tar.gz.

File metadata

Download URL: vcfparser-0.2.2.tar.gz
Upload date: Sep 11, 2024
Size: 19.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for vcfparser-0.2.2.tar.gz
Algorithm	Hash digest
SHA256	`476db6e7601675c94f5450dadf83dabc5e9b75062712ed72abfab85dd7c727e3`
MD5	`a921c81355660ee7e3fbc99034fb6eec`
BLAKE2b-256	`4bb83e2746566a07cb11ceec1015edbb8b353c0d8e4132e2742c6b7580027b90`
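
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library's `hashlib`. The helper names `sha256_of_file` and `matches_expected` are illustrative, and the file path in the usage note is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest for vcfparser-0.2.2.tar.gz, copied from the table above.
EXPECTED_SHA256 = "476db6e7601675c94f5450dadf83dabc5e9b75062712ed72abfab85dd7c727e3"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

With the source distribution saved in the current directory, `matches_expected("vcfparser-0.2.2.tar.gz")` should return True for an intact download. The wheel can be checked the same way by passing its own SHA256 digest as `expected`.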

File details

Details for the file vcfparser-0.2.2-py3-none-any.whl.

File metadata

Download URL: vcfparser-0.2.2-py3-none-any.whl
Upload date: Sep 11, 2024
Size: 17.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for vcfparser-0.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`81cfa15b41e8d7ebacc8745fea4b524a2cf721d4593c0ca87c8fd2e8a327a829`
MD5	`286c596fbca93167e05001d1262f66aa`
BLAKE2b-256	`9b259a25f4c345497f4d0029f84a128b994ace8e3ca4b6594fa4c2816a7560d7`
