
Takes a VCF file as input and returns the genomic sequence at the specified length (31 by default).


vcf2seq

Aim

Similar to seqtailor [PMID:31045209]: reads a VCF file and outputs a genomic sequence (default length: 31).

Unlike seqtailor, all sequences have the same length. In addition, an absence character (by default the dot, .) can stand in for missing nucleotides in indels.

  • When an insertion is larger than the --size parameter, only the first --size nucleotides are output.
  • Sequence headers are formatted as "_".
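
The fixed-length behaviour described above can be illustrated with a short sketch. This is not vcf2seq's own code: the function name fixed_length_pair, the centring of the variant in the window, and the padding of windows that run past the end of the chromosome are all assumptions made for illustration. It only shows how a blank character can keep the ref and alt sequences the same length, and how a long insertion is cut to the requested size.

```python
def fixed_length_pair(chrom_seq, pos, ref, alt, size=31, blank="."):
    """Return (ref_seq, alt_seq), each exactly `size` characters long.

    Hypothetical helper, not part of vcf2seq. `pos` is 1-based, as in VCF,
    and `blank` is assumed to be a single character.
    """
    start = pos - 1  # convert the 1-based VCF position to a 0-based index

    # Pad the shorter allele with the blank character so both alleles span
    # the same width, then cut to `size` (a long insertion is truncated).
    width = max(len(ref), len(alt))
    ref_core = ref.ljust(width, blank)[:size]
    alt_core = alt.ljust(width, blank)[:size]

    # Share the remaining room between the left and right flanks.
    flank = size - len(ref_core)
    left = flank // 2
    right = flank - left

    # Assumption: flanks that run off the chromosome are padded with blanks.
    left_seq = chrom_seq[max(0, start - left):start].rjust(left, blank)
    right_start = start + len(ref)
    right_seq = chrom_seq[right_start:right_start + right].ljust(right, blank)

    return left_seq + ref_core + right_seq, left_seq + alt_core + right_seq


toy = "ACGTACGTACGTACGTACGT"
# Deletion of "GT" after position 10: the alt sequence keeps its length.
print(fixed_length_pair(toy, 10, "CGT", "C", size=7))  # ('TACGTAC', 'TAC..AC')
# Insertion longer than size: only the first `size` nucleotides are kept.
print(fixed_length_pair(toy, 10, "C", "CAAAAAAAAA", size=7))  # ('C......', 'CAAAAAA')
```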

VCF format specifications: https://github.com/samtools/hts-specs/blob/master/VCFv4.4.pdf

Installation

pip install vcf2seq

Usage

usage: vcf2seq.py [-h] -g genome [-s SIZE] [-t {alt,ref,both}] [-b BLANK] [-a ADD_COLUMNS [ADD_COLUMNS ...]] [-o OUTPUT] [-f {fa,tsv}] [-v] vcf


positional arguments:
  vcf                   vcf file (mandatory)

options:
  -h, --help            show this help message and exit
  -g genome, --genome genome
                        genome as fasta file (mandatory)
  -s SIZE, --size SIZE  size of the output sequence (default: 31)
  -t {alt,ref,both}, --type {alt,ref,both}
                        alt, ref, or both output? (default: alt)
  -b BLANK, --blank BLANK
                        Missing nucleotide character, default is dot (.)
  -a ADD_COLUMNS [ADD_COLUMNS ...], --add-columns ADD_COLUMNS [ADD_COLUMNS ...]
                        Add one or more columns to header (ex: '-a 3 AA' will add columns 3 and 27). The first column is '1' (or 'A')
  -o OUTPUT, --output OUTPUT
                        Output file (default: <input_file>-vcf2seq.fa/tsv)
  -f {fa,tsv}, --output-format {fa,tsv}
                        Output file format (default: fa)
  -v, --version         show program's version number and exit
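
The --add-columns option accepts a column either as a number or as spreadsheet-style letters, so '3' means column 3 and 'AA' means column 27. The sketch below shows one way such a value could be turned into a 1-based column number. The function name column_index is hypothetical and this is not necessarily how vcf2seq parses the option.

```python
def column_index(spec: str) -> int:
    """Convert '3' or 'AA' into a 1-based column number.

    Hypothetical helper for illustration; not part of vcf2seq.
    """
    spec = spec.strip()
    if spec.isascii() and spec.isdigit():
        index = int(spec)
        if index < 1:
            raise ValueError("columns are numbered from 1")
        return index
    if spec.isascii() and spec.isalpha():
        # Letters behave like a base-26 number where A=1 ... Z=26.
        index = 0
        for letter in spec.upper():
            index = index * 26 + (ord(letter) - ord("A") + 1)
        return index
    raise ValueError(f"not a column number or column letters: {spec!r}")


print([column_index(s) for s in ["3", "AA", "A"]])  # [3, 27, 1]
```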

