
Like seqTailor: given a VCF file, it returns genomic sequences.

Project description

vcf2seq

Aim

Similar to seqtailor [PMID:31045209]: reads a VCF file and outputs genomic sequences (default length: 31).

Unlike seqtailor, all output sequences have the same length. In addition, an absence character (by default the dot, .) can be used to mark missing nucleotides in indels.

  • When an insertion is longer than the --size parameter, only the first --size nucleotides are output.
  • Sequence headers are formatted as "_".
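
The fixed-length behaviour described above can be illustrated with a minimal Python sketch. This is not the vcf2seq implementation: the function name alt_window, the centring of the variant in the window, and the choice to pad the shorter allele on its right are all assumptions made for illustration. Only the fixed output length, the blank character, and the truncation of long insertions come from the description above.

```python
def alt_window(genome: str, pos: int, ref: str, alt: str,
               size: int = 31, blank: str = ".") -> str:
    """Return a `size`-long sequence carrying `alt` at 1-based position `pos`.

    Hypothetical helper: the window layout is an assumption, not vcf2seq's.
    `blank` must be a single character.
    """
    start = pos - 1  # VCF positions are 1-based, Python indices are 0-based
    if genome[start:start + len(ref)] != ref:
        raise ValueError("REF allele does not match the genome at POS")

    # Pad the ALT allele with the blank character up to the longer allele,
    # so that a deletion still occupies as many columns as REF did.
    width = max(len(ref), len(alt))
    core = alt.ljust(width, blank)

    # An insertion longer than the window is truncated to its first `size` bases.
    if len(core) >= size:
        return core[:size]

    # Split the remaining room between the upstream and downstream flanks.
    flank = size - len(core)
    left = flank // 2
    right = flank - left
    upstream = genome[max(0, start - left):start]
    downstream = genome[start + len(ref):start + len(ref) + right]

    # Near the ends of a sequence the flanks may be short: pad them as well.
    return upstream.rjust(left, blank) + core + downstream.ljust(right, blank)


genome = "ACGTACGTACGTACGTACGT"
print(alt_window(genome, 10, "C", "T", size=5))         # SNV
print(alt_window(genome, 10, "CGT", "C", size=7))       # deletion
print(alt_window(genome, 10, "C", "CAAAAAAA", size=5))  # long insertion
```

Under these assumptions every call returns exactly `size` characters, whatever the variant type.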

VCF format specifications: https://github.com/samtools/hts-specs/blob/master/VCFv4.4.pdf

Installation

pip install vcf2seq

Usage

usage: vcf2seq.py [-h] -g genome [-s SIZE] [-t {alt,ref,both}] [-b BLANK] [-a ADD_COLUMNS [ADD_COLUMNS ...]] [-o OUTPUT] [-v] vcf


positional arguments:
  vcf                   vcf file (mandatory)

options:
  -h, --help            show this help message and exit
  -g genome, --genome genome
                        genome as fasta file (mandatory)
  -s SIZE, --size SIZE  size of the output sequence (default: 31)
  -t {alt,ref,both}, --type {alt,ref,both}
                        alt, ref, or both output? (default: alt)
  -b BLANK, --blank BLANK
                        Missing nucleotide character, default is dot (.)
  -a ADD_COLUMNS [ADD_COLUMNS ...], --add-columns ADD_COLUMNS [ADD_COLUMNS ...]
                        Add one or more columns to header (ex: '-a 3 AA' will add columns 3 and 27). The first column is '1' (or 'A')
  -o OUTPUT, --output OUTPUT
                        Output file (default: <input_file>-vcf2seq.fa)
  -v, --version         show program's version number and exit
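
The --add-columns option accepts either a number or spreadsheet-style letters, so '3' and 'AA' designate columns 3 and 27. A small Python sketch of that mapping is given here for illustration. The function name column_index is hypothetical and the code is not taken from vcf2seq; it assumes the letters follow the usual spreadsheet convention (A=1, Z=26, AA=27), which matches the example in the help text.

```python
def column_index(spec: str) -> int:
    """Convert a column designation ('3', 'A', 'AA', ...) to a 1-based index.

    Hypothetical helper assuming spreadsheet-style lettering.
    """
    spec = spec.strip()
    if not spec or not spec.isascii():
        raise ValueError(f"invalid column designation: {spec!r}")

    # Plain numbers are already 1-based column indices.
    if spec.isdigit():
        index = int(spec)
        if index < 1:
            raise ValueError("columns are numbered from 1")
        return index

    # Letters are read as a base-26 number with digits A=1 ... Z=26.
    if spec.isalpha():
        index = 0
        for letter in spec.upper():
            index = index * 26 + (ord(letter) - ord("A") + 1)
        return index

    raise ValueError(f"invalid column designation: {spec!r}")


print([column_index(s) for s in ["3", "AA"]])  # [3, 27]
```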
