Given a VCF file, vcf2seq returns genomic sequences of the specified length (31 by default).

Project description

vcf2seq

Aim

Similar to seqtailor [PMID:31045209]: reads a VCF file and outputs genomic sequences (default length: 31).

Unlike seqtailor, all output sequences have the same length. In addition, an absence character (by default the dot, .) can be used to mark missing nucleotides for indels.

  • When an insertion is larger than the --size parameter, only the first --size nucleotides are output.
  • Sequence headers are formatted as "_".
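The fixed-length behaviour described above can be illustrated with a minimal sketch. It is not the vcf2seq implementation: the function name alt_window, the centring of the allele in the window, and the choice to pad a deletion (and any window running past the sequence ends) with the blank character are all assumptions made for illustration.

```python
def alt_window(chrom_seq, pos, ref, alt, size=31, blank="."):
    """Return a `size`-long sequence carrying the ALT allele (illustrative only).

    chrom_seq: reference chromosome as a plain string
    pos:       1-based VCF POS of the first REF base
    """
    start = pos - 1  # 0-based index of the first REF base
    if chrom_seq[start:start + len(ref)] != ref:
        raise ValueError("REF allele does not match the reference sequence")

    # Assumption: a deletion is shown by padding ALT with the blank character
    # up to the length of REF.
    allele = alt.ljust(len(ref), blank)
    # An allele longer than the window keeps only its first `size` characters.
    allele = allele[:size]

    # Split the remaining room between the left and right flanks.
    flank = size - len(allele)
    left = flank // 2
    right = flank - left

    left_seq = chrom_seq[max(0, start - left):start]
    right_seq = chrom_seq[start + len(ref):start + len(ref) + right]

    # Assumption: positions beyond the sequence ends are filled with `blank`.
    return left_seq.rjust(left, blank) + allele + right_seq.ljust(right, blank)
```

Under these assumptions every returned string has exactly `size` characters, whether the variant is a substitution, a deletion, or an insertion longer than the window.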

VCF format specifications: https://github.com/samtools/hts-specs/blob/master/VCFv4.4.pdf
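As a reminder of the input layout, VCF data lines are tab-separated and begin with the fixed columns CHROM, POS, ID, REF and ALT, with multiple ALT alleles separated by commas. The sketch below reads these fields from an iterable of lines; the names Variant and iter_variants are hypothetical and do not come from vcf2seq.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int    # 1-based position, as in the VCF file
    ref: str
    alts: list  # one entry per ALT allele


def iter_variants(lines):
    """Yield a Variant for each data line, skipping headers and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # "##" meta lines and the "#CHROM" header line
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        yield Variant(chrom, int(pos), ref, alt.split(","))
```

The function accepts any iterable of strings, so an open file handle can be passed directly.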

Installation

pip install vcf2seq

Usage

usage: vcf2seq.py [-h] -g genome [-s SIZE] [-t {alt,ref,both}] [-b BLANK] [-a ADD_COLUMNS [ADD_COLUMNS ...]] [-o OUTPUT] [-v] vcf


positional arguments:
  vcf                   vcf file (mandatory)

options:
  -h, --help            show this help message and exit
  -g genome, --genome genome
                        genome as fasta file (mandatory)
  -s SIZE, --size SIZE  size of the output sequence (default: 31)
  -t {alt,ref,both}, --type {alt,ref,both}
                        alt, ref, or both output? (default: alt)
  -b BLANK, --blank BLANK
                        Missing nucleotide character, default is dot (.)
  -a ADD_COLUMNS [ADD_COLUMNS ...], --add-columns ADD_COLUMNS [ADD_COLUMNS ...]
                        Add one or more columns to header (ex: '-a 3 AA' will add columns 3 and 27). The first column is '1' (or 'A')
  -o OUTPUT, --output OUTPUT
                        Output file (default: <input_file>-vcf2seq.fa)
  -v, --version         show program's version number and exit
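The --add-columns option accepts either a number or spreadsheet-style letters, so that 'A' is column 1 and 'AA' is column 27. A minimal sketch of how such a specifier could be converted to a 1-based column number follows; the function name column_number is hypothetical, and this is not necessarily how vcf2seq parses the option.

```python
def column_number(spec):
    """Convert '3' or 'AA' to a 1-based column number (illustrative only)."""
    if spec.isascii() and spec.isdigit():
        number = int(spec)
        if number < 1:
            raise ValueError("column numbers start at 1")
        return number
    if spec.isascii() and spec.isalpha():
        # Letters form a bijective base-26 number: A=1 ... Z=26, AA=27, ...
        number = 0
        for char in spec.upper():
            number = number * 26 + (ord(char) - ord("A") + 1)
        return number
    raise ValueError(f"invalid column specifier: {spec!r}")
```

With this convention, the example from the help text ('-a 3 AA') maps to columns 3 and 27.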

Download files

Source Distribution

vcf2seq-0.6.1a0.tar.gz (19.4 kB)

Uploaded Source

Built Distribution

vcf2seq-0.6.1a0-py3-none-any.whl (20.3 kB)

Uploaded Python 3
