Given a VCF file, vcf2seq returns genomic sequences of a specified length (31 by default).
vcf2seq
Aim
Similar to seqtailor [PMID:31045209]: reads a VCF file and outputs a genomic sequence (default length: 31).
Unlike seqtailor, all sequences have the same length. Moreover, it is possible to have an absence character (the dot . by default) for indels.
- When an insertion is larger than the --size parameter, only the first --size nucleotides are output.
- Sequence headers are formatted as "_".
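The fixed-length behaviour described above can be illustrated with a minimal Python sketch. This is not the vcf2seq implementation: the function name `variant_window`, the centring rule, the choice to fill deleted bases with the blank character, and the padding at chromosome ends are all assumptions made for illustration. Only the default size (31), the default blank character (the dot) and the truncation of long insertions come from the description.

```python
def variant_window(reference, pos, ref, alt, size=31, blank="."):
    """Return a fixed-length ALT sequence around a variant (illustrative only).

    reference: chromosome sequence as a plain string
    pos:       1-based VCF POS of the first REF base
    """
    start = pos - 1  # 0-based index of the first REF base
    if reference[start:start + len(ref)] != ref:
        raise ValueError("REF allele does not match the reference sequence")

    # Insertion at least as long as the window: keep only the first `size` bases.
    if len(alt) >= size:
        return alt[:size]

    # Assumption: a deletion keeps the REF footprint, with missing bases
    # replaced by the blank character.
    core = alt + blank * max(0, len(ref) - len(alt))
    if len(core) >= size:
        return core[:size]

    # Split the remaining positions between the left and right flanks.
    flank = size - len(core)
    left = flank // 2
    right = flank - left
    left_seq = reference[max(0, start - left):start]
    right_seq = reference[start + len(ref):start + len(ref) + right]

    # Assumption: pad with the blank character near chromosome ends
    # so that every sequence has exactly `size` characters.
    return left_seq.rjust(left, blank) + core + right_seq.ljust(right, blank)


if __name__ == "__main__":
    chrom = "ACGTACGTACGTACGTACGT"  # toy reference, not real data
    print(variant_window(chrom, 10, "C", "T", size=5))    # substitution
    print(variant_window(chrom, 10, "CGT", "C", size=7))  # deletion
```

Whatever the variant type, the returned string always has `size` characters, which is the property the description emphasises.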
VCF format specifications: https://github.com/samtools/hts-specs/blob/master/VCFv4.4.pdf
Installation
pip install vcf2seq
Usage
usage: vcf2seq.py [-h] -g genome [-s SIZE] [-t {alt,ref,both}] [-b BLANK] [-a ADD_COLUMNS [ADD_COLUMNS ...]] [-o OUTPUT] [-v] vcf

positional arguments:
  vcf                   vcf file (mandatory)

options:
  -h, --help            show this help message and exit
  -g genome, --genome genome
                        genome as fasta file (mandatory)
  -s SIZE, --size SIZE  size of the output sequence (default: 31)
  -t {alt,ref,both}, --type {alt,ref,both}
                        alt, ref, or both output? (default: alt)
  -b BLANK, --blank BLANK
                        Missing nucleotide character, default is dot (.)
  -a ADD_COLUMNS [ADD_COLUMNS ...], --add-columns ADD_COLUMNS [ADD_COLUMNS ...]
                        Add one or more columns to header (ex: '-a 3 AA' will add columns 3 and 27). The first column is '1' (or 'A')
  -o OUTPUT, --output OUTPUT
                        Output file (default: <input_file>-vcf2seq.fa/tsv)
  -f {fa,tsv}, --output-format {fa,tsv}
                        Output file format (default: fa)
  -v, --version         show program's version number and exit
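As a rough illustration of the options above, the following Python sketch assembles a command line and shows how column labels such as 'AA' map to column numbers. The helper names `column_index` and `build_command`, the file names, and the executable name `vcf2seq` (the usage text shows `vcf2seq.py`) are assumptions; only the flags themselves come from the help text.

```python
import shlex


def column_index(label):
    """Convert a column label such as '3' or 'AA' to a 1-based column number."""
    if not label:
        raise ValueError("empty column label")
    if label.isdigit():
        return int(label)
    index = 0
    for char in label.upper():
        if not "A" <= char <= "Z":
            raise ValueError(f"invalid column label: {label!r}")
        # Spreadsheet-style letters: A=1 ... Z=26, AA=27, AB=28, ...
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index


def build_command(vcf, genome, size=31, seq_type="alt", blank=".",
                  add_columns=(), output=None, program="vcf2seq"):
    """Build an argument list using only the flags listed in the help text."""
    # The VCF path goes first: -a accepts several values, so a positional
    # argument placed right after it would be read as another column.
    cmd = [program, vcf, "-g", genome, "-s", str(size), "-t", seq_type, "-b", blank]
    if output:
        cmd += ["-o", output]
    if add_columns:
        cmd += ["-a", *add_columns]
    return cmd


if __name__ == "__main__":
    # Hypothetical input files, for illustration only.
    cmd = build_command("variants.vcf", "genome.fa", add_columns=["3", "AA"])
    print(shlex.join(cmd))
    print([column_index(c) for c in ["3", "AA"]])
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the package is installed and the input files exist.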