Convert SNP array to VCF
Array As VCF
array-as-vcf is a small library and tool to convert common SNP array formats to VCF format.
The following array formats are currently supported:
- Affymetrix (TSV export)
- Cytoscan HD Array (TSV export)
- Lumi 317k array (TSV export)
- Lumi 370k array (TSV export)
- Multi-sample OpenArray (TSV export)
Binary formats are not (yet) supported.
Requirements
- Python 3.6
- requests
CLI usage
The array-as-vcf tool will convert array files to VCF format.
It will auto-detect the type of array file, and throw an error if it can't
determine it.
The generated VCF file is printed to stdout.
A sample name to be used in the VCF file must be supplied.
The REF and ALT alleles will be queried from Ensembl if no --lookup-table is
supplied. This requires a working internet connection and can be quite slow
due to the number of HTTP requests that are necessary.
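To illustrate why the online lookup is slow, here is a minimal sketch of a per-rsID query against the public Ensembl REST API. It is not taken from the array-as-vcf source: the function names are hypothetical, the standard library's urllib is used in place of requests to keep the sketch dependency-free, and the assumption that the first allele in allele_string is the reference allele should be checked against the Ensembl documentation. The record passed to split_alleles below is hand-written, not a captured response.

```python
import json
import urllib.request
from typing import Dict, List, Tuple

# Ensembl runs a separate REST server for the GRCh37 assembly.
SERVERS = {
    "GRCh37": "https://grch37.rest.ensembl.org",
    "GRCh38": "https://rest.ensembl.org",
}


def fetch_variation(rsid: str, build: str) -> Dict:
    """Fetch one variation record (hypothetical helper; one HTTP request per rsID)."""
    url = f"{SERVERS[build]}/variation/human/{rsid}?content-type=application/json"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def split_alleles(record: Dict) -> Tuple[str, List[str]]:
    """Split an allele string such as "A/G" into (ref, [alts]).

    Assumption: the first allele listed is the reference allele.
    """
    alleles = record["mappings"][0]["allele_string"].split("/")
    return alleles[0], alleles[1:]


# Hand-written record with only the field this sketch reads.
example_record = {"name": "rs1000003", "mappings": [{"allele_string": "A/G"}]}
ref, alts = split_alleles(example_record)
print(ref, alts)
```

Because every rsID costs one round trip in a scheme like this, an array with hundreds of thousands of markers takes a long time, which is the motivation for the lookup table.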
When supplied with --lookup-table, no requests are made for the rsIDs
that exist within the lookup table. The lookup table is a JSON file
containing a single large object of the shape:
{
"rs0": "{ref_allele}:{alt_alleles}:{ref_is_minor_allele}"
}
E.g.
{
"rs1000003": "A:G:F"
}
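The format above can be read with a few lines of Python. This is a minimal sketch, not the library's own parser: LookupEntry, parse_entry and load_lookup_table are hypothetical names, and it assumes that multiple ALT alleles are comma-separated and that the third field is "T" or "F" (only "F" appears in the example above).

```python
import json
from typing import Dict, List, NamedTuple


class LookupEntry(NamedTuple):
    ref: str
    alts: List[str]
    ref_is_minor: bool


def parse_entry(value: str) -> LookupEntry:
    """Split a "ref:alts:flag" string into its three fields."""
    ref, alts, flag = value.split(":")
    # Assumptions: ALT alleles are comma-separated; the flag is "T" or "F".
    return LookupEntry(ref, alts.split(","), flag == "T")


def load_lookup_table(text: str) -> Dict[str, LookupEntry]:
    """Parse the JSON text of a lookup table into a dict keyed by rsID."""
    return {rsid: parse_entry(value) for rsid, value in json.loads(text).items()}


table = load_lookup_table('{"rs1000003": "A:G:F"}')
print(table["rs1000003"])
```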
If you have never run array-as-vcf before, you can run it without a lookup
table and use --dump to write the generated internal lookup table to a file
for subsequent runs.
Usage: array-as-vcf [OPTIONS]

Options:
  -p, --path PATH               Path to array file  [required]
  -b, --build [GRCh37|GRCh38]
  -s, --sample-name TEXT        Name of sample in VCF file
  -c, --chr-prefix TEXT         Optional prefix to chromosome names
  -l, --lookup-table PATH       Optional path to existing lookup table for
                                rsIDs.
  -d, --dump PATH               Optional path to write generated lookup table
  --help                        Show this message and exit.
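The two-pass workflow described earlier (dump the lookup table on the first run, reuse it afterwards) can be scripted using only the options listed above. The following is a sketch under assumptions: build_command is a hypothetical helper, and the file and sample names are placeholders.

```python
import shlex
from typing import List, Optional


def build_command(path: str, sample_name: str, build: str,
                  lookup_table: Optional[str] = None,
                  dump: Optional[str] = None) -> List[str]:
    """Assemble an array-as-vcf argument list (hypothetical helper)."""
    cmd = ["array-as-vcf", "--path", path, "--build", build,
           "--sample-name", sample_name]
    if lookup_table is not None:
        cmd += ["--lookup-table", lookup_table]
    if dump is not None:
        cmd += ["--dump", dump]
    return cmd


# First run: no lookup table yet, so dump the generated one for later.
first = build_command("sample1.tsv", "sample1", "GRCh37", dump="lookup.json")
# Later runs: reuse the dumped table to avoid repeated Ensembl requests.
later = build_command("sample2.tsv", "sample2", "GRCh37", lookup_table="lookup.json")

for cmd in (first, later):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Since the VCF is printed to stdout, running one of these lists with subprocess.run(cmd, stdout=fh, check=True), where fh is a file opened for writing, would save the output to a file.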