Load numpy arrays from a VCF (variant call file).
Installation
Installation requires numpy and cython:
$ pip install vcfnp
…or build in place from a clone of the repository:
$ git clone --recursive git://github.com/alimanfoo/vcfnp.git
$ cd vcfnp
$ python setup.py build_ext --inplace
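Because the build needs numpy and cython, it can help to confirm that both are importable before running either command. The following is a minimal sketch using only the standard library. The helper name `missing_prerequisites` is hypothetical and not part of vcfnp, and `Cython` is assumed to be the import name of the cython package.

```python
import importlib.util


def missing_prerequisites(module_names):
    """Return the names in module_names that cannot be located for import.

    Hypothetical helper for illustration; only top-level module names
    are expected here.
    """
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == '__main__':
    # 'Cython' is the assumed import name for the cython package.
    missing = missing_prerequisites(['numpy', 'Cython'])
    if missing:
        print('install these first: %s' % ', '.join(missing))
    else:
        print('numpy and cython are available')
```

An empty list means both prerequisites were found; it does not check their versions.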
Usage
import sys
import vcfnp
import numpy as np
import matplotlib.pyplot as plt
filename = '/path/to/my.vcf'
# load data from fixed fields (except INFO)
v = vcfnp.variants(filename).view(np.recarray)
# print some simple variant metrics
print('found %s variants (%s SNPs)' % (v.size, np.count_nonzero(v.is_snp)))
print('QUAL mean (std): %s (%s)' % (np.mean(v.QUAL), np.std(v.QUAL)))
# load data from INFO field
i = vcfnp.info(filename).view(np.recarray)
# plot a histogram of variant depth
fig = plt.figure(1)
ax = fig.add_subplot(111)
ax.hist(i.DP)
ax.set_title('DP histogram')
ax.set_xlabel('DP')
plt.show()
# load data from sample columns
c = vcfnp.calldata(filename).view(np.recarray)
c = vcfnp.view2d(c)
# print some simple genotype metrics
count_phased = np.count_nonzero(c.is_phased)
count_variant = np.count_nonzero(np.any(c.genotype > 0, axis=2))
count_missing = np.count_nonzero(~c.is_called)
print('calls (phased, variant, missing): %s (%s, %s, %s)'
      % (c.size, count_phased, count_variant, count_missing))
# plot a histogram of genotype quality
fig = plt.figure(2)
ax = fig.add_subplot(111)
ax.hist(c.GQ.flatten())
ax.set_title('GQ histogram')
ax.set_xlabel('GQ')
plt.show()
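The genotype metrics above rely on the array layout that `axis=2` implies: one row per variant, one column per sample, and a trailing axis over alleles. The following is a minimal sketch of the same expressions on a small synthetic array, so the counting logic can be checked without a VCF file. The dtype `call_dtype`, the field values, and the use of -1 for a missing allele are assumptions made for illustration; none of this is output from vcfnp.

```python
import numpy as np

# Assumed per-call record for diploid samples (illustrative only).
call_dtype = np.dtype([
    ('is_called', '?'),
    ('is_phased', '?'),
    ('genotype', 'i1', (2,)),
    ('GQ', 'u1'),
])

# 3 variants x 2 samples; each genotype holds 2 allele indices.
calls = np.zeros((3, 2), dtype=call_dtype)
calls['genotype'] = [
    [[0, 0], [0, 1]],    # hom ref, het
    [[1, 1], [-1, -1]],  # hom alt, missing (assumed to be coded as -1)
    [[0, 0], [0, 0]],    # hom ref, hom ref
]
calls['is_called'] = np.all(calls['genotype'] >= 0, axis=2)
calls['is_phased'] = [[False, True], [True, False], [False, False]]
calls['GQ'] = [[99, 40], [60, 0], [99, 99]]

c = calls.view(np.recarray)

# Same expressions as in the usage example.
count_phased = int(np.count_nonzero(c.is_phased))
# A call is "variant" if any of its alleles is non-reference (> 0).
count_variant = int(np.count_nonzero(np.any(c.genotype > 0, axis=2)))
count_missing = int(np.count_nonzero(~c.is_called))

print('calls (phased, variant, missing): %s (%s, %s, %s)'
      % (c.size, count_phased, count_variant, count_missing))
```

Reducing over `axis=2` collapses the allele axis, so the result has one boolean per call (variant by sample) rather than one per allele.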
Acknowledgments
Based on Erik Garrison’s vcflib.