pydbsnp
Interface with dbSNP VCF data
Installation
Step 0 (optional): If you don't want to bother with environment variables
and don't care about how pydbsnp works under the hood, skip this step.
If you wish, you can control where pydbsnp looks for its data using four
environment variables: PYDBSNP_VCF_GRCH37, PYDBSNP_RSID_GRCH37,
PYDBSNP_VCF_GRCH38, and PYDBSNP_RSID_GRCH38. The VCF variables determine the
location of the VCF data; the RSID variables determine the location of the
rsID indices. For example, you could add this to your .bash_profile:
export PYDBSNP_VCF_GRCH37=<path of your choice>
export PYDBSNP_RSID_GRCH37=<path of your choice>
export PYDBSNP_VCF_GRCH38=<path of your choice>
export PYDBSNP_RSID_GRCH38=<path of your choice>
If you set these variables before continuing to the next step, pydbsnp will
use them to determine where it places downloaded VCF files and RSID indices.
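Before moving on, it can be useful to confirm which of the four variables are actually visible to Python. The following is a minimal sketch that only reads the environment; the function name report_pydbsnp_env is hypothetical and is not part of pydbsnp, and the sketch makes no claim about how pydbsnp itself resolves these paths.

```python
import os

# The four variable names are taken from the text above.
PYDBSNP_ENV_VARS = (
    "PYDBSNP_VCF_GRCH37",
    "PYDBSNP_RSID_GRCH37",
    "PYDBSNP_VCF_GRCH38",
    "PYDBSNP_RSID_GRCH38",
)


def report_pydbsnp_env(environ=None):
    """Return a dict mapping each variable name to its value, or None if unset.

    Hypothetical helper for illustration; not part of pydbsnp.
    """
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in PYDBSNP_ENV_VARS}


if __name__ == "__main__":
    for name, value in report_pydbsnp_env().items():
        print(f"{name} = {value if value is not None else '(not set)'}")
```

Any variable reported as not set has not been exported in the current shell session.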
Step 1: Install the Python package via pip:
pip install pydbsnp
or
pip install --user pydbsnp
Step 2: Once the Python package is installed, download and index the dbSNP VCF data:
pydbsnp-download
pydbsnp-index
For hg19/GRCh37 coordinates:
pydbsnp-download --reference-build GRCh37
pydbsnp-index
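If you prefer to script this step, for example in a provisioning job, the two commands can be driven from Python. This is a sketch only: setup_commands and run_setup are hypothetical names, and the only commands and flag used are the ones shown above.

```python
import subprocess


def setup_commands(reference_build=None):
    """Build the download and index commands shown above as argument lists.

    Hypothetical helper; it only assembles the documented commands.
    """
    download = ["pydbsnp-download"]
    if reference_build is not None:
        download += ["--reference-build", reference_build]
    return [download, ["pydbsnp-index"]]


def run_setup(reference_build=None):
    """Run each command in order, raising CalledProcessError on the first failure."""
    for command in setup_commands(reference_build):
        subprocess.run(command, check=True)
```

Calling run_setup("GRCh37") would execute the same two commands as the hg19/GRCh37 example, provided the pydbsnp scripts are on your PATH.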
Command line usage
pydbsnp-query -h
pydbsnp-query rs231361
pydbsnp-query chr8:118184783
pydbsnp-query --reference-build GRCh37 rs231361
pydbsnp-query rs231361 chr8:118184783 rs7903146
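The examples above use two kinds of query: an rsID such as rs231361 and a position written as chromosome:position. The sketch below shows one way to tell the two apart before passing them on. It is not pydbsnp's own parser, classify_query is a hypothetical name, and the accepted patterns are assumptions based only on the examples shown here.

```python
import re

# Patterns inferred from the example queries above (an assumption,
# not pydbsnp's actual parsing rules).
RSID_PATTERN = re.compile(r"^rs\d+$")
COORD_PATTERN = re.compile(r"^(chr[0-9A-Za-z]+):(\d+)$")


def classify_query(text):
    """Return ("rsid", text) or ("coordinates", (chrom, pos)) for a query string."""
    if RSID_PATTERN.match(text):
        return ("rsid", text)
    match = COORD_PATTERN.match(text)
    if match:
        return ("coordinates", (match.group(1), int(match.group(2))))
    raise ValueError(f"unrecognized query: {text!r}")


if __name__ == "__main__":
    for query in ["rs231361", "chr8:118184783", "rs7903146"]:
        print(query, "->", classify_query(query))
```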
API
Two classes are provided: Variant and GeneralizedVariant.
An object of the Variant class has an attribute for each relevant field
of the VCF.
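from pydbsnp import Variant
# Look up a variant by rsID using the default reference build
v = Variant(id='rs8056814')
print(v.chrom, v.pos, v.id, v.ref, v.alt)
print(v.info)
# Look up the same rsID in GRCh37 coordinates
w = Variant(id='rs8056814', reference_build='GRCh37')
print(w.chrom, w.pos)
# Look up a variant by chromosome and position
x = Variant('chr16', 75218429)
print(x)
# Show the full class documentation
help(Variant)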
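To make the attribute names above concrete, here is a minimal sketch of how one tab-separated VCF data line maps onto chrom, pos, id, ref, alt and info. It follows the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) and is not pydbsnp's implementation. VcfRecord and parse_vcf_line are hypothetical names, the sample line is made up, and how Variant itself represents alt and info may differ.

```python
from typing import NamedTuple


class VcfRecord(NamedTuple):
    """Illustrative container; not the pydbsnp Variant class."""
    chrom: str
    pos: int
    id: str
    ref: str
    alt: tuple
    info: dict


def parse_vcf_line(line):
    """Parse one tab-separated VCF data line into a VcfRecord."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, id_, ref, alt = fields[:5]
    info_text = fields[7]  # columns 6 and 7 (QUAL, FILTER) are skipped
    info = {}
    if info_text != ".":
        for item in info_text.split(";"):
            key, sep, value = item.partition("=")
            # INFO flags have no "=value" part; store them as True
            info[key] = value if sep else True
    return VcfRecord(chrom, int(pos), id_, ref, tuple(alt.split(",")), info)


if __name__ == "__main__":
    # Made-up line for illustration only, not real dbSNP data
    sample = "chr1\t12345\trs0000001\tA\tG,T\t.\t.\tRS=1;VC=SNV;COMMON"
    print(parse_vcf_line(sample))
```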
An object of the GeneralizedVariant class is similar, but each attribute
is a tuple which may have multiple items. For example, one RSID may map
to two sets of coordinates.
from pydbsnp import GeneralizedVariant
gv = GeneralizedVariant(id='rs8056814')
print(gv.chrom, gv.pos, gv.id, gv.ref, gv.alt)
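The idea of tuple-valued attributes can be illustrated without pydbsnp. The sketch below groups plain (chrom, pos, id, ref, alt) records by ID so that each field becomes a tuple with one item per matching record. The generalize function and the toy records are hypothetical and only mirror the behaviour described above; they do not show how GeneralizedVariant is implemented.

```python
from collections import defaultdict


def generalize(records):
    """Group (chrom, pos, id, ref, alt) records by id.

    Returns a dict keyed by id; each value maps a field name to a tuple
    holding that field from every record sharing the id.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record[2]].append(record)
    result = {}
    for rsid, rows in grouped.items():
        chroms, positions, ids, refs, alts = zip(*rows)
        result[rsid] = {
            "chrom": chroms,
            "pos": positions,
            "id": ids,
            "ref": refs,
            "alt": alts,
        }
    return result


if __name__ == "__main__":
    # Toy records with placeholder names, not real dbSNP data
    toy = [
        ("chrA", 100, "rsX", "A", "G"),
        ("chrB", 250, "rsX", "A", "G"),
        ("chrA", 300, "rsY", "C", "T"),
    ]
    print(generalize(toy)["rsX"]["pos"])
```

In the toy data, rsX appears at two positions, so its pos field holds two items, while rsY yields one-item tuples.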