pydbsnp

Interface with dbSNP VCF data

Installation

Step 0 (optional): If you don't want to bother with environment variables and don't care about how pydbsnp works under the hood, skip this step.

If you wish, you can control where pydbsnp looks for its data using four environment variables: PYDBSNP_VCF_GRCH37, PYDBSNP_RSID_GRCH37, PYDBSNP_VCF_GRCH38, and PYDBSNP_RSID_GRCH38. The VCF variables set the location of the VCF data, and the RSID variables set the location of the RSID indices. For example, you could add this to your .bash_profile:

export PYDBSNP_VCF_GRCH37=<path of your choice>
export PYDBSNP_RSID_GRCH37=<path of your choice>
export PYDBSNP_VCF_GRCH38=<path of your choice>
export PYDBSNP_RSID_GRCH38=<path of your choice>

If you set these variables before continuing to the next step, pydbsnp will use them to determine where it places downloaded VCF files and RSID indices.
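To check which of these variables are set before running the download step, a minimal Python sketch follows. The helper name `pydbsnp_paths` is hypothetical and not part of pydbsnp; it only reads the four variable names listed above and reports unset ones as `None`.

```python
import os

# The four environment variable names documented above.
PYDBSNP_VARIABLES = (
    "PYDBSNP_VCF_GRCH37",
    "PYDBSNP_RSID_GRCH37",
    "PYDBSNP_VCF_GRCH38",
    "PYDBSNP_RSID_GRCH38",
)


def pydbsnp_paths(environ=None):
    """Return a dict mapping each variable name to its value, or None if unset.

    Hypothetical helper for illustration; pydbsnp itself does not provide it.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in PYDBSNP_VARIABLES}


if __name__ == "__main__":
    for name, value in pydbsnp_paths().items():
        print(f"{name} = {value if value is not None else '(not set)'}")
```

Passing a plain dict as `environ` makes the helper easy to try without touching the real environment.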

Step 1: Install the Python package with pip:

pip install pydbsnp

or

pip install --user pydbsnp

Step 2: Once the Python package is installed, download and index the dbSNP VCF data:

pydbsnp-download
pydbsnp-index

For hg19/GRCh37 coordinates:

pydbsnp-download --reference-build GRCh37
pydbsnp-index

Command line usage

pydbsnp-query -h
pydbsnp-query rs231361
pydbsnp-query chr8:118184783
pydbsnp-query --reference-build GRCh37 rs231361
pydbsnp-query rs231361 chr8:118184783 rs7903146
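The same queries can be driven from a script by calling the command-line tool as a subprocess. The sketch below is one way to do that; `build_query_command` and `run_query` are hypothetical helper names, and the only option used is `--reference-build`, as shown in the examples above. It assumes `pydbsnp-query` is on your PATH and writes its results to standard output.

```python
import subprocess


def build_query_command(queries, reference_build=None):
    """Build the argument list for pydbsnp-query (hypothetical helper)."""
    if not queries:
        raise ValueError("at least one rsID or chrom:pos query is required")
    command = ["pydbsnp-query"]
    if reference_build is not None:
        # Same option placement as in the command-line examples above.
        command += ["--reference-build", reference_build]
    command += list(queries)
    return command


def run_query(queries, reference_build=None):
    """Run pydbsnp-query and return whatever it prints to standard output.

    Assumes the tool is installed and the data has been downloaded and indexed.
    """
    result = subprocess.run(
        build_query_command(queries, reference_build),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Example (requires the installed tool and indexed data):
# print(run_query(["rs231361", "chr8:118184783"], reference_build="GRCh37"))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed.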

API

Two classes are provided: Variant and GeneralizedVariant.

An object of the Variant class has an attribute for each relevant field of the VCF.

from pydbsnp import Variant
v = Variant(id='rs8056814')
print(v.chrom, v.pos, v.id, v.ref, v.alt)
print(v.info)
w = Variant(id='rs8056814', reference_build='GRCh37')
print(w.chrom, w.pos)
x = Variant('chr16', 75218429)
print(x)
help(Variant)
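The command-line tool takes positions written as `chr8:118184783`, while the example above passes the chromosome and position to `Variant` as two separate arguments. Whether `Variant` also accepts the combined string is not stated here, so the sketch below splits it first. `parse_position` is a hypothetical helper, not part of pydbsnp.

```python
def parse_position(text):
    """Split a 'chrom:pos' string into (chrom, pos) with pos as an int.

    Hypothetical helper for illustration; raises ValueError on other input,
    for example an rsID such as 'rs8056814'.
    """
    chrom, sep, pos = text.strip().partition(":")
    if not sep or not chrom or not pos.isdigit():
        raise ValueError(f"expected 'chrom:pos', got {text!r}")
    return chrom, int(pos)


# Example (requires the installed package and indexed data):
# from pydbsnp import Variant
# x = Variant(*parse_position("chr16:75218429"))
# print(x)
```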

An object of the GeneralizedVariant class is similar, but each attribute is a tuple which may have multiple items. For example, one RSID may map to two sets of coordinates.

from pydbsnp import GeneralizedVariant
gv = GeneralizedVariant(id='rs8056814')
print(gv.chrom, gv.pos, gv.id, gv.ref, gv.alt)
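Because each attribute is a tuple, it can be convenient to walk through the mappings one at a time. The sketch below assumes the tuples have equal length and are aligned by index (item 0 of `chrom` goes with item 0 of `pos`, and so on); that alignment is an assumption, not something this document confirms. To stay runnable without downloaded data, it uses a stand-in object with placeholder values rather than a real `GeneralizedVariant`.

```python
from collections import namedtuple

# Stand-in with the same attribute names as the example above, so the sketch
# runs without dbSNP data. It is not the real pydbsnp class.
FakeGeneralizedVariant = namedtuple(
    "FakeGeneralizedVariant", "chrom pos id ref alt"
)


def iter_records(gv):
    """Return an iterator of (chrom, pos, id, ref, alt), one per mapping.

    Assumes the attribute tuples are the same length and index-aligned.
    """
    return zip(gv.chrom, gv.pos, gv.id, gv.ref, gv.alt)


# Placeholder values for illustration only, not real dbSNP records.
gv = FakeGeneralizedVariant(
    chrom=("chrA", "chrB"),
    pos=(100, 200),
    id=("rsX", "rsX"),
    ref=("A", "A"),
    alt=("G", "G"),
)

for chrom, pos, rsid, ref, alt in iter_records(gv):
    print(chrom, pos, rsid, ref, alt)
```

With a real object, `iter_records(GeneralizedVariant(id='rs8056814'))` would be used the same way, provided the alignment assumption holds.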

