# HGVS variant description extractor

Unambiguous sequence variant descriptions are important in reporting the outcome of clinical diagnostic DNA tests. The standard nomenclature of the Human Genome Variation Society (HGVS) describes the observed variant sequence relative to a given reference sequence. We propose an efficient algorithm for the extraction of HGVS descriptions from two sequences with three main requirements in mind: minimizing the length of the resulting descriptions, minimizing the computation time, and keeping the unambiguous descriptions biologically meaningful.

The algorithm can compute HGVS descriptions of complete chromosomes or other large DNA strings in a reasonable amount of computation time, and the resulting descriptions are relatively small. Additional applications include updating gene variant database contents and reference sequence liftovers.

    >>> from extractor import describe_dna
    >>> print(describe_dna('TAACAATGGAAC', 'TAAACAATTGAA'))
    [3dup;8G>T;12del]
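To see how such a description relates the two sequences, the following minimal sketch applies the variants back onto the reference and recovers the observed sequence. It is an illustration only: `apply_simple_variants` is a hypothetical helper that is not part of this package, and it assumes a small subset of HGVS (single-position duplications, deletions, and substitutions on plain 1-based positions).

```python
import re

# Hypothetical helper, NOT part of the extractor package. It handles only
# single-position variants of the form "3dup", "12del" and "8G>T".
_VARIANT = re.compile(r'^(\d+)(?:(dup)|(del)|([ACGT])>([ACGT]))$')


def apply_simple_variants(reference, description):
    """Apply an allele description such as '[3dup;8G>T;12del]' to reference."""
    variants = []
    for raw in description.strip('[]').split(';'):
        match = _VARIANT.match(raw)
        if match is None:
            raise ValueError('unsupported variant: %r' % raw)
        variants.append(match)

    sequence = list(reference)
    # Work from the highest position down so that an edit never shifts the
    # reference coordinates of the variants still to be applied.
    for match in sorted(variants, key=lambda m: int(m.group(1)), reverse=True):
        index = int(match.group(1)) - 1  # positions in the description are 1-based
        if match.group(2):               # duplication: repeat the base in place
            sequence.insert(index + 1, sequence[index])
        elif match.group(3):             # deletion: drop the base
            del sequence[index]
        else:                            # substitution: check, then replace
            if sequence[index] != match.group(4):
                raise ValueError('reference mismatch at %s' % match.group(1))
            sequence[index] = match.group(5)
    return ''.join(sequence)


observed = apply_simple_variants('TAACAATGGAAC', '[3dup;8G>T;12del]')
print(observed)
```

Applying the variants from right to left is a design choice of this sketch: every position in the description refers to the reference sequence, so edits at higher positions must not be disturbed by insertions or deletions at lower ones.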

## Implementation

The core algorithm is implemented in C++, with a Python wrapper providing a developer-friendly interface.

## Installation

### Python package

You need [SWIG](http://www.swig.org/) installed. Then:

    pip install description-extractor

### C++ library only

Run `make`.

Optionally set the `__debug__` flag to trace the algorithm.

For direct use within a C/C++ environment, just `#include "extractor.h"` and add `extractor.cc` to your project's source files.

## Testing

There are some unit tests for the Python interface. After installing the Python package, run them using [pytest](http://pytest.org/):

    pip install pytest
    python setup.py develop
    py.test

Alternatively, use [tox](https://tox.readthedocs.org/) to automatically run the tests on all supported versions of Python:

    pip install tox
    tox
