Skip to main content

Transform human mtDNA sequence to variant sites and vice versa.

Project description

oldowan.mtconvert is a small, pure Python, bioinformatic utility to (1)
transform human mitochondral DNA sequence data into variant sites relative to
the revised Cambridge Reference Sequence (rCRS) and (2) transform variant sites
data into DNA sequence. Further information on the rCRS and variant site
nomenclature for human mtDNA sequences is available at the mtconvert_ website.

Installation Instructions

This package is pure Python and has no dependencies outside of the standard
library. The easist way to install is using ``easy_install`` from the
setuptools_ package. This usually goes something like this::

$ easy_install oldowan.mtconvert

or on a unix-like system, assuming you are installing to the main Python
``site-packages`` directory as a non-privileged user, this::

$ sudo easy_install oldowan.mtconvert

You may also use the standard python distutils setup method. Download the
current source archive from the file list towards the bottom of this page,
unarchive it, and install. On Mac OS X and many other unix-like systems, having
downloaded the archive and changed to the directory containing this archive in
your shell, this might go something like::

$ tar xvzf oldowan.mtconvert*
$ cd oldowan.mtconvert*
$ python install

Quick Start

Import ``seq2sites`` and ``sites2seq`` from oldowan.mtconvert::

>>> from oldowan.mtconvert import seq2sites, sites2seq

Convert sequence to sites::

>>> seq2sites(seq)

Sequences must be contiguous! Separate runs of sequence, such as HVR1 and HVR2
without the intervening sequence interval, must be analyzed separately.

There is also a cutoff on the number of ambigous sites (N) allowed in the
sequence. By default, this is 10 - but this is an option that can be set::

>>> seq2sites(seq, ambig_cutoff=20)

Convert a list of variable sites to sequence. The default sequence region that
is returned is hypervariable region 1 (HVR1), which is positions 16024 to 16365
of the rCRS (in biological one-based numbering)::

>>> sites2seq('16129A 16223T')

Predefined sequence regions are:

- HVR1: 16024-16365
- HVR2: 73-340
- HVR1to2: 16024-340
- coding: 577-15992
- all: 1-16559

So, to convert a list of HVR2 sites to sequence::

>>> sites2seq('73G', region='HVR2')

Sites may also be provided in a list::

>>> sites2seq(['16129A', '16223T', '73G'], region='HVR1to2')

The rCRS sequence will be returned given an empty string, empty list, or the
string 'rCRS'. All of the following are equivalent::

>>> sites2seq('')
>>> sites2seq([])
>>> sites2seq('rCRS')

Arbitrary positions may be selected by passing a list of sites to the
``region`` option::

>>> sites2seq('', region=[1,2,3])

The Python range function is convenient for this, but you must remember that
the range does not include its ending position::

>>> sites2seq('', region=range(73,341)) # include 340, but not 341

Release History ===============

1.0.0 (March 25, 2009)
initial release of module.

1.0.1 (March 25, 2009)
minor versioning fix

1.0.2 (May 27, 2009)
partial RFLP implementation

1.0.3 (June 22, 2015)
add fix for spurious deletions at end of query

1.0.4 (June 22, 2015)
improve fix for spurious deletions at end of query

1.0.5 (June 22, 2015)
sites outside requested region should pass silently; fix for insertions

1.0.6 (August 4, 2015)
fix version number install problem

.. _setuptools:

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for oldowan.mtconvert, version 1.0.6
Filename, size File type Python version Upload date Hashes
Filename, size oldowan.mtconvert-1.0.6.tar.gz (34.6 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page