Skip to main content

Human reference mtdna sequence in Python formats.

Project description

oldowan.mtdna contains the human revised Cambridge Reference Sequence (rCRS) in Python friendly formats.

Installation Instructions

This package is pure Python and has no dependencies outside of the standard library. The easist way to install is using easy_install from the setuptools package. This usually goes something like this:

$ easy_install oldowan.mtdna

or on a unix-like system, assuming you are installing to the main Python site-packages directory as a non-privileged user, this:

$ sudo easy_install oldowan.mtdna

You may also use the standard python distutils setup method. Download the current source archive from the file list towards the bottom of this page, unarchive it, and install. On Mac OS X and many other unix-like systems, having downloaded the archive and changed to the directory containing this archive in your shell, this might go something like:

$ tar xvzf oldowan.mtdna*
$ cd oldowan.mtdna*
$ python setup.py install

Quick Start

This package provides the human rCRS in several Python friendly formats.

>>> from oldowan.mtdna import rCRS
>>> from oldowan.mtdna import rCRSlist
>>> from oldowan.mtdna import rCRSplus
>>> from oldowan.mtdna import rCRSplus_positions

rCRS is the raw sequence as a string:

>>> len(rCRS)
16569
>>> rCRS[0:10]
'GATCACAGGT'

rCRSlist is the rCRS sequence broken into a list. NOTE that this list is padded with a nonsense character in position 0 such that the indices match the standard biologial sequence position numbering (i.e. starting at 1 rather than 0):

>>> rCRSlist[0]
'#'
>>> rCRSlist[1]
'G'

rCRSplus is a partially repeat of the rCRS sequence. Because the mtDNA molecule is circular, it cannot be properly represented by a linear string. This might not be that big of an issue if experimentally derived mtDNA sequences generally do not span the break, BUT as it happens, the most commonly sequenced region of the mitochondrial genome (the control region) spans the point where the circular sequence is conventionally broken for the linear sequence. Thus, rCRSplus repeats 1000bp on either end of the molecule to facilitate sequence alignment across the break. The rCRSplus_positions list remaps the indices of rCRSplus onto the conventional position numbering of the rCRS.

>>> len(rCRSplus)
18638
>>> rCRSplus[1]
'A'
>>> rCRSlist[1]
'G'
>>> rCRSlist[rCRSplus_positions[1]]
'A'

Finally, the oldowan.mtdna package provides indices to extract common mtDNA regions from the rCRS.

>>> from oldowan.mtdna import HVR1_indices
>>> from oldowan.mtdna import HVR2_indices
>>> from oldowan.mtdna import HVR1and2_indices
>>> from oldowan.mtdna import HVR1to2_indices
>>> from oldowan.mtdna import coding_indices
>>> from oldowan.mtdna import all_indices
>>> hvr1 = ''.join(list(rCRSlist[x] for x in HVR1_indices))

Release History

1.0.0 (March 25, 2009)

initial release of module.

1.0.1 (March 25, 2009)

minor version reporting fix

1.0.3 (August 4, 2015)

actually fixed the version number issue this time

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

oldowan.mtdna-1.0.3.tar.gz (9.4 kB view details)

Uploaded Source

File details

Details for the file oldowan.mtdna-1.0.3.tar.gz.

File metadata

  • Download URL: oldowan.mtdna-1.0.3.tar.gz
  • Upload date:
  • Size: 9.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for oldowan.mtdna-1.0.3.tar.gz
Algorithm Hash digest
SHA256 48e56dd4aa542a4f20caf4596c28a5c13735d8daf4526620a9fb1f2684e06c16
MD5 b65d629a221b418ee347ab3fc40d0620
BLAKE2b-256 0c8fdd2c4f6927102414e1d4b5e1e3adeb3cc83d6c0ed79435d46158241dbc48

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page