A fast 23andMe raw genome file parser

Project description

arv — a fast 23andMe parser for Python

Arv (Norwegian; “inheritance” or “heritage”) is a Python module for parsing raw 23andMe genome files. It lets you look up SNPs by RSID.

from arv import load, unphased_match as match

# Parse the raw 23andMe export into a lookup table keyed by RSID.
genome = load("genome.txt")

# Keyword names must match the {placeholders} in the format string.
print("You are a {gender} with {color} eyes and {complexion} skin.".format(
  gender     = "man" if genome.y_chromosome else "woman",
  complexion = "light" if genome["rs1426654"] == "AA" else "dark",
  color      = match(genome["rs12913832"], {"AA": "brown",
                                            "AG": "brown or green",
                                            "GG": "blue"})))

In my case, this little program produces:

You are a man with blue eyes and light skin.
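The example above matches a genotype against a table of two-letter keys. Unphased genotypes carry no allele order, so "AG" and "GA" describe the same call. The sketch below shows one way such an order-insensitive lookup could work on plain strings. The name match_unordered is hypothetical, and this is an assumption about the idea behind unphased_match, not arv's actual implementation.

```python
def match_unordered(genotype, table, default=None):
    """Look up a genotype string in table, ignoring allele order.

    Hypothetical helper for illustration; not part of arv.
    """
    # Sort the alleles so that "GA" and "AG" map to the same key.
    key = "".join(sorted(genotype))
    normalized = {"".join(sorted(k)): v for k, v in table.items()}
    return normalized.get(key, default)


eye_colors = {"AA": "brown", "AG": "brown or green", "GG": "blue"}
print(match_unordered("GA", eye_colors))  # prints "brown or green"
```

Genotypes missing from the table fall back to default rather than raising, which may or may not be what you want for a real report.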

It’s insanely fast: on a 2013 Xeon machine, a 24 MB file is fully parsed and put into a hash table in less than 70 ms. Its guts are written in finely tuned C++ and exposed to Python via Cython.
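To make "parsed and put into a hash table" concrete, here is a minimal pure-Python sketch of the same idea. It assumes the usual 23andMe raw layout: comment lines starting with "#", then tab-separated rsid, chromosome, position and genotype columns. The function name parse_raw_genome and the sample rows (including their positions) are made up for illustration; arv's C++ parser is a separate implementation and will be far faster than this.

```python
import io


def parse_raw_genome(lines):
    """Parse 23andMe-style lines into {rsid: (chromosome, position, genotype)}."""
    snps = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the commented header block.
        if not line or line.startswith("#"):
            continue
        rsid, chromosome, position, genotype = line.split("\t")
        snps[rsid] = (chromosome, int(position), genotype)
    return snps


# Illustrative sample data only; positions are placeholders.
sample = io.StringIO(
    u"# rsid\tchromosome\tposition\tgenotype\n"
    u"rs1426654\t15\t100\tAA\n"
    u"rs12913832\t15\t200\tGG\n"
)
genome = parse_raw_genome(sample)
print(genome["rs12913832"][2])  # prints "GG"
```

A Python dict is itself a hash table, so lookups by RSID take constant time on average once the file has been read.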

Status

> This project is currently just a work in progress! I intend to wrap dna-traits under a new name, using Cython to interface with dna-traits much more easily. For a working (but old) version, see https://github.com/cslarsen/dna-traits

Installation

The recommended way is to install from PyPI.

$ pip install arv

NOTE: Installing from PyPI with pip is not yet available!

In the meantime, you can install from a source checkout:

$ python setup.py install

License

Copyright 2014, 2016, 2017 Christian Stigen Larsen. Distributed under the GNU GPL v3 or later.

See the file COPYING for the full license text. This software makes use of open source software; see LICENSES for details.

This code is largely based on the project dna-traits, by the same author.

Download files

Source Distribution

arv-0.1.tar.gz (239.8 kB)

Uploaded Source

Built Distribution

arv-0.1-cp27-cp27m-macosx_10_12_x86_64.whl (42.9 kB)

Uploaded CPython 2.7m macOS 10.12+ x86-64

File details

Details for the file arv-0.1.tar.gz.

File metadata

  • Download URL: arv-0.1.tar.gz
  • Size: 239.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for arv-0.1.tar.gz
Algorithm Hash digest
SHA256 3c24d882dad79a0690f69f7bde6e09808616a88d688d0b167bddf2443b0392d7
MD5 55ca1f640a0d64a08241d57c8cd548ff
BLAKE2b-256 28bf8f39399e129e77e88084cd6553ed5b02132404fa5aa7953ec357cd53824c

File details

Details for the file arv-0.1-cp27-cp27m-macosx_10_12_x86_64.whl.

File hashes

Hashes for arv-0.1-cp27-cp27m-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 52a3ce894f552d52dd990f3b75cb880de8cf20291a44547cc245c378e003d229
MD5 97a492d99763e28a85342b9e3cbef8b6
BLAKE2b-256 79cbd0a432e6bee84946f2a99c6e1c18757519d9a29a8a93e5cec5a4c6ec0b13
