A fast 23andMe raw genome file parser

Project description

arv — a fast 23andMe parser for Python

Arv (Norwegian for “inheritance” or “heritage”) is a Python module for parsing raw 23andMe genome files. It lets you look up SNPs by their RSIDs.

from arv import load, unphased_match as match

genome = load("genome.txt")

print("You are a {gender} with {eyecolor} eyes and {complexion} skin.".format(
  gender     = "man" if genome.y_chromosome else "woman",
  complexion = "light" if genome["rs1426654"] == "AA" else "dark",
  eyecolor   = match(genome["rs12913832"], {"AA": "brown",
                                            "AG": "brown or green",
                                            "GG": "blue"})))

In my case, this little program produces:

You are a man with blue eyes and light skin.
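To illustrate what an "unphased" match means in the example above, here is a minimal sketch in plain Python. It assumes genotypes are two-letter strings; the helper name `unphased_lookup` is hypothetical and is not part of arv, whose own `unphased_match` may behave differently.

```python
def unphased_lookup(genotype, table):
    """Return the table entry for a two-allele genotype, ignoring allele order.

    Hypothetical helper for illustration only; not arv's implementation.
    """
    if genotype in table:
        return table[genotype]
    # Unphased data does not say which allele came from which parent,
    # so "GA" and "AG" describe the same genotype.
    return table.get(genotype[::-1])


eye_colors = {"AA": "brown", "AG": "brown or green", "GG": "blue"}
print(unphased_lookup("GA", eye_colors))  # brown or green
```

Trying the reversed string means the lookup table needs only one spelling of each heterozygous genotype.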

It’s insanely fast: on a 2013 Xeon machine, a 24 MB file is fully parsed and put into a hash table in less than 70 ms. Its guts are written in finely tuned C++ and exposed to Python via Cython.
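For readers unfamiliar with the file format, the following pure-Python sketch shows the general idea of parsing a raw 23andMe file into a hash table keyed by RSID. It is not arv's C++ implementation and will be far slower. The function name `parse_23andme` and the sample rows are illustrative assumptions. The sketch relies only on the raw-file layout of `#`-prefixed comment lines followed by tab-separated rsid, chromosome, position and genotype columns.

```python
import io


def parse_23andme(lines):
    """Build a dict mapping RSID -> (chromosome, position, genotype).

    Illustrative sketch only; arv does this work in C++.
    """
    snps = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header comments.
        if not line or line.startswith("#"):
            continue
        rsid, chromosome, position, genotype = line.split("\t")
        snps[rsid] = (chromosome, int(position), genotype)
    return snps


# Illustrative sample data standing in for a real genome file.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs1426654\t15\t48426484\tAA\n"
    "rs12913832\t15\t28365618\tGG\n"
)
genome = parse_23andme(sample)
print(genome["rs12913832"][2])  # GG
```

A real file can be passed in the same way, for example `parse_23andme(open("genome.txt"))`, since the function only needs an iterable of lines.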

Status

> This project is currently just a work in progress! I intend to wrap dna-traits under a new name, using Cython to interface with dna-traits much more easily. For a working (but old) version, see https://github.com/cslarsen/dna-traits

Installation

The recommended way is to install from PyPI.

$ pip install arv

NOTE: Installation from PyPI with pip is not yet available!

In the meantime, you can do

$ python setup.py install

License

Copyright 2014, 2016, 2017 Christian Stigen Larsen

Distributed under the GNU GPL v3 or later.

See the file COPYING for the full license text. This software makes use of open source software; see LICENSES for details.

This code is largely based on the project dna-traits, by the same author.


Download files

Source Distribution

arv-0.1.tar.gz (239.8 kB)

Uploaded Source

Built Distribution

arv-0.1-cp27-cp27m-macosx_10_12_x86_64.whl (42.9 kB)

Uploaded CPython 2.7m macOS 10.12+ x86-64
