A fast 23andMe raw genome file parser

arv — a fast 23andMe parser for Python

Arv (Norwegian; “heritage” or “inheritance”) is a Python module for parsing raw 23andMe genome files. It lets you look up SNPs by their RSIDs.

from arv import load, unphased_match as match

genome = load("genome.txt")

print("You are a {gender} with {color} eyes and {complexion} skin.".format(
  gender     = "man" if genome.y_chromosome else "woman",
  complexion = "light" if genome["rs1426654"] == "AA" else "dark",
  color      = match(genome["rs12913832"], {"AA": "brown",
                                            "AG": "brown or green",
                                            "GG": "blue"})))

For my genome, this little program produces:

You are a man with blue eyes and light skin.
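The eye color lookup relies on matching a genotype without regard to allele order, so "AG" and "GA" are treated as the same thing. The sketch below illustrates that idea only. It assumes genotypes are plain two-letter strings, and `unphased_lookup` is a hypothetical helper written for this example, not arv's own `unphased_match` implementation.

```python
def unphased_lookup(genotype, table):
    """Look up a genotype string in table, ignoring allele order."""
    # Sort the alleles so that "GA" and "AG" normalize to the same key.
    key = "".join(sorted(genotype))
    normalized = {"".join(sorted(k)): v for k, v in table.items()}
    # Returns None when the genotype is not in the table.
    return normalized.get(key)


eye_colors = {"AA": "brown", "AG": "brown or green", "GG": "blue"}

print(unphased_lookup("GA", eye_colors))  # prints "brown or green"
```

Normalizing both the query and the table keys means the table can be written in whichever allele order is convenient.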

The parser is insanely fast, having been written in finely tuned C++. A Xeon machine I’ve tested on parses a 24 MB file into a hash table in 70 ms.
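To give a rough idea of what parsing such a file into a hash table involves, here is a plain-Python sketch. It assumes the usual 23andMe raw layout: comment lines starting with `#`, then tab-separated columns for RSID, chromosome, position and genotype. The function `parse_23andme` and the sample rows are illustrative assumptions; this is not arv's C++ parser and will be far slower on a real file.

```python
def parse_23andme(lines):
    """Parse 23andMe-style lines into a dict keyed by RSID."""
    snps = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the commented header block.
        if not line or line.startswith("#"):
            continue
        rsid, chromosome, position, genotype = line.split("\t")
        snps[rsid] = (chromosome, int(position), genotype)
    return snps


# Illustrative sample rows, not taken from a real genome file.
sample = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs1426654\t15\t48426484\tAA\n",
    "rs12913832\t15\t28365618\tGG\n",
]

genome = parse_23andme(sample)
print(genome["rs12913832"][2])  # prints "GG"
```

With a real file you would pass an open file object instead of the list, since both yield lines when iterated.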

Works with Python 2 and 3.

Installation

The recommended way is to install from PyPI.

$ pip install arv

Note that the pip install builds from source. You’ll need not only Cython, but also a C++11 capable compiler. I might distribute binary wheels in time. If the installation doesn’t work for you, please file a GitHub issue with as much detail as you can.

License

Copyright 2017 Christian Stigen Larsen

Distributed under the GNU GPL v3 or later.

See the file COPYING for the full license text. This software makes use of open source software; see LICENSES for details.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arv-0.2.tar.gz (239.3 kB)

Uploaded Source

Built Distribution

arv-0.2-cp27-cp27m-macosx_10_12_x86_64.whl (42.9 kB)

Uploaded CPython 2.7m macOS 10.12+ x86-64
