A fast 23andMe raw genome file parser

arv — a fast 23andMe parser for Python

Arv (Norwegian; “heritage” or “inheritance”) is a Python module for parsing raw 23andMe genome files. It lets you look up SNPs by their RSIDs.

from arv import load, unphased_match as match

genome = load("genome.txt")

print("You are a {gender} with {color} eyes and {complexion} skin.".format(
  gender     = "man" if genome.y_chromosome else "woman",
  complexion = "light" if genome["rs1426654"] == "AA" else "dark",
  color      = match(genome["rs12913832"], {"AA": "brown",
                                            "AG": "brown or green",
                                            "GG": "blue"})))

For my genome, this little program produces:

You are a man with blue eyes and light skin.
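The name unphased_match suggests that the lookup ignores the order of the two alleles, so that "AG" and "GA" hit the same entry. The following is a minimal pure-Python sketch of that idea only. The helper unphased_match_sketch is hypothetical and is not arv's implementation; the real function may behave differently, for example on missing genotypes.

```python
def unphased_match_sketch(genotype, table):
    """Look up a genotype in `table`, ignoring allele order.

    Hypothetical helper for illustration; not part of arv.
    Returns None when no entry matches.
    """
    # Sorting the nucleotides gives an order-independent key: "GA" -> "AG".
    key = "".join(sorted(str(genotype)))
    normalized = {"".join(sorted(k)): v for k, v in table.items()}
    return normalized.get(key)


eye_colors = {"AA": "brown", "AG": "brown or green", "GG": "blue"}
print(unphased_match_sketch("GA", eye_colors))  # brown or green
```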

The parser is insanely fast, having been written in finely tuned C++. A Xeon machine I’ve tested on parses a 24 MB file into a hash table in 70 ms.
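To make "parses a file into a hash table" concrete, here is a rough pure-Python sketch of the same idea. It assumes the usual 23andMe raw-data layout: comment lines starting with #, then tab-separated rsid, chromosome, position and genotype columns. The function parse_23andme and the sample rows are made up for illustration; this is not how arv's C++ parser is written, and it will be far slower.

```python
def parse_23andme(lines):
    """Build {rsid: (chromosome, position, genotype)} from raw-data lines.

    Illustrative only; arv does this work in C++.
    """
    snps = {}
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and the commented header block.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # ignore malformed rows in this sketch
        rsid, chromosome, position, genotype = fields
        snps[rsid] = (chromosome, int(position), genotype)
    return snps


# Placeholder rows, not real genome data.
sample = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs0000001\t1\t1000\tAA\n",
    "rs0000002\tX\t2000\tAG\n",
]
snps = parse_23andme(sample)
print(snps["rs0000002"])
```

A Python dict is itself a hash table, so lookups by RSID take constant time on average once the file has been read.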

Works with Python 2 and 3.

Installation

The recommended way is to install from PyPI.

$ pip install arv

Note that the pip install builds from source. You’ll need not only Cython, but also a C++11-capable compiler. I might distribute binary wheels in time. If the installation doesn’t work for you, please file a GitHub issue with as much detail as you can.

License

Copyright 2017 Christian Stigen Larsen

Distributed under the GNU GPL v3 or later.

See the file COPYING for the full license text. This software makes use of open source software; see LICENSES for details.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arv-0.2.tar.gz (239.3 kB)

Uploaded Source

Built Distribution

arv-0.2-cp27-cp27m-macosx_10_12_x86_64.whl (42.9 kB)

Uploaded CPython 2.7m macOS 10.12+ x86-64

File details

Details for the file arv-0.2.tar.gz.

File metadata

  • Download URL: arv-0.2.tar.gz
  • Upload date:
  • Size: 239.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for arv-0.2.tar.gz
Algorithm Hash digest
SHA256 0b849c4d1edf13695ca6bd6228f8c2710f66f04375f7c783b565184b34cd0b25
MD5 f3cb3fe5b48e4f185b2e6f010861d8d5
BLAKE2b-256 c1485c084fc21937e4cde5752d10c4b4e123e6a24df37c4acf8cce4dabf00258
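
If you download the source distribution by hand, you can compare its SHA256 digest with the value listed above. The following is a small sketch using only the standard hashlib module. The helper names sha256_of_file and matches_published_hash are made up for illustration, and the file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digest listed above for arv-0.2.tar.gz.
EXPECTED_SHA256 = "0b849c4d1edf13695ca6bd6228f8c2710f66f04375f7c783b565184b34cd0b25"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of_file(path) == expected
```

After downloading, matches_published_hash("arv-0.2.tar.gz") should return True for an intact file.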

File details

Details for the file arv-0.2-cp27-cp27m-macosx_10_12_x86_64.whl.

File hashes

Hashes for arv-0.2-cp27-cp27m-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 43eaa8c3ffda637c2c45185e2be009dc547a4d86df7d62e786b02150015b2d36
MD5 df1f91e712143e29d40b077b87695df7
BLAKE2b-256 d1d1be5f1eb8ad00dbf7128d7c698fe2c3d3aac06329d6d2564fa33fac4d81cb
