Parser and writer of IGD files for genetic data

pyigd

PyIGD is a Python-only parser for the Indexable Genotype Data (IGD) format. See our short paper (or the preprint), which describes the format and some of its advantages.

For tools to manipulate IGD files (and convert VCF to IGD), use igdtools (pip install igdtools).

For a C++ library that supports creating and parsing IGD, see picovcf (which supports VCF to IGD conversion).

Installation

You can install the latest release of PyIGD from PyPI via pip install pyigd.

For development, you can clone the code and install it directly from the directory (this will automatically reflect any code changes you make):

pip install -e pyigd/

or build and install via the wheel:

cd pyigd/ && python setup.py bdist_wheel
pip install --force-reinstall dist/*.whl

Usage

The pyigd.IGDReader class reads IGD data from a buffer. See the example script, which loads an IGD file, prints some metadata, and then iterates over the genotype data for all variants. The general usage pattern is:

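import pyigd

with open(filename, "rb") as f:  # IGD is a binary format, so open in "rb" mode
    igd_reader = pyigd.IGDReader(f)  # use the reader while the file is still open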
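
As a minimal sketch of what iterating over the genotype data might look like, the helper below counts the samples that carry each variant. It assumes the reader exposes a num_variants attribute and that get_samples(i) returns a (position, is_missing, sample_list) tuple. Neither detail is spelled out in this README, so check the example script for the exact names. carrier_counts is a hypothetical helper and not part of pyigd.

```python
from typing import Any, List, Tuple


def carrier_counts(reader: Any) -> List[Tuple[int, int]]:
    """Return (position, number of carrier samples) for each non-missing row.

    Assumption: `reader` behaves like pyigd.IGDReader, with a `num_variants`
    attribute and `get_samples(i)` returning (position, is_missing, sample_list).
    """
    counts = []
    for i in range(reader.num_variants):
        position, is_missing, sample_list = reader.get_samples(i)
        if is_missing:
            # Assumed meaning: this row lists samples with missing data,
            # not carriers of the alternate allele, so skip it.
            continue
        counts.append((position, len(sample_list)))
    return counts
```

With a real file, this would be called as carrier_counts(igd_reader) inside the with block, while the file is still open.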

There is also the pyigd.IGDWriter class for constructing IGD files. Related is pyigd.IGDTransformer, which creates a copy of an IGD file while modifying its contents. See the IGDTransformer sample list example and bitvector example.

IGD can be highly performant for a few reasons:

  1. It stores sparse data sparsely. Low-frequency variants are stored as sample lists. Medium/high frequency variants are stored as bit vectors.
  2. It is indexable: you can jump directly to the data for the ith variant. Because the index is stored in its own section of the file, scanning it is very fast, so looking only at variants in a particular range of the genome is cheap. In that case you would use pyigd.IGDFile.get_position_and_flags() to find the first variant index within the range, and then call pyigd.IGDFile.get_samples() for each variant from that index onward.
  3. The genotype data is stored in one of two very simple binary formats. This makes parsing fast, and the compact nature of the file makes reading from disk/memory fast as well.
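
The range lookup described in point 2 can be sketched as a binary search over the index followed by a forward scan. This is an illustration under stated assumptions, not the library's own implementation: it assumes variants are stored in ascending position order, that the reader has a num_variants attribute, that get_position_and_flags(i) returns a tuple whose first item is the position, and that get_samples(i) returns (position, is_missing, sample_list). The function names first_index_at_or_after and samples_in_range are hypothetical.

```python
from typing import Any, Iterator, List, Tuple


def first_index_at_or_after(reader: Any, position: int) -> int:
    """Binary-search for the first variant index whose position is >= `position`.

    Assumptions: variants are sorted by position, `reader.num_variants` exists,
    and `get_position_and_flags(i)` returns a tuple starting with the position.
    """
    lo, hi = 0, reader.num_variants
    while lo < hi:
        mid = (lo + hi) // 2
        if reader.get_position_and_flags(mid)[0] < position:
            lo = mid + 1
        else:
            hi = mid
    return lo


def samples_in_range(
    reader: Any, start: int, end: int
) -> Iterator[Tuple[int, bool, List[int]]]:
    """Yield get_samples() results for variants with start <= position < end."""
    i = first_index_at_or_after(reader, start)
    while i < reader.num_variants:
        # Check the position first, so genotype data is only read for hits.
        if reader.get_position_and_flags(i)[0] >= end:
            break
        yield reader.get_samples(i)
        i += 1
```

The search needs only a logarithmic number of position lookups to find the start of the range, and genotype data is read only for variants that fall inside it.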

How do I use IGD in my project?

  • Install igdtools. The easiest way is via pip install igdtools.
    • igdtools can convert from VCF to IGD, among other things (such as filtering IGD files).
  • Do one of the following:
