Python interface for tabix

April 16, 2014

This module allows fast random access to files compressed with bgzip and indexed by tabix. It includes a C extension with code from klib. The bgzip and tabix programs are not part of this module and must be obtained separately.

Installation

pip install --user pytabix

Synopsis

Genomics data is often stored as a table in which each row corresponds to a genomic region (start, end) or a single position:

chrom  pos      snp
1      1000760  rs75316104
1      1000894  rs114006445
1      1000910  rs79750022
1      1001177  rs4970401
1      1001256  rs78650406
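
Before this module can query a table like the one above, the file has to be tab-delimited, sorted by sequence name and position, compressed with bgzip, and indexed with tabix. The sketch below only builds the two command lines; it does not run them. The file name snps.txt and the helper build_index_commands are hypothetical, and the column flags (-s, -b, -e, -S) are the usual tabix options, so verify them against the help output of your installed version.

```python
def build_index_commands(path, seq_col=1, begin_col=2, end_col=2, skip_lines=1):
    """Return argv lists for bgzip and tabix. Nothing is executed here.

    Columns are 1-based. For a position table such as chrom/pos/snp,
    the begin and end columns are the same.
    """
    # bgzip replaces `path` with `path + ".gz"`.
    compress = ["bgzip", path]
    index = [
        "tabix",
        "-s", str(seq_col),     # column holding the sequence name
        "-b", str(begin_col),   # column holding the start position
        "-e", str(end_col),     # column holding the end position
        "-S", str(skip_lines),  # header lines to skip
        path + ".gz",
    ]
    return compress, index


# Hypothetical file name, used only for illustration.
compress_cmd, index_cmd = build_index_commands("snps.txt")
print(" ".join(compress_cmd))
print(" ".join(index_cmd))
```

Each list can be passed to subprocess.check_call once bgzip and tabix are on the PATH.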

With tabix, you can quickly retrieve all rows in a genomic region by specifying a query with a sequence name, start, and end:

import tabix

# Open a remote or local file.
url = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20100804/"
url += "ALL.2of4intersection.20100804.genotypes.vcf.gz"

tb = tabix.open(url)

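# These queries are identical. A query returns an iterator over the results.

# By sequence name, start, and end.
records = tb.query("1", 1000000, 1250000)

# By sequence index, start, and end (here index 0 is sequence "1").
records = tb.queryi(0, 1000000, 1250000)

# By region string.
records = tb.querys("1:1000000-1250000")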

# Each record is a list of strings.
for record in records:
    print(record[:5])
    break
# Output: ['1', '1000071', '.', 'C', 'T']
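
Because each record is a list of strings, numeric fields such as the position need converting before use. The following is a minimal sketch, assuming the file is a VCF whose first five columns are CHROM, POS, ID, REF, and ALT. The Variant type and the to_variants helper are hypothetical names, not part of pytabix, and a plain list stands in for the iterator returned by a query.

```python
from collections import namedtuple

# Hypothetical container for the first five fixed VCF columns.
Variant = namedtuple("Variant", ["chrom", "pos", "id", "ref", "alt"])


def to_variants(records):
    """Yield a Variant for each record, converting POS to an int."""
    for record in records:
        chrom, pos, variant_id, ref, alt = record[:5]
        yield Variant(chrom, int(pos), variant_id, ref, alt)


# Stand-in for the iterator returned by tb.query(...),
# using the record shown in the synopsis.
sample_records = [["1", "1000071", ".", "C", "T"]]
variants = list(to_variants(sample_records))
print(variants[0])
```

Since to_variants is a generator, it can wrap a query result directly without loading every record into memory.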


Download files

Source Distribution

pytabix-0.0.2.tar.gz (46.8 kB)

File metadata

  • Download URL: pytabix-0.0.2.tar.gz
  • Upload date:
  • Size: 46.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for pytabix-0.0.2.tar.gz
Algorithm    Hash digest
SHA256       7cdefa37f77e59c1ddea3c402e00cb352e5be2aa7fef41feb49ed930a86ace5f
MD5          c558be1d55a8b72c92668837305054e2
BLAKE2b-256  c537e30f6b04237801d072938ca55fde9c3773cc981043152a9ccc16a028f321
