Python interface for tabix
April 16, 2014
This module allows fast random access to files compressed with bgzip and indexed by tabix. It includes a C extension with code from klib. The bgzip and tabix programs are distributed with HTSlib.
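Files written by bgzip use the BGZF layout: a series of gzip blocks whose header carries an extra "BC" subfield, which is what makes block-level random access possible. As a rough illustration, the sketch below checks whether a file starts with such a block. The function name `looks_like_bgzf` is hypothetical and not part of pytabix; the check inspects only the first 16 bytes and assumes the "BC" subfield comes first in the extra field, which is the common case but is not guaranteed.

```python
import struct


def looks_like_bgzf(path):
    """Return True if the file at `path` starts with a BGZF-style gzip header.

    Hypothetical helper, not part of pytabix. It only inspects the first
    block header and assumes the 'BC' subfield is the first extra subfield.
    """
    with open(path, "rb") as handle:
        header = handle.read(16)
    if len(header) < 16:
        return False
    # Layout: ID1 ID2 CM FLG | MTIME (4 bytes, skipped) | XFL OS | XLEN | SI1 SI2 | SLEN
    id1, id2, cm, flg, _xfl, _os, _xlen, subfield, slen = struct.unpack(
        "<BBBB4xBBH2sH", header
    )
    return (
        id1 == 0x1F
        and id2 == 0x8B
        and cm == 8            # deflate
        and bool(flg & 4)      # FEXTRA flag set
        and subfield == b"BC"  # BGZF block-size subfield
        and slen == 2
    )
```

A plain gzip file normally fails this check because it has no extra field, which is one quick way to notice that a file was compressed with gzip rather than bgzip.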
Installation
pip install --user pytabix
Synopsis
Genomics data is often in a table where each row corresponds to a genomic region (start, end) or a position:
chrom  pos      snp
1      1000760  rs75316104
1      1000894  rs114006445
1      1000910  rs79750022
1      1001177  rs4970401
1      1001256  rs78650406
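Without an index, finding the rows in a region means scanning the whole table. The sketch below shows that baseline in plain Python, using the rows from the table above. The helper name `rows_in_region` is hypothetical, and treating both bounds as inclusive is an assumption made for illustration only; it is not a statement about the coordinate convention tabix uses.

```python
# Rows from the example table as (chrom, pos, snp) tuples.
ROWS = [
    ("1", 1000760, "rs75316104"),
    ("1", 1000894, "rs114006445"),
    ("1", 1000910, "rs79750022"),
    ("1", 1001177, "rs4970401"),
    ("1", 1001256, "rs78650406"),
]


def rows_in_region(rows, chrom, start, end):
    """Linear scan: keep rows on `chrom` with start <= pos <= end.

    Hypothetical helper; inclusive bounds are an assumption of this sketch.
    """
    return [row for row in rows if row[0] == chrom and start <= row[1] <= end]


if __name__ == "__main__":
    for row in rows_in_region(ROWS, "1", 1000800, 1001000):
        print(row)
```

This scan touches every row, so its cost grows with the size of the file. An index lets the lookup skip directly to the relevant part of the file instead.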
With tabix, you can quickly retrieve all rows in a genomic region by specifying a query with a sequence name, start, and end:
import tabix
# Open a remote or local file.
url = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20100804/"
url += "ALL.2of4intersection.20100804.genotypes.vcf.gz"
tb = tabix.open(url)
# These queries are identical. A query returns an iterator over the results.
records = tb.query("1", 1000000, 1250000)
records = tb.queryi(0, 1000000, 1250000)
records = tb.querys("1:1000000-1250000")
# Each record is a list of strings.
for record in records:
    print(record[:5])
    break
['1', '1000071', '.', 'C', 'T']
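Because every field comes back as a string, numeric columns need converting before use. The sketch below turns the first five fields of a VCF record (CHROM, POS, ID, REF, ALT) into a named tuple with an integer position. `Variant` and `to_variant` are hypothetical names introduced here and are not part of pytabix; the column meanings assume the queried file is a VCF, as in the example above.

```python
from collections import namedtuple

# Hypothetical container for the first five VCF columns; not part of pytabix.
Variant = namedtuple("Variant", ["chrom", "pos", "id", "ref", "alt"])


def to_variant(record):
    """Convert a record (list of strings) into a Variant with an int position."""
    chrom, pos, variant_id, ref, alt = record[:5]
    return Variant(chrom, int(pos), variant_id, ref, alt)


if __name__ == "__main__":
    # The record shown in the example output above.
    variant = to_variant(["1", "1000071", ".", "C", "T"])
    print(variant.chrom, variant.pos, variant.ref, variant.alt)
```

With a query result in hand, the same helper could be applied lazily, for example `variants = (to_variant(r) for r in records)`, so that rows are converted only as they are consumed.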
Download files
Source Distribution
pytabix-0.0.2.tar.gz (46.8 kB)
File details
Details for the file pytabix-0.0.2.tar.gz.
File metadata
- Download URL: pytabix-0.0.2.tar.gz
- Size: 46.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7cdefa37f77e59c1ddea3c402e00cb352e5be2aa7fef41feb49ed930a86ace5f |
| MD5 | c558be1d55a8b72c92668837305054e2 |
| BLAKE2b-256 | c537e30f6b04237801d072938ca55fde9c3773cc981043152a9ccc16a028f321 |