Python interface for tabix
April 16, 2014
This module allows fast random access to files compressed with bgzip and indexed by tabix. It includes a C extension with code from klib. The bgzip and tabix programs themselves are distributed separately, as part of HTSlib.
Installation
pip install --user pytabix
Synopsis
Genomic data is often stored as a table in which each row corresponds to a genomic region (start, end) or a single position:
chrom  pos      snp
1      1000760  rs75316104
1      1000894  rs114006445
1      1000910  rs79750022
1      1001177  rs4970401
1      1001256  rs78650406
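To make the idea of a region query concrete, here is a minimal sketch that answers one by scanning every row of the table above. It does not use pytabix. The function name `scan_region`, the tab-separated layout, and the inclusive bounds are assumptions made for illustration only.

```python
import csv
import io

# The example table, written as tab-separated text (layout assumed here).
TABLE = (
    "chrom\tpos\tsnp\n"
    "1\t1000760\trs75316104\n"
    "1\t1000894\trs114006445\n"
    "1\t1000910\trs79750022\n"
    "1\t1001177\trs4970401\n"
    "1\t1001256\trs78650406\n"
)


def scan_region(text, chrom, start, end):
    """Hypothetical helper: linear scan for rows with start <= pos <= end."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        row
        for row in reader
        if row["chrom"] == chrom and start <= int(row["pos"]) <= end
    ]


hits = scan_region(TABLE, "1", 1000800, 1001200)
print([row["snp"] for row in hits])
```

A scan like this reads every row, so its cost grows with the size of the file. An index lets the reader jump to the relevant part of the compressed file instead.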
With tabix, you can quickly retrieve all rows in a genomic region by specifying a query with a sequence name, start, and end:
import tabix

# Open a remote or local file.
url = "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20100804/"
url += "ALL.2of4intersection.20100804.genotypes.vcf.gz"
tb = tabix.open(url)

# These queries are identical. A query returns an iterator over the results.
records = tb.query("1", 1000000, 1250000)
records = tb.queryi(0, 1000000, 1250000)
records = tb.querys("1:1000000-1250000")

# Each record is a list of strings.
for record in records:
    print(record[:5])
    break
['1', '1000071', '.', 'C', 'T']
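The string form and the three-argument form of a query carry the same information. The sketch below shows one way to split a region string such as "1:1000000-1250000" into a name, start, and end, and to give types to the leading columns of a record. The helpers `parse_region` and `typed_record` are hypothetical and not part of pytabix, and the column names assume a VCF file like the one in the example.

```python
import re

# Hypothetical pattern for "name:start-end" region strings.
_REGION = re.compile(r"^(?P<name>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(region):
    """Split "name:start-end" into (name, start, end) with integer bounds."""
    match = _REGION.match(region)
    if match is None:
        raise ValueError("expected 'name:start-end', got %r" % (region,))
    return match.group("name"), int(match.group("start")), int(match.group("end"))


def typed_record(record):
    """Name the first five VCF columns and convert the position to an int."""
    chrom, pos, snp_id, ref, alt = record[:5]
    return {"chrom": chrom, "pos": int(pos), "id": snp_id, "ref": ref, "alt": alt}


name, start, end = parse_region("1:1000000-1250000")
first = typed_record(["1", "1000071", ".", "C", "T"])
print(name, start, end)
print(first["pos"])
```

Because every field of a record arrives as a string, converting the position to an integer before comparing or sorting avoids lexicographic ordering surprises.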
Download files
Source Distribution
pytabix-0.0.2.tar.gz (46.8 kB)
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7cdefa37f77e59c1ddea3c402e00cb352e5be2aa7fef41feb49ed930a86ace5f
MD5 | c558be1d55a8b72c92668837305054e2
BLAKE2b-256 | c537e30f6b04237801d072938ca55fde9c3773cc981043152a9ccc16a028f321