Interval class and fasta access
Project description
A simple interval class for DNA sequences
Typically, you create a genome and then use that object to create intervals. Each interval has a sequence property that looks up the actual sequence:
>>> from fastinterval import Genome, Interval
>>> test_genome = Genome('test/example.fa')
>>> int1 = test_genome.interval(100, 150, chrom='1')
>>> print int1
1:100-150:
>>> print int1.sequence
GATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGA
fastinterval uses pyfasta to retrieve the sequence, so access is memory-mapped. Intervals support strandedness, which is respected when accessing the sequence: an interval created with strand=-1 returns the reverse complement.
>>> int2 = test_genome.interval(100, 150, chrom='1', strand=-1)
>>> print int2.sequence
TCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC
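The two sequences above are reverse complements of each other. A minimal, standalone sketch of that relationship follows. It is written for Python 3 and does not use fastinterval or pyfasta; the helper name `reverse_complement` is an assumption made for illustration, not part of the library.

```python
# Translation table mapping each base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    # Complement every base, then reverse the result.
    return seq.translate(_COMPLEMENT)[::-1]


# The forward-strand sequence shown for 1:100-150 above.
forward = "GATC" * 12 + "GA"
# What a minus-strand lookup of the same coordinates would be expected to give.
minus = reverse_complement(forward)
```

Applying the helper twice returns the original string, which is a quick sanity check for any strand-aware lookup.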
The Interval class supports many interval operations:
>>> int1 = test_genome.interval(100, 150, chrom='1')
>>> int2 = test_genome.interval(125, 175, chrom='1')
>>> int1.distance(int2)
0
>>> int1.span(int2)
Interval(100, 175)
>>> int1.overlaps(int2)
True
>>> int1.is_contiguous(int2)
True
>>> int1.contains(int2)
False
>>> int1.intersection(int2)
Interval(125, 150)
>>> int1.union(int2)
Interval(100, 175)
>>> Interval.merge([int1, int2, test_genome.interval(200,250, chrom='1')])
[Interval(100, 175), Interval(200, 250)]
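To make the semantics of a few of these operations concrete, here is a rough standalone sketch in Python 3. It assumes half-open coordinates (start included, end excluded), which matches the 50-base sequence returned for 100-150 but is not confirmed by the library. The `Region` type and the free functions are hypothetical and are not fastinterval's implementation.

```python
from typing import List, NamedTuple, Optional


class Region(NamedTuple):
    """Hypothetical half-open interval [start, end) on one chromosome."""
    start: int
    end: int


def overlaps(a: Region, b: Region) -> bool:
    # Two half-open intervals overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def distance(a: Region, b: Region) -> int:
    # Size of the gap between the intervals; 0 if they touch or overlap.
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def intersection(a: Region, b: Region) -> Optional[Region]:
    # The shared part of both intervals, or None if there is none.
    if not overlaps(a, b):
        return None
    return Region(max(a.start, b.start), min(a.end, b.end))


def merge(regions: List[Region]) -> List[Region]:
    # Sort by start, then fold each region into the previous one
    # whenever it starts at or before the previous region's end.
    merged: List[Region] = []
    for region in sorted(regions):
        if merged and region.start <= merged[-1].end:
            last = merged[-1]
            merged[-1] = Region(last.start, max(last.end, region.end))
        else:
            merged.append(region)
    return merged


a = Region(100, 150)
b = Region(125, 175)
merged = merge([a, b, Region(200, 250)])
```

With these inputs the sketch gives the same coordinates as the session above: an intersection of 125-150 and a merged list of 100-175 and 200-250.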
The Interval class is also based on bx-python intervals, so you can pass in a value attribute to point to an external object, build interval trees, and so on.
>>> from bx.intervals.intersection import IntervalTree
>>> int3 = test_genome.interval(150, 200, chrom='1', value='foo')
>>> tree = IntervalTree()
>>> _ = map(tree.insert_interval, (int1, int2, int3))
>>> tree.find(190, 195)
[Interval(150, 200, value=foo)]
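The session above is Python 2, where `map` runs eagerly. In Python 3, `map` returns a lazy iterator, so the same line would insert nothing unless the iterator is consumed. The sketch below shows the difference using `NaiveIndex`, a hypothetical linear-scan stand-in written for this example. It is not bx-python's `IntervalTree`, and the `(start, end, value)` tuples are likewise an assumption.

```python
class NaiveIndex:
    """Hypothetical stand-in for an interval tree: linear scan, no balancing."""

    def __init__(self):
        self._items = []

    def insert_interval(self, interval):
        # interval is a (start, end, value) tuple in this sketch.
        self._items.append(interval)

    def find(self, start, end):
        # Return every stored interval that overlaps [start, end).
        return [iv for iv in self._items if iv[0] < end and start < iv[1]]


intervals = [(100, 150, None), (125, 175, None), (150, 200, "foo")]

# Python 3: map() is lazy, so no insert happens here.
lazy_index = NaiveIndex()
_ = map(lazy_index.insert_interval, intervals)
lazy_hits = lazy_index.find(190, 195)

# An explicit loop performs the inserts on both Python 2 and Python 3.
index = NaiveIndex()
for iv in intervals:
    index.insert_interval(iv)
hits = index.find(190, 195)
```

Here `lazy_hits` is empty while `hits` contains the single interval carrying the value 'foo', so an explicit loop is the safer choice when porting the example to Python 3.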
Download files
Source Distribution
fastinterval-0.0.1.tar.gz (4.8 kB)