pythonic access to fasta sequence files

Project description

Description:

pythonic access to fasta sequence files.

Author:

Brent Pedersen (brentp)

License:

MIT

Implementation

Requires Python >= 2.6. Stores a flattened version of the fasta file, with spaces and headers removed, along with a pickle of the start and stop locations (used for fseek) of each header in the fasta file. Both files are for internal use.
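The flattening step described above can be sketched roughly as follows. This is an illustration only, not pyfasta's actual code: the function name `flatten_fasta` is hypothetical, it works on in-memory lines rather than files, and it assumes the whole header line (minus the leading `>`) is used as the key.

```python
def flatten_fasta(lines):
    """Return (flat, index) for an iterable of FASTA lines.

    flat  -- all sequence characters joined, with no headers or whitespace
    index -- {header: (start, stop)} offsets into flat
    Hypothetical helper for illustration; not part of pyfasta.
    """
    chunks = []
    index = {}
    name = None   # header of the record currently being read
    start = 0     # offset in flat where the current record begins
    pos = 0       # running length of flat
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # close the previous record before opening a new one
            if name is not None:
                index[name] = (start, pos)
            name = line[1:].strip()
            start = pos
        elif name is not None:
            chunks.append(line)
            pos += len(line)
    if name is not None:
        index[name] = (start, pos)
    return ''.join(chunks), index


example = ['>chr1', 'ACTG', 'ACTG', '>chr2', 'GGCC']
flat, index = flatten_fasta(example)
start, stop = index['chr1']
chr1_seq = flat[start:stop]   # a plain slice replaces any parsing
```

With the offsets stored, fetching a record from the flat data is a single seek-and-read (here, a slice), which is why the two helper files are kept around after the first run.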

Usage

>>> from pyfasta import Fasta

>>> f = Fasta('tests/data/three_chrs.fasta')
>>> sorted(f.keys())
['chr1', 'chr2', 'chr3']

>>> f['chr1']
FastaRecord('tests/data/three_chrs.fasta.flat', 0..80)

>>> f['chr1'][:10]
'ACTGACTGAC'


# the index stores the start and stop of each header from the fasta file
>>> f.index
{'chr3': (160, 3760), 'chr2': (80, 160), 'chr1': (0, 80)}


# can query by a 'feature' dictionary
>>> f.sequence({'chr': 'chr1', 'start': 2, 'stop': 9})
'CTGACTGA'

# with reverse complement for - strand
>>> f.sequence({'chr': 'chr1', 'start': 2, 'stop': 9, 'strand': '-'})
'TCAGTCAG'


# creates a .flat copy of the fasta and a .gdx pickle of the index.
>>> import os
>>> sorted(os.listdir('tests/data/'))[1:]
['three_chrs.fasta', 'three_chrs.fasta.flat', 'three_chrs.fasta.gdx']

# cleanup (though for real use these will remain for faster access)
>>> os.unlink('tests/data/three_chrs.fasta.gdx')
>>> os.unlink('tests/data/three_chrs.fasta.flat')
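Judging from the outputs in the usage above, the 'feature' dictionary coordinates appear to be 1-based and inclusive, and a '-' strand returns the reverse complement. A minimal sketch of that behaviour on a plain string, assuming uppercase A/C/G/T/N input; `feature_sequence` and `COMPLEMENT` are hypothetical names, not pyfasta's API:

```python
# Complement table for uppercase bases; other characters pass through unchanged.
COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}


def feature_sequence(seq, feature):
    """Extract feature['start']..feature['stop'] (1-based, inclusive) from seq.

    Hypothetical helper for illustration; not part of pyfasta.
    """
    # convert 1-based inclusive coordinates to a 0-based half-open slice
    sub = seq[feature['start'] - 1:feature['stop']]
    if feature.get('strand') == '-':
        # reverse the slice, then complement each base
        sub = ''.join(COMPLEMENT.get(base, base) for base in reversed(sub))
    return sub


chr1_prefix = 'ACTGACTGAC'  # first 10 bases of chr1 from the usage above
plus = feature_sequence(chr1_prefix, {'chr': 'chr1', 'start': 2, 'stop': 9})
minus = feature_sequence(
    chr1_prefix, {'chr': 'chr1', 'start': 2, 'stop': 9, 'strand': '-'})
```

Here `plus` is 'CTGACTGA' and `minus` is 'TCAGTCAG', matching the two `f.sequence` results shown in the usage.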

Download files

Source Distribution

pyfasta-0.2.2.tar.gz (5.1 kB)

File hashes

Hashes for pyfasta-0.2.2.tar.gz
Algorithm Hash digest
SHA256 20d048f5eec76cd55c327863e1a28b983430056b9c1f5c09f45d30c7aa645af8
MD5 c744d505b8483a49834a3d10ab13d949
BLAKE2b-256 42d9fe2fc89cc25862f5c7fbacb4b0aa0cdff2ff8a47c2d6bd2307639e2cbb2f
