Utilities to read and write FASTA- and FASTQ-files.
Dinopy’s goal is to make files containing biological sequences easily and efficiently accessible to Python programmers, allowing them to focus on their application instead of file I/O.
    import dinopy

    # some_function and analyze are placeholders for user code.
    fq_reader = dinopy.FastqReader("reads.fastq")
    for sequence, name, quality in fq_reader.reads(quality_values=True):
        if some_function(quality):
            analyze(sequence)
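For readers new to the format, the following plain-Python sketch shows the kind of (sequence, name, quality) triples such a loop iterates over and one possible quality filter. It is not dinopy's implementation: `parse_fastq` and `mean_phred` are hypothetical helpers written for this illustration, and the sketch assumes four-line records with Phred+33 encoded qualities.

```python
import io


def parse_fastq(handle):
    """Yield (sequence, name, quality) tuples from a text-mode FASTQ handle.

    Assumes the common four-line record layout (no wrapped sequences).
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        # Drop the leading '@' from the header to obtain the read name.
        yield sequence, header.rstrip("\n")[1:], quality


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


# Two tiny in-memory records: one high-quality read, one low-quality read.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\n!!!!\n")
good_reads = [
    name
    for sequence, name, quality in parse_fastq(example)
    if mean_phred(quality) >= 20
]
```

Here `mean_phred` plays the role of `some_function` in the example above; any predicate on the quality values could take its place.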
Features
Easy to use reader and writer for FASTA- and FASTQ-files.
Specifiable data type and representation for return values (bytes, strings, and integers; see dtype for more information).
Works directly on gzipped files.
Iterators for q-grams of a sequence (also allowing shaped q-grams).
(Reverse) complement.
Chromosome selection from FASTA files.
Implemented in Cython for additional speedup.
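To make the q-gram and (reverse) complement features concrete, here is a minimal plain-Python sketch of what these operations compute. It does not use or reproduce dinopy's API: the function names are hypothetical, and the '#'/'-' notation for shapes is an assumption made for this illustration.

```python
def qgrams(sequence, q):
    """Yield every contiguous substring of length q."""
    for start in range(len(sequence) - q + 1):
        yield sequence[start:start + q]


def shaped_qgrams(sequence, shape):
    """Yield q-grams that keep only the positions marked '#' in shape.

    Positions marked with any other character (here '-') are skipped.
    """
    keep = [offset for offset, mark in enumerate(shape) if mark == "#"]
    for start in range(len(sequence) - len(shape) + 1):
        yield "".join(sequence[start + offset] for offset in keep)


# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Complement every base, then reverse the result."""
    return sequence.translate(_COMPLEMENT)[::-1]


three_grams = list(qgrams("ACGTA", 3))
gapped = list(shaped_qgrams("ACGTA", "#-#"))
revcomp = reverse_complement("AACG")
```

A shaped q-gram with shape `#-#` reads a window of three bases but keeps only the first and third, which is why each entry of `gapped` has two characters.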
Getting Started
Installation
Dinopy can be downloaded from Bitbucket and compiled using its setup.py:
Download the source code from Bitbucket.
Install globally:
$ python setup.py install
or only for the current user:
$ python setup.py install --user
Use dinopy:
$ python
>>> import dinopy
System requirements
We recommend using Anaconda:
$ conda create -n dinopy python cython numpy
Platform support
Dinopy has been tested on Ubuntu, Arch Linux and OS X (Yosemite and El Capitan).
We do not officially support Windows. Dinopy will probably work, but differing line-break styles may cause problems: we assume \n as the line separator, and files that use \r\n are more likely to be encountered on Windows.
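If a file with \r\n line endings does cause trouble, one possible workaround is to rewrite it with \n line endings before reading it. The sketch below uses only the Python standard library; `normalize_line_endings` is a hypothetical helper, not part of dinopy, and it assumes an uncompressed input file.

```python
import os
import tempfile


def normalize_line_endings(src_path, dst_path):
    """Copy src_path to dst_path, rewriting CRLF line endings as LF."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:  # binary iteration splits on b"\n" only
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"
            dst.write(line)


# Demonstration on a small temporary FASTA file with Windows line endings.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "windows.fasta")
    dst = os.path.join(tmp, "unix.fasta")
    with open(src, "wb") as handle:
        handle.write(b">chr1\r\nACGT\r\nTTGA\r\n")
    normalize_line_endings(src, dst)
    with open(dst, "rb") as handle:
        converted = handle.read()
```

Working in binary mode avoids any platform-dependent newline translation, so the output is the same regardless of the operating system on which the script runs.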
Planned features
SAM-reader / -writer
quality-trimming for FASTQ-reader
GFF-reader
License
Dinopy is Open Source and licensed under the MIT License.