Biopython scripts for converting molecular sequences.
Introduction
Bioinformatics is bedevilled by a large number of file formats. Biopython provides classes and IO functions that allow interconversion. This module provides scripts that use Biopython internally to convert multiple files at once from the command line.
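As an illustration of the Biopython IO functions mentioned above, the following is a minimal sketch of a single-file conversion using `Bio.SeqIO.convert`. It is not taken from the scripts themselves; the wrapper name `convert_file` is hypothetical, and how the scripts actually call Biopython is not documented here.

```python
def convert_file(in_path, in_format, out_path, out_format):
    """Convert one sequence file and return the number of records written.

    Hypothetical helper for illustration; not part of bioscripts.convert.
    """
    # Imported inside the function so this module still loads on a
    # machine where Biopython has not been installed yet.
    from Bio import SeqIO

    # SeqIO.convert reads every record from in_path in in_format and
    # writes them to out_path in out_format.
    return SeqIO.convert(in_path, in_format, out_path, out_format)
```

Calling `convert_file("one.fasta", "fasta", "one.tab", "tab")`, for instance, would write the records of a FASTA file as tab-separated lines, assuming `one.fasta` exists and Biopython is installed.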
Installation
bioscripts.convert [1] can be installed in a number of ways. Biopython [2] is required. Either of the automated methods using setuptools [3] is preferred, but a manual installation will suffice if need be.
Via setuptools / easy_install
From the commandline call:
% easy_install bioscripts.convert
Superuser privileges may be required.
Via setup.py
Download a source tarball, unpack it and call setup.py to install:
% tar zxvf bioscripts.convert.tgz
% cd bioscripts.convert
% python setup.py install
Superuser privileges may be required.
Manual
Download and unpack the tarball as above. Ensure Biopython is available. Copy the scripts in bioscripts/convert to a location they can be called from.
Usage
Due to limitations on identifiers in certain formats, sequence names may differ between input and output files. Also, not all formats understood by Biopython have been enabled, due to being untested or incomplete.
convbioseq
convbioseq.py [options] FORMAT INFILES ...
with the options:
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -i FORMAT, --input-format=FORMAT
The format of the input biosequence files. If not supplied, this will be inferred from the extension of the files.
- -e EXTENSION, --output-extension=EXTENSION
The extension of the output biosequence files. If not supplied, this will be inferred from the output format.
FORMAT must be one of ‘clustal’, ‘fasta’, ‘genbank’, ‘nexus’, ‘phd’, ‘phylip’, ‘qual’, ‘stockholm’. The input formats inferred from extensions are clustal (‘.aln’, ‘.clustal’), fasta (‘.fasta’), genbank (‘.gb’, ‘.genbank’), nexus (‘.nxs’, ‘.nexus’), phd (‘.phd’), phylip (‘.phy’, ‘.phylip’), qual (‘.qual’), stockholm (‘.sth’, ‘.stockholm’), tab (‘.tab’). The default extensions for output formats are ‘.aln’ (clustal), ‘.fasta’ (fasta), ‘.gb’ (genbank), ‘.nexus’ (nexus), ‘.phd’ (phd), ‘.phy’ (phylip), ‘.qual’ (qual), ‘.sth’ (stockholm).
For example:
% convbioseq.py clustal one.fasta two.nxs three.stockholm
will produce three clustal formatted files ‘one.aln’, ‘two.aln’ and ‘three.aln’ from files it assumes are Fasta, Nexus and Stockholm formatted respectively.
% convbioseq.py -i phylip clustal one.fasta two.nxs
will produce two Clustal formatted files ‘one.aln’ and ‘two.aln’ from files it assumes are Phylip formatted, since the -i option overrides the formats that would otherwise be inferred from the extensions.
% convbioseq.py -e foo clustal one.fasta two.nxs
will produce two Clustal formatted files ‘one.foo’ and ‘two.foo’ from files it assumes are Fasta and Nexus formatted respectively.
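The inference rules described for convbioseq can be sketched in a few lines of Python. This is an illustrative reconstruction from the documented tables, not the script's actual code: the names `INPUT_FORMATS`, `OUTPUT_EXTENSIONS` and `plan_conversion` are hypothetical, and the handling of a missing leading dot in the `-e` value is an assumption based on the ‘foo’ example.

```python
import os

# Extension-to-format table, copied from the documentation above.
INPUT_FORMATS = {
    ".aln": "clustal", ".clustal": "clustal",
    ".fasta": "fasta",
    ".gb": "genbank", ".genbank": "genbank",
    ".nxs": "nexus", ".nexus": "nexus",
    ".phd": "phd",
    ".phy": "phylip", ".phylip": "phylip",
    ".qual": "qual",
    ".sth": "stockholm", ".stockholm": "stockholm",
    ".tab": "tab",
}

# Default output extension per output format, also from the documentation.
OUTPUT_EXTENSIONS = {
    "clustal": ".aln", "fasta": ".fasta", "genbank": ".gb",
    "nexus": ".nexus", "phd": ".phd", "phylip": ".phy",
    "qual": ".qual", "stockholm": ".sth",
}


def plan_conversion(infile, out_format, in_format=None, out_ext=None):
    """Return (input format, output path) for one input file."""
    base, ext = os.path.splitext(infile)
    if in_format is None:
        # No -i given: infer the input format from the file extension.
        if ext not in INPUT_FORMATS:
            raise ValueError("cannot infer format of %r" % infile)
        in_format = INPUT_FORMATS[ext]
    if out_ext is None:
        # No -e given: use the default extension of the output format.
        out_ext = OUTPUT_EXTENSIONS[out_format]
    elif not out_ext.startswith("."):
        # Assumption: "-e foo" is treated the same as "-e .foo".
        out_ext = "." + out_ext
    return in_format, base + out_ext
```

With these tables, `plan_conversion("one.fasta", "clustal")` yields the ‘fasta’ input format and the output name ‘one.aln’, and passing `out_ext="foo"` for ‘two.nxs’ yields ‘two.foo’, matching the first and last examples above.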
convalign
convalign.py [options] FORMAT INFILES ...
with the options:
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -i FORMAT, --input-format=FORMAT
The format of the input alignment files. If not supplied, this will be inferred from the extension of the files.
- -e EXTENSION, --output-extension=EXTENSION
The extension of the output alignment files. If not supplied, this will be inferred from the output format.
FORMAT must be one of ‘clustal’, ‘fasta’, ‘nexus’, ‘phylip’, ‘stockholm’. The input formats inferred from extensions are clustal (‘.aln’, ‘.clustal’), fasta (‘.fasta’), nexus (‘.nxs’, ‘.nexus’), phylip (‘.phy’, ‘.phylip’), stockholm (‘.sth’, ‘.stockholm’). The default extensions for output formats are ‘.aln’ (clustal), ‘.fasta’ (fasta), ‘.nxs’ (nexus), ‘.phy’ (phylip), ‘.sth’ (stockholm).
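For readers who want to reproduce the convalign command-line interface, the documented options map naturally onto a standard-library parser. The sketch below uses `argparse`, which may well differ from what the script uses internally; `build_parser`, `ALIGN_FORMATS` and `VERSION` are hypothetical names, the version string is simply taken from the changelog, and restricting `-i` to the same format list as FORMAT is an assumption.

```python
import argparse

# Output formats documented for convalign.
ALIGN_FORMATS = ["clustal", "fasta", "nexus", "phylip", "stockholm"]

# Assumption: latest version listed in the changelog.
VERSION = "0.3.1"


def build_parser():
    """Build a parser mirroring: convalign.py [options] FORMAT INFILES ..."""
    parser = argparse.ArgumentParser(prog="convalign.py")
    parser.add_argument("--version", action="version",
                        version="%(prog)s " + VERSION)
    parser.add_argument("-i", "--input-format", metavar="FORMAT",
                        choices=ALIGN_FORMATS, default=None,
                        help="format of the input alignment files; "
                             "inferred from the file extension if omitted")
    parser.add_argument("-e", "--output-extension", metavar="EXTENSION",
                        default=None,
                        help="extension of the output alignment files; "
                             "inferred from the output format if omitted")
    # Positional arguments: the output format, then one or more input files.
    parser.add_argument("format", metavar="FORMAT", choices=ALIGN_FORMATS)
    parser.add_argument("infiles", metavar="INFILES", nargs="+")
    return parser
```

Parsing `["-i", "phylip", "clustal", "one.phy"]` with this parser gives an input format of ‘phylip’, an output format of ‘clustal’ and a single input file, leaving the output extension unset so that a default can be chosen later.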
Developer notes
This module is not intended for importing, but the setuptools packaging and infrastructure make for simple distribution of scripts, allowing the checking of prerequisites, consistent installation and updating.
The bioscripts namespace was chosen as a convenient place to “keep” these scripts and is open to other developers.
References
Changelog
0.2 - 2009/4/14
Initial release
0.3.1 - 2009/4/16
Added alignment converter
Corrections to documentation