bioscripts.convert

Biopython scripts for converting molecular sequences between formats.

Introduction

Bioinformatics is bedevilled by a large number of file formats. Biopython provides classes and IO functions for converting between them. This module provides scripts that use Biopython internally to convert multiple files in a single call from the command line.

Installation

bioscripts.convert [1] can be installed in a number of ways. Biopython [2] is a prerequisite.

Via easy_install or equivalent

From the command line, call:

% easy_install bioscripts.convert

Superuser privileges may be required.

Via setup.py

Download a source tarball, unpack it and call setup.py to install:

% tar zxvf bioscripts.convert.tgz
% cd bioscripts.convert
% python setup.py install

Superuser privileges may be required.

Usage

convbioseq.py [options] FORMAT INFILES ...

or:

convalign.py [options] FORMAT INFILES ...

with the options:

--version
    show program’s version number and exit

-h, --help
    show this help message and exit

-i FORMAT, --input-format=FORMAT
    The format of the input biosequence files. If not supplied, this will be inferred from the extension of the files.

-e EXTENSION, --output-extension=EXTENSION
    The extension of the output biosequence files. If not supplied, this will be inferred from the output format.

-t TYPE, --seqtype=TYPE
    The type of sequence (dna or protein) being converted. Often this can be inferred from the input file, but sometimes must be explicitly set.

FORMAT must be one of clustal, fasta, genbank, nexus, phd, phylip, qual, stockholm, tab.

The input formats inferred from extensions are:

    clustal: ‘.aln’, ‘.clustal’
    fasta: ‘.fasta’
    genbank: ‘.gb’, ‘.genbank’
    nexus: ‘.nxs’, ‘.nexus’
    phd: ‘.phd’
    phylip: ‘.phy’, ‘.phylip’
    qual: ‘.qual’
    stockholm: ‘.sth’, ‘.stockholm’
    tab: ‘.tab’

The default extensions for output formats are:

    clustal: ‘.aln’
    fasta: ‘.fasta’
    genbank: ‘.gb’
    nexus: ‘.nexus’
    phd: ‘.phd’
    phylip: ‘.phy’
    qual: ‘.qual’
    stockholm: ‘.sth’

For example:

% convbioseq.py clustal one.fasta two.nxs three.stockholm

will produce three clustal formatted files ‘one.aln’, ‘two.aln’ and ‘three.aln’ from files it assumes are Fasta, Nexus and Stockholm formatted respectively.

% convbioseq.py -i fasta phylip one.fasta two.nxs

will produce two Phylip formatted files ‘one.phy’ and ‘two.phy’ from files it treats as Fasta formatted, regardless of their extensions.

% convbioseq.py -e foo clustal one.fasta two.nxs

will produce two Clustal formatted files ‘one.foo’ and ‘two.foo’ from files it assumes are Fasta and Nexus formatted respectively.
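
The inference rules described in this section can be sketched in a few lines of Python. This is only an illustration built from the extension lists given above; the names EXT_TO_FORMAT, FORMAT_TO_EXT, infer_format and output_name are hypothetical and are not taken from the scripts themselves, and raising an error for unknown extensions is an assumption.

```python
import os

# Input formats inferred from file extensions, as listed in the usage text.
EXT_TO_FORMAT = {
    ".aln": "clustal", ".clustal": "clustal",
    ".fasta": "fasta",
    ".gb": "genbank", ".genbank": "genbank",
    ".nxs": "nexus", ".nexus": "nexus",
    ".phd": "phd",
    ".phy": "phylip", ".phylip": "phylip",
    ".qual": "qual",
    ".sth": "stockholm", ".stockholm": "stockholm",
    ".tab": "tab",
}

# Default output extensions, as listed in the usage text.
FORMAT_TO_EXT = {
    "clustal": ".aln", "fasta": ".fasta", "genbank": ".gb",
    "nexus": ".nexus", "phd": ".phd", "phylip": ".phy",
    "qual": ".qual", "stockholm": ".sth",
}


def infer_format(path, input_format=None):
    """Return the input format: explicit (-i) if given, else from the extension."""
    if input_format is not None:
        return input_format
    ext = os.path.splitext(path)[1]
    if ext not in EXT_TO_FORMAT:
        # Assumption: unknown extensions are treated as an error.
        raise ValueError("cannot infer format from extension %r" % ext)
    return EXT_TO_FORMAT[ext]


def output_name(path, out_format, extension=None):
    """Return the output file name: explicit extension (-e) if given, else the default."""
    stem = os.path.splitext(path)[0]
    if extension is not None:
        return stem + "." + extension.lstrip(".")
    if out_format not in FORMAT_TO_EXT:
        raise ValueError("no default extension for format %r" % out_format)
    return stem + FORMAT_TO_EXT[out_format]
```

With these helpers, infer_format("two.nxs") gives "nexus", and output_name("one.fasta", "clustal", "foo") gives "one.foo", mirroring the examples above.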

Limitations

This module is not intended for importing, but the setuptools packaging and infrastructure make for simple distribution of scripts, allowing the checking of prerequisites, consistent installation and updating.

The bioscripts namespace was chosen as a convenient place to “keep” these scripts and is open to other developers.

Due to limitations on identifiers in certain formats, sequence names may differ between input and output files. Also, not all formats understood by Biopython have been enabled, due to being untested or incomplete.

Depending on your platform, the scripts may be installed as .py scripts, or some form of executable, or both.

Some formats (e.g. FASTA) do not specify the sequence type, while others (e.g. NEXUS) absolutely require it. Thus, the sequence type option may need to be specified explicitly. Older versions of Biopython contain a bug that prevents conversion to nexus format for related reasons.
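
To illustrate why the sequence type cannot always be inferred, here is a minimal sketch of a composition-based guess. It is an assumption for illustration only, not how the scripts or Biopython decide; the function name guess_seqtype, the nucleotide alphabet and the 0.9 threshold are all hypothetical choices.

```python
def guess_seqtype(sequence, threshold=0.9):
    """Guess 'dna' or 'protein' from letter composition (heuristic only)."""
    # Ignore gaps, digits and whitespace; look only at letters.
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("no sequence letters to inspect")
    # Count letters that are plausible nucleotide codes.
    nucleotide = sum(c in "ACGTUN" for c in letters)
    # Call it DNA when nucleotide codes dominate; otherwise protein.
    return "dna" if nucleotide / len(letters) >= threshold else "protein"
```

A guess of this kind fails on short or unusual sequences, for example a peptide made up only of A, C, G and T residues, which is the situation in which setting the type explicitly with -t becomes necessary.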

References

Changelog

0.2 - 2009/4/14

Initial release

0.3.1 - 2009/4/16

First public release
Added alignment converter
Corrections to documentation

Release history

0.4 - Aug 10, 2011 (this version)
0.3 - Apr 16, 2009

