revtrans - performs a reverse translation of a peptide alignment

NAME
revtrans - performs a reverse translation of a peptide alignment

SYNOPSIS
revtrans dnafile pepfile [-v] [-h] [-gapin chars] [-gapout char]
[-Idna format] [-Ipep format] [-mtx tablename/file] [-match method]
[-allinternal] [-readthroughstop] [-O format] [outfile]

DESCRIPTION
Reads a set of aligned peptide sequences from pepfile and uses
the corresponding DNA sequences from dnafile to construct a
reverse translated version of the alignment.

By default, the input file formats are auto-detected and the
corresponding DNA and peptide sequences are matched by translation.

In the typical case this means that the user only needs to
supply the DNA and peptide sequences, and may safely ignore
the more advanced options. For example:

revtrans kinases.dna.fsa kinases.prot.aln

The final alignment is written to STDOUT, or to outfile if specified,
and is in FASTA format by default.
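
The core idea can be sketched in a few lines of Python. This is an
illustrative sketch only, not the code in revtrans.py: the function
name thread_codons and its parameters are hypothetical, and it assumes
the DNA sequence is ungapped and already in the correct reading frame.
Each peptide residue consumes one codon and each peptide gap becomes
three gap characters.

```python
def thread_codons(dna, aligned_pep, gap_chars=".-~", gap_out="-"):
    """Lay the codons of an ungapped DNA sequence onto a gapped peptide.

    Hypothetical helper for illustration; assumes `dna` is in frame.
    """
    # Split the DNA into complete codons, ignoring a trailing partial codon.
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]

    out = []
    pos = 0  # index of the next unused codon
    for residue in aligned_pep:
        if residue in gap_chars:
            out.append(gap_out * 3)  # one peptide gap -> three DNA gaps
        else:
            if pos >= len(codons):
                raise ValueError("peptide has more residues than the DNA has codons")
            out.append(codons[pos])  # one residue consumes one codon
            pos += 1
    return "".join(out)


if __name__ == "__main__":
    print(thread_codons("ATGAAATTT", "M-KF"))
```

The resulting DNA alignment is three times as long as the peptide
alignment, column for column.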

OPTIONS
-h
Help. Print this help information.

-gapin chars
Specify gap characters in the input sequences.
Default is '.-~'

-gapout char
Specify which character should be used for gaps in the
output.
Default is '-'
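
As a rough illustration of what the two gap options mean, the sketch
below removes or normalises gap characters with str.translate. The
function names degap and normalise_gaps are hypothetical and are not
taken from the RevTrans source; only the default characters come from
the option descriptions above.

```python
def degap(seq, gap_in=".-~"):
    """Remove every input gap character from a sequence."""
    return seq.translate({ord(c): None for c in gap_in})


def normalise_gaps(seq, gap_in=".-~", gap_out="-"):
    """Rewrite every input gap character as the single output gap character."""
    return seq.translate({ord(c): gap_out for c in gap_in})


if __name__ == "__main__":
    print(degap("M.K~F-"))
    print(normalise_gaps("M.K~F-"))
```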

-Idna format
Specify format of the input DNA file.
Valid formats are: auto (default), fasta, msf and aln

-Ipep format
Specify format of the input peptide file.
Valid formats are: auto (default), fasta, msf and aln

-O format
Specify format of the output file.
Valid formats are: fasta (default), msf and aln

-mtx tablename/file
Use an alternative translation matrix instead of the built-in
Standard Genetic Code for translation.

If "tablename" is 1-6, 9-16 or 21-23, one of the alternative
translation tables defined by the NCBI taxonomy group will be
used.

Briefly, the following tables are defined:
-----------------------------------------
1: The Standard Code
2: The Vertebrate Mitochondrial Code
3: The Yeast Mitochondrial Code
4: The Mold, Protozoan, and Coelenterate Mitochondrial Code
and the Mycoplasma/Spiroplasma Code
5: The Invertebrate Mitochondrial Code
6: The Ciliate, Dasycladacean and Hexamita Nuclear Code
9: The Echinoderm and Flatworm Mitochondrial Code
10: The Euplotid Nuclear Code
11: The Bacterial and Plant Plastid Code
12: The Alternative Yeast Nuclear Code
13: The Ascidian Mitochondrial Code
14: The Alternative Flatworm Mitochondrial Code
15: Blepharisma Nuclear Code
16: Chlorophycean Mitochondrial Code
21: Trematode Mitochondrial Code
22: Scenedesmus obliquus Mitochondrial Code
23: Thraustochytrium Mitochondrial Code

See http://www.ncbi.nlm.nih.gov/Taxonomy [Genetic Codes]
for a detailed description. Please note that the table
of start codons is also used (see the -allinternal option
below for details).

If a filename is supplied, the translation table is read from
that file instead.

The file should contain one line per codon in the format:

codon<whitespace>aa-single letter code

All 64 codons must be included. Stop codons are specified
by "*". T and U are interchangeable. Blank lines and lines
starting with "#" are ignored.

See the "gcMitVertebrate.mtx" file in the RevTrans source
distribution for a well documented example.
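
The file format described above is simple enough to parse by hand.
The following is a minimal sketch of such a parser, written from the
description only; it is not the parser used by RevTrans, and the name
parse_codon_table is hypothetical. It assumes codons are stored
upper-cased with U rewritten as T.

```python
from itertools import product


def parse_codon_table(text):
    """Parse 'codon<whitespace>amino-acid' lines into a dict.

    Hypothetical helper; blank lines and '#' comment lines are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        codon, aa = line.split()[:2]
        # T and U are interchangeable, so store everything as DNA.
        table[codon.upper().replace("U", "T")] = aa

    # All 64 codons must be present.
    expected = {"".join(p) for p in product("ACGT", repeat=3)}
    missing = expected - table.keys()
    if missing:
        raise ValueError("missing codons: " + ", ".join(sorted(missing)))
    return table
```

A table read from disk would then be loaded with something like
parse_codon_table(open(path).read()), where path is whatever file name
was given to -mtx.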

-allinternal
By default, the very first codon in each sequence is assumed
to be the initial codon of the transcript. This means that certain
non-methionine codons actually code for methionine at this
position, for example "TTG" in the standard genetic code (see
above).

Selecting this option treats all codons as internal codons.

-readthroughstop
Allow the translation to continue after a stop codon is reached.
The stop codon will be marked as "*".

Make sure that stop codons are handled in the same manner
in the input peptide alignment.
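
The effect of -allinternal and -readthroughstop can be illustrated
with a small translation routine. This is a sketch under stated
assumptions, not the code in mod_translate.py: the function name and
keyword arguments are hypothetical, translation is assumed to stop
before the first stop codon by default, and unknown codons are
rendered as "X". The amino acid string and the start codons (TTG, CTG,
ATG) follow the NCBI listing of table 1, with bases ordered T, C, A, G.

```python
from itertools import product

# Standard Genetic Code (NCBI table 1), codons in TCAG order.
CODONS = ["".join(p) for p in product("TCAG", repeat=3)]
STANDARD = dict(zip(
    CODONS,
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))
START_CODONS = {"TTG", "CTG", "ATG"}


def translate(dna, all_internal=False, read_through_stop=False):
    """Translate an in-frame DNA sequence (hypothetical helper)."""
    dna = dna.upper().replace("U", "T")
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if i == 0 and not all_internal and codon in START_CODONS:
            aa = "M"  # an initial start codon codes for methionine
        else:
            aa = STANDARD.get(codon, "X")  # "X" for ambiguous codons (assumption)
        if aa == "*" and not read_through_stop:
            break  # assumed default: stop at the first stop codon
        peptide.append(aa)
    return "".join(peptide)


if __name__ == "__main__":
    seq = "TTGAAATAAGGG"
    print(translate(seq))
    print(translate(seq, all_internal=True))
    print(translate(seq, read_through_stop=True))
```

With all_internal=True the leading TTG is read as leucine, and with
read_through_stop=True the stop codon appears as "*" and translation
continues past it.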

-match method
Specify how to match the corresponding DNA and peptide
sequences. Valid methods are: trans (default), name and pos.

Please note that both the DNA and the peptide sequences should
have unique names, regardless of the matching method.

trans:
Match sequences by translation. The DNA sequences are
translated using the standard genetic code (default)
or an alternative translation matrix if the -mtx
option is used.

name:
Match sequences by name. Please note that for FASTA
files everything after the ">" is considered the
sequence name.

pos:
Match by position. The sequences are matched by position
in the files (first DNA sequence with first peptide
sequence, etc.).
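
The three matching methods can be sketched as follows. This is an
illustration of the idea only, not the logic in revtrans.py: the
function match_sequences, its record format (lists of name/sequence
tuples) and the translate callable passed in are all assumptions made
for the example.

```python
def match_sequences(dna, pep, method="trans", translate=None, gap_chars=".-~"):
    """Pair DNA and peptide records; returns (dna_name, pep_name) tuples.

    Hypothetical helper. `dna` and `pep` are lists of (name, sequence).
    `translate` is any callable mapping a DNA string to a peptide string.
    """
    if method == "pos":
        # First DNA sequence with first peptide sequence, and so on.
        return [(d[0], p[0]) for d, p in zip(dna, pep)]

    if method == "name":
        dna_names = {name for name, _ in dna}
        return [(name, name) for name, _ in pep if name in dna_names]

    if method == "trans":
        # Index DNA records by their translation. Two DNA sequences with
        # the same translation would collide here (ignored in this sketch).
        by_translation = {translate(seq): name for name, seq in dna}
        pairs = []
        for name, seq in pep:
            bare = "".join(c for c in seq if c not in gap_chars).upper()
            if bare in by_translation:
                pairs.append((by_translation[bare], name))
        return pairs

    raise ValueError("unknown matching method: " + method)
```

Matching by translation is the only method that tolerates both
differing names and differing file order, which may be why it is the
default.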

-v
Verbose. Print extra information about files, sequences
and the progress in general to STDERR.

The verbose level can be set to three degrees of
detail.

-v: verbose level 1
Info about files, number of sequences read etc.
Use this as the first try if something needs
investigation.

-vv: verbose level 2
As level 1 +
Print detailed info about all the sequence names.

-vvv: verbose level 3
As level 2 +
Do a sanity check on the degapped length of the
sequences. Warn if the sizes do not match.
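
A length check of this kind could look like the sketch below. How
RevTrans defines a match is not stated here, so the rule used is an
assumption: the number of complete codons in the degapped DNA should
equal the number of residues in the degapped peptide, optionally plus
one for a trailing stop codon. The function name is hypothetical.

```python
def check_degapped_lengths(dna, pep, gap_chars=".-~"):
    """Return a warning string if the lengths disagree, else None.

    Hypothetical helper; the matching rule is an assumption (see above).
    """
    n_bases = sum(1 for c in dna if c not in gap_chars)
    n_residues = sum(1 for c in pep if c not in gap_chars)
    n_codons = n_bases // 3

    # Allow one extra codon for a stop codon absent from the peptide.
    if n_codons in (n_residues, n_residues + 1):
        return None
    return "length mismatch: %d codons vs %d residues" % (n_codons, n_residues)


if __name__ == "__main__":
    print(check_degapped_lengths("ATGAAATAA", "M-K"))
    print(check_degapped_lengths("ATGAAA", "MKF"))
```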

AUTHOR
Rasmus Wernersson, raz@cbs.dtu.dk
September 2002, February 2003, July 2004, April 2005

FILES
revtrans.py, mod_translate.py, mod_seqfiles.py,
ncbi_genetic_codes.py

WEB PAGE
http://www.cbs.dtu.dk/services/RevTrans/

REFERENCE
Rasmus Wernersson and Anders Gorm Pedersen.
RevTrans - Constructing alignments of coding DNA from aligned amino
acid sequences.
Nucl. Acids Res., 2003, 31(13), 3537-3539.

VERSION
1.4