Rapid identification of sequence evolution models

modelmatcher: Rapid identification of evolutionary models

This tool reads multiple sequence alignments and determines a suitable sequence evolution model for your phylogenetic analysis.

Usage

Example usage:

$ modelmatcher inputfile.fasta

The input file is a multiple sequence alignment in one of these common formats:

  • FASTA
  • Clustal
  • NEXUS
  • PHYLIP
  • STOCKHOLM

The output is a list of models, ordered by how well they fit the data, each with its modelmatcher score. The tool predicts the base model (such as JTT, WAG, or LG) and whether the model should be adapted to the alignment's amino acid composition (i.e., JTT+F, WAG+F, etc.).

If you want to feed the prediction from modelmatcher directly to a phylogenetic inference program, consider using the -of option:

iqtree  -s infile.phy  -m $(modelmatcher -of iqtree infile.phy)

The dollar-parenthesis construct is shell command substitution: the inner command runs first and its output, a single model name, is passed to the -m option. Only models accepted by the given application (here: IQ-TREE) are output.
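
The same hand-off can be scripted from Python. This is a minimal sketch, assuming that modelmatcher and iqtree are both available on the PATH. The helper names predict_model and iqtree_command are hypothetical and not part of modelmatcher; the sketch relies only on the -of iqtree option described above, which prints a single model name.

```python
import subprocess


def predict_model(alignment, run=subprocess.run):
    """Return the model name printed by `modelmatcher -of iqtree <alignment>`.

    `run` is injectable so the function can be exercised without
    modelmatcher installed (hypothetical helper, not part of the tool).
    """
    result = run(
        ["modelmatcher", "-of", "iqtree", alignment],
        capture_output=True,
        text=True,
        check=True,
    )
    # The output is a single model name; strip the trailing newline.
    return result.stdout.strip()


def iqtree_command(alignment, model):
    """Build the argument list equivalent to `iqtree -s <alignment> -m <model>`."""
    return ["iqtree", "-s", alignment, "-m", model]
```

Calling subprocess.run(iqtree_command(path, predict_model(path)), check=True) would then start the inference with the predicted model.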

Options

Optional arguments:

  -h, --help            show this help message and exit
  -f {guess,fasta,clustal,nexus,phylip,stockholm}, --format {guess,fasta,clustal,nexus,phylip,stockholm}
                        Specify what sequence type to assume. Be specific if
                        the file is not recognized automatically. When reading
                        from stdin, the format is always guessed to be FASTA.
                        Default: guess
  -m filename, --model filename
                        Add the model given in the file to the comparisons.
  -nf, --no-F-testing   Do not try +F models, i.e., do not test with amino
                        acid frequencies estimated from the MSA.
  -s int, --sample-size int
                        For alignments with many sequences, decide on an upper
                        bound of sequence pairs to use from the MSA. The
                        computational complexity grows quadratically in the
                        number of sequences, so a choice of 5000 bounds the
                        growth for MSAs with more than 100 sequences.
  -of {tabular,json,iqtree,raxml,phyml,mrbayes}, --output_format {tabular,json,iqtree,raxml,phyml,mrbayes}
                        Choose output format. Tabular format is default. JSON
                        is for convenient later parsing, with some additional
                        meta-data added. For one-line output convenient for
                        immediate use by inference tools, consider raxml and
                        similar choices. Note that the PhyML and MrBayes
                        options are restricted to their implemented models.
                        Although PhyML supports the +F models (using the "-f
                        e" option), this is not reflected in the output from
                        "modelmatcher -of phyml ..." at this time.
  --list-models         Output a list of models implemented in modelmatcher,
                        then exit.
  --verbose             Output progress information
  --version
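
The arithmetic behind the 5000 figure in the -s option: an MSA with n sequences contains n(n-1)/2 distinct sequence pairs, which is 4950 for n = 100 and 5050 for n = 101. The sketch below only illustrates that count. The function names are hypothetical, and how modelmatcher actually selects the pairs is not described here.

```python
def n_sequence_pairs(n_seqs):
    """Number of distinct unordered sequence pairs in an MSA with n_seqs sequences."""
    return n_seqs * (n_seqs - 1) // 2


def pairs_considered(n_seqs, sample_size=None):
    """Pairs left after applying an upper bound such as `-s 5000`.

    sample_size=None means "no bound" in this sketch; it is an assumption
    for illustration, not a statement about the tool's default.
    """
    total = n_sequence_pairs(n_seqs)
    return total if sample_size is None else min(total, sample_size)


# Print: number of sequences, all pairs, pairs kept with a bound of 5000.
for n in (66, 100, 101, 1000):
    print(n, n_sequence_pairs(n), pairs_considered(n, 5000))
```

Up to 100 sequences the bound has no effect; beyond that, the number of pairs stays at 5000 instead of growing quadratically.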

See the section "Output" below for some more examples.

Input formats

Input format is detected automatically from the following list, but can also be requested specifically.

  • FASTA
  • Phylip
  • Nexus
  • Clustal
  • Stockholm

Output

By default, the output is a simple text table that ranks the candidate models in order of preference. JSON output is also available for easy parsing by other scripts. For example, the usage example above may yield a table like:

WAG             7.972
VT              8.238
BLOSUM62        8.478
JTT             8.864
JTT-DCMUT       8.917
LG              9.984
DCMUT          10.467
Dayhoff        10.495
FLU            11.211
HIVb           12.853
RtREV          14.048
cpREV          14.186
HIVw           17.338
MtZoa          18.476
MtMAM          21.453
mtArt          21.741
MtREV          22.059

Each model is given with its modelmatcher score.
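
If the table is captured as text, it can be split into (model, score) pairs. This is a minimal sketch, assuming each line holds a model name and a score separated by whitespace, as in the example above. parse_table is a hypothetical helper, not part of modelmatcher.

```python
def parse_table(text):
    """Parse tabular modelmatcher output into a list of (model, score) tuples."""
    ranking = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        model, score = fields
        ranking.append((model, float(score)))
    return ranking


# First three rows of the example table above.
sample = """\
WAG             7.972
VT              8.238
BLOSUM62        8.478
"""

ranking = parse_table(sample)
# The table is ordered by fit, so the first row is the preferred model.
best_model, best_score = ranking[0]
print(best_model, best_score)
```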

Alternatively, the same analysis in JSON format can look like:

$ modelmatcher  -of json  inputfile.fasta
{"n_observations": 863692, "infile": "inputfile.fasta", "n_seqs": 66, "model_ranking": [["WAG", 7.972410383355675], ["VT", 8.238362164888876], ["BLOSUM62", 8.478000205922985], ["JTT", 8.863578165338444], ["JTT-DCMUT", 8.917496451351846], ["LG", 9.983874357603963], ["DCMUT", 10.466872509785343], ["Dayhoff", 10.49522598111376], ["FLU", 11.21137482805874], ["HIVb", 12.852877789672046], ["RtREV", 14.047539707772572], ["cpREV", 14.18648653904322], ["HIVw", 17.338193829402], ["MtZoa", 18.475515151949153], ["MtMAM", 21.452528293860837], ["mtArt", 21.740741039472418], ["MtREV", 22.058622800684176]]}
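
The JSON output can be read with Python's standard json module. The sketch below uses an abbreviated copy of the example output above (first three models only) and assumes the keys shown there: infile, n_seqs, and model_ranking.

```python
import json

# Abbreviated copy of the JSON example above (first three models only).
raw = (
    '{"n_observations": 863692, "infile": "inputfile.fasta", "n_seqs": 66, '
    '"model_ranking": [["WAG", 7.972410383355675], ["VT", 8.238362164888876], '
    '["BLOSUM62", 8.478000205922985]]}'
)

report = json.loads(raw)

# model_ranking is a list of [model, score] pairs, best fit first.
best_model, best_score = report["model_ranking"][0]
print(
    f"{report['infile']}: {report['n_seqs']} sequences, "
    f"best model {best_model} ({best_score:.3f})"
)
```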

Install

Recommended installation is:

pip install --upgrade pip
pip install modelmatcher
