Remove outlier sequences from multiple sequence alignment

Project description

pysickle.py tries to remove sequences that cause misalignments from a multiple sequence alignment (MSA). It reads an MSA in multi-FASTA format, removes the sequences with the highest penalty scores, and then builds the next MSA without them. This process repeats until a user-specified cutoff is reached or fewer than three sequences are left to align.
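The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not pysickle's actual code: the names `prune_outliers`, `unique_insertion_penalty` and `drop_gap_only_columns` are hypothetical, the penalty is a toy score, the cutoff is assumed to be a penalty threshold, and realignment (which pysickle delegates to mafft) is replaced by a stand-in that only drops gap-only columns.

```python
from typing import Callable, Dict

Alignment = Dict[str, str]  # sequence name -> aligned sequence, '-' marks a gap


def unique_insertion_penalty(alignment: Alignment) -> Dict[str, float]:
    """Toy score (assumption): +1 per column where only this sequence has a residue."""
    names = list(alignment)
    penalties = {name: 0.0 for name in names}
    for column in zip(*(alignment[name] for name in names)):
        occupied = [name for name, char in zip(names, column) if char != "-"]
        if len(occupied) == 1:
            penalties[occupied[0]] += 1.0
    return penalties


def drop_gap_only_columns(alignment: Alignment) -> Alignment:
    """Stand-in for re-running the aligner: remove columns that contain only gaps."""
    names = list(alignment)
    columns = [
        col
        for col in zip(*(alignment[name] for name in names))
        if any(char != "-" for char in col)
    ]
    return {name: "".join(col[i] for col in columns) for i, name in enumerate(names)}


def prune_outliers(
    alignment: Alignment,
    realign: Callable[[Alignment], Alignment] = drop_gap_only_columns,
    score: Callable[[Alignment], Dict[str, float]] = unique_insertion_penalty,
    cutoff: float = 1.0,          # assumed meaning: stop once the worst penalty <= cutoff
    max_iterations: int = 10,
    min_sequences: int = 3,       # stop when fewer than three sequences remain
) -> Alignment:
    for _ in range(max_iterations):
        if len(alignment) < min_sequences:
            break
        penalties = score(alignment)
        worst = max(penalties.values())
        if worst <= cutoff:
            break
        # Drop every sequence that shares the highest penalty, then realign the rest.
        kept = {n: s for n, s in alignment.items() if penalties[n] < worst}
        if not kept:
            break
        alignment = realign(kept)
    return alignment


msa = {"a": "AC--GT", "b": "AC--GT", "c": "AC--GT", "d": "ACTTGT"}
print(prune_outliers(msa))
```

In this toy input, sequence "d" carries an insertion that forces gaps into every other sequence, so it is the one removed, and the stand-in realignment then closes the gap-only columns.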

Usage:

######################################
# pysickle.py
######################################
usage:
    pysickle.py -f multifasta alignment
options:
    -f, --fasta=FILE    multifasta alignment (eg. "align.fas")
    OR
    -F, --fasta_dir=DIR directory with multifasta files (needs -s SUFFIX)
    -s, --suffix=SUFFIX will try to work with files that end with SUFFIX (eg ".fas")

    -a, --msa_tool=STR  supported: "mafft" [default:"mafft"]
    -i, --max_iterations=NUM    force stop after NUM iterations
    -n, --num_threads=NUM   max number of threads to be executed in parallel [default: 1]
    -m, --mode=MODE         set strategy to remove outlier sequences [default: "Sites"]
                            available modes (not case sensitive):
                                "Sites", "Gaps", "uGaps","Insertions",
                                "uInsertions","uInstertionsGaps", "custom"
    -l, --log       write logfile
    -h, --help      prints this

only for mode "custom":
    -g, --gap_penalty=NUM        set gap penalty [default: 1.0]
    -G, --unique_gap_penalty=NUM set unique gap penalty [default: 10.0]
    -j, --insertion_penalty=NUM  set insertion penalty [default: 1.0]
    -J, --unique_insertion_penalty=NUM set unique insertion penalty [default: 1.0]
    -M, --mismatch_penalty=NUM   set mismatch penalty [default: 1.0]
    -r, --match_reward=NUM       set match reward [default: -10.0]
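
To make the option syntax above concrete, here is a small sketch that assembles a command line from the documented long options. The helper `build_pysickle_command` is hypothetical and not part of pysickle, and it assumes that `pysickle.py` is executable and on the PATH; only flags listed in the usage text are used.

```python
from typing import Dict, List, Optional


def build_pysickle_command(
    fasta: str,
    mode: str = "Sites",
    num_threads: int = 1,
    max_iterations: Optional[int] = None,
    log: bool = False,
    custom_penalties: Optional[Dict[str, float]] = None,
) -> List[str]:
    """Build an argument list using the long options from the usage text."""
    cmd = [
        "pysickle.py",
        f"--fasta={fasta}",
        f"--mode={mode}",
        f"--num_threads={num_threads}",
    ]
    if max_iterations is not None:
        cmd.append(f"--max_iterations={max_iterations}")
    if log:
        cmd.append("--log")
    if custom_penalties:
        # The usage text lists the penalty options as "only for mode custom";
        # mode names are documented as not case sensitive.
        if mode.lower() != "custom":
            raise ValueError("penalty options only apply to mode 'custom'")
        for option, value in custom_penalties.items():
            cmd.append(f"--{option}={value}")
    return cmd


command = build_pysickle_command(
    "align.fas",
    mode="custom",
    custom_penalties={"gap_penalty": 2.0, "unique_gap_penalty": 20.0},
)
print(" ".join(command))
```

The resulting list could be handed to `subprocess.run(command, check=True)` on a machine where pysickle and mafft are installed.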

Currently supported multiple sequence aligners:

  • mafft (Katoh, Standley 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution 30:772-780. http://mafft.cbrc.jp/alignment/software/)

Requirements

  • matplotlib
  • numpy

External Programs

  • mafft

Download files

Files for pysickle, version 0.1

  • pysickle-0.1-py2.7.egg (23.0 kB), Egg, Python 2.7
  • pysickle-0.1.tar.gz (7.3 kB), Source
