Reformat and condense multiple sequence alignments to highlight variability

Project Description

alnvu makes a multiple alignment of biological sequences more easily readable by condensing it and highlighting variability.

Produces formatted multiple alignments in plain text, HTML, and PDF.

authors

  • Noah Hoffman
  • Chris Small
  • Connor McCoy
  • Tim Holland

dependencies

Required:

Optional:

installation

Installation is easiest using pip:

pip install alnvu

To install from the sources on GitHub, first clone the repository.

Using setup.py:

cd alnvu
python setup.py install

Using pip:

cd alnvu
pip install .

examples

% cd alnvu

The default output. Note that columns are numbered, with each column number read vertically from top to bottom (in the block below, column 8 is the first shown and column 87 is the last):

% ./av -w 80 testfiles/aln.fasta | head -n 15
     # 00000000000000000000000000000000000000000000000000000000000000000000000000000000
     # 00000000000000000000000000000000000000000000000000000000000000000000000000000000
     # 00111111111122222222223333333333444444444455555555556666666666777777777788888888
     # 89012345678901234567890123456789012345678901234567890123456789012345678901234567
 59735 agagtttgatcctggctcaggacgaacgcTGGCGGCgtGCTTAACACATGCAAGTCGAACGaTgAAgcggtGCTTgcacc
 70875 --------------------------------------------------------------------------------
 58095 agagtttgatcctggctcagagcgaacgcTGGCGGCatGCTTAACACATGCAAGTCGcACGGgtggtttcgGCcatc---
 70854 -----------------------------TGGCGGCagGCcTAACACATGCAAGTCGAgCGGatgAcgggAGCTTgctcc
 62024 agagtttgatcctggctcaggacgaacgcTGGCGGCgtGCTTAACACATGCAAGTCGAACGaTgAAgcctttCggggtgg
 59895 ---------------------------------------------------AAGTCGAACGGTgAAagagAGCTTagctc
 57728 -----------------------------------------------------------tt-------------------
 10734 ---------------------------gcTGaCGGCgtGCTTAACACATGCAAGTCGAACGGgatccattAGCgcttttg
 71041 --------------------------cgcTGGCGGCagGCTTAACACATGCAAGTCGAACGaTgAAgtctAGCTTgctag
 6161O -----------------------------TGGCGGt-gGCcTAACACATGCAAGTCGAACGGatccttcggGaTT-----
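
The vertical column numbering in the header can be hard to read at first. The following is a minimal Python sketch of how such a header can be built and read back. It is an illustration only, not alnvu's own code; the function names and the `digits` and `prefix` parameters are hypothetical.

```python
def vertical_column_header(first, last, digits=4, prefix="# "):
    """Return header rows that spell each column number top to bottom."""
    # Zero-pad every column number to the same number of digits.
    padded = [str(n).zfill(digits) for n in range(first, last + 1)]
    # Row i of the header holds digit i of every column number.
    return [prefix + "".join(p[i] for p in padded) for i in range(digits)]


def read_column(header_rows, offset, prefix="# "):
    """Recover the column number at a 0-based offset within the block."""
    return int("".join(row[len(prefix) + offset] for row in header_rows))


header = vertical_column_header(8, 87)
print("\n".join(header))
# First and last columns of the block: prints "8 87"
print(read_column(header, 0), read_column(header, 79))
```

With `first=8` and `last=87` the four printed rows have the same digits as the header in the output above.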

The input file can be provided via stdin:

% cat testfiles/aln.fasta | av

Exercising some of the options (show sequence numbers and a consensus, show differences with the first sequence, and restrict to columns 200-280):

% av testfiles/aln.fasta --number-sequences --consensus --range 200,280 --compare-to 59735
                # 000000000000000000000000000000000000000000000000000000000000000000000000000000000
                # 222222222222222222222222222222222222222222222222222222222222222222222222222222222
                # 000000000011111111112222222222333333333344444444445555555555666666666677777777778
                # 012345678901234567890123456789012345678901234567890123456789012345678901234567890
 1 ==REF==> 59735 TGGGGtG-TTGGTgGAAAGCgttatgga------------GTGGTTTTAGATGGGCTCACGGCCTATCAGCTTGTTGGTGA
 2          70875 gGGt----------GAAAGtGggggaccgcaaggcctc--acGcagcagGAgcGGCcgAtGtCtgATtAGCTaGTTGGTGg
 3          58095 gGcc----------cAAAGCcgaAaG--------------GcGccTTTgGAgcGGCctgCGtCCgATtAGgTaGTTGGTGg
 4          70854 gGGa----------GAAAGCaggggaccttcgggcctt--GcGcTaTcAGATGaGCctAgGtCggATtAGCTaGTTGGTGg
 5          62024 TGGGa-c-ggGGTtaAAAGCtccg----------------GcGGTgaagGATGaGCcCgCGGCCTATCAGCTTGTTGGTGg
 6          59895 TcttcaG-caGcTGGAAAGaaTT-----------------tcGGTcaggGATGaGCTCgCGGCCTATCAGCTTGTTGGTGA
 7          57728 TcGaG-GaTaGaT-GAAAGgtggcctctacatgtaagctatcacTgaagGAgGGGaTtgCGtCtgATtAGCTaGTTGGaGg
 8          10734 TGGGG-t-TgttgGGAAAGgtTTtTt--------------cTGGaTTggGATGGGCTCgCGGCtTATCAGCTTGTTGGTGg
 9          71041 gaGa----------GAAAGgGggcTtttagctc-------tcGcTaaTAGATGaGCctAaGtCggATtAGCTaGTTGGTGg
10          6161O gGGG----------GAAAGatTTA----------------tcGccaTTgGAgcGGCcCgCGtCtgATtAGCTaGTTGGTGg
11      CONSENSUS xGGx------x-x-GAAAGxxxxxxx--------------xcGcTxxxgGATGGGCcxgCGtCxgATtAGCTaGTTGGTGg

The above alignment rendered as colored HTML (thanks @timholl):

% av testfiles/aln.fasta --number-sequences --consensus --range 200,280 --compare-to 59735 -q --html aln.html

Write a single-page PDF file:

% av testfiles/aln.fasta --pdf test.pdf --quiet --blocks-per-page=5

And do you know about seqmagick? If not, run, don’t walk to https://github.com/fhcrc/seqmagick and check it out, so that you can do this:

% seqmagick convert testfiles/ae_like.sto --output-format=fasta - | av -cx
               # 000000000000000000000000000000000
               # 445555555555566666666666666667777
               # 990111111155813445566778888991122
               # 791123678914209568907050235891215
  GA05AQR01D2ULR ...............TTGGT.GT..AG...A..
  GA05AQR01DFGSE ........................T.TAAGT..
  GA05AQR01CI0QB ...........A.....................
  GA05AQR01DW22X .TC..G.T.T.......................
  GA05AQR01A5WF4 ....................A........-T..
  GA05AQR01BUV2U ---..............................
  GA05AQR01B1R8I .............T...............CT..
  GA05AQR02JASPX ........A........................
  GCX02B001AYSTJ .............................-TA.
  GCX02B001DP9EQ ............A..........CA.......T
  GCX02B001AFAY1 ..............G..................
  GCX02B002J489C ...-......A......................
  GLKT0ZE01EDLCP AT...ATT.T.......................
  GLKT0ZE02I8LRD ---GA............................
-ref-> CONSENSUS TCTAGCGCGCGGGGACGAACGAGGCGCGCTGGA
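
The output above appears to show only the columns that vary, with `.` marking positions that match the consensus. The following is a minimal Python sketch of that idea, assuming a simple majority-vote consensus. It is not alnvu's implementation; the function names `consensus` and `condense` are hypothetical, and column indices are 0-based here.

```python
from collections import Counter


def consensus(seqs):
    """Most common character in each column of equal-length sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def condense(seqs):
    """Keep variable columns only and mark consensus matches with '.'."""
    cons = consensus(seqs)
    columns = list(zip(*seqs))
    # A column is variable if it holds more than one distinct character.
    keep = [i for i, col in enumerate(columns) if len(set(col)) > 1]
    rows = ["".join("." if s[i] == cons[i] else s[i] for i in keep)
            for s in seqs]
    return keep, rows, "".join(cons[i] for i in keep)


seqs = ["ACGTA", "ACCTA", "AC-TT", "ACGTA"]
keep, rows, cons = condense(seqs)
print(keep)  # [2, 4]
print(rows)  # ['..', 'C.', '-T', '..']
print(cons)  # GA
```

Only columns 2 and 4 differ between the four toy sequences, so the condensed view is two characters wide.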

Release History

  • 0.3.2 (this version)
  • 0.3.1
  • 0.2
  • 0.1.0

Download Files

  • alnvu-0.3.2.tar.gz (15.7 kB), source distribution, version 0.3.2, uploaded Dec 7, 2017