
alnvu makes a multiple alignment of biological sequences more easily readable by condensing it and highlighting variability.

dependencies

Required:

  • Python 2.7

Optional:

  • biopython (needed for --sort-by-tree)
  • reportlab (needed for PDF output)

installation

Using setup.py:

cd alnvu
python setup.py install

examples

All of these examples can be run from within the package directory:

% cd alnvu
% ./av --help

usage: av [-h] [-v] [-q] [-w NUMBER] [-L NUMBER] [-x] [-g] [-r INTERVAL]
          [-s NUMBER] [-c] [-d NUMBER] [-D] [-C CASE] [-G] [-i] [-n NUMBER]
          [-N CHARACTER] [-S FILE] [-T FILE] [-o OUTFILE] [-F NUMBER]
          [-O ORIENTATION] [-b NUMBER]
          [infile]

Create formatted sequence alignments with optional pdf output.

positional arguments:
  infile                Input file in fasta format (reads stdin if missing)

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -q, --quiet           Suppress output of alignment to screen.

Layout:
  -w NUMBER, --width NUMBER
                        Width of sequence to display in each block in
                        characters [115]
  -L NUMBER, --lines-per-block NUMBER
                        Sequences (lines) per block. [75]

Column selection:
  -x, --exclude-invariant
                        Show only columns with at least N non-consensus bases
                        (set N using the '-s/--min-subs')
  -g, --include-gapcols
                        Show columns containing only gap characters.
  -r INTERVAL, --range INTERVAL
                        Range of columns to display (eg '-r start,stop')
  -s NUMBER, --min-subs NUMBER
                        Minimum NUMBER of substitutions required to define a
                        position as variable. [1]

Consensus display and sequence appearance:
  -c, --consensus       Show the consensus sequence [False]
  -d NUMBER, --compare-to NUMBER
                        Identify the reference sequence. Nucleotide positions
                        identical to the reference will be shown as a '.' The
                        default behavior is to use the consensus sequence as a
                        reference. Use the -i option to display the sequence
                        numbers for reference.
  -D, --no-comparison   Show all bases (ie, suppress comparison with the
                        reference sequence).
  -C CASE, --case CASE  Convert all characters to a uniform case
                        ('upper','lower')
  -G, --ignore-gaps     Ignore gaps in the calculation of a consensus.

Sequence annotation:
  -i, --number-sequences
                        Show sequence number to left of name.
  -n NUMBER, --name-max NUMBER
                        Maximum width of sequence name in characters [35]
  -N CHARACTER, --name-split CHARACTER
                        Specify a character delimiting sequence names. By
                        default, the name of each sequence is the first
                        whitespace-delimited word. '--name-split=none' causes
                        the entire line after the '>' to be displayed.
  -S FILE, --sort-by-name FILE
                        File containing sequence names defining the sort-order
                        of the sequences in the alignment.
  -T FILE, --sort-by-tree FILE
                        File containing a newick-format tree defining the
                        sort-order of the sequences in the alignment (requires
                        biopython).

PDF output:
  These options require reportlab.

  -o OUTFILE, --outfile OUTFILE
                        Write output to a pdf file.
  -F NUMBER, --fontsize NUMBER
                        Font size for pdf output [7]
  -O ORIENTATION, --orientation ORIENTATION
                        Set page orientation; choose from portrait, landscape
                        [portrait]
  -b NUMBER, --blocks-per-page NUMBER
                        Number of aligned blocks of sequence per page [1]
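
The naming rule described under --name-split (by default, the name is the first whitespace-delimited word of the FASTA header) can be sketched as follows. This is only an illustration, not alnvu's code: the function sequence_name and its split_on parameter are hypothetical, and treating a custom delimiter as "keep the text before its first occurrence" is an assumption.

```python
def sequence_name(header, split_on=None):
    """Return a display name for a FASTA header line.

    split_on=None   -> first whitespace-delimited word (the documented default)
    split_on='none' -> the entire line after '>'
    otherwise       -> text before the first occurrence of split_on (assumed)
    """
    text = header.lstrip('>').strip()
    if split_on is None:
        return text.split()[0] if text else ''
    if split_on == 'none':
        return text
    return text.split(split_on)[0]


# The header below is made up for illustration.
print(sequence_name('>H59735 patient 1'))           # first word only
print(sequence_name('>H59735 patient 1', 'none'))   # whole header line
```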

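The default output. Column numbers are printed vertically above the alignment, one digit per row (here the first column shown is 8 and the last is 122), and positions identical to the consensus are shown as '.':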

% ./av testfiles/10patients_aln.fasta | head -n 15
         # 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
         # 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011111111111111111111111
         # 0011111111112222222222333333333344444444445555555555666666666677777777778888888888999999999900000000001111111111222
         # 8901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012
    H59735 AGAGTTTGATCCTGGCTCAGGACGAACGC.......GT.......................A.G..GCGGT....GCACCGTGGATT..........................T.
    T70875 ...........................---------------------------------------------------.----..--......--------------......T.
    F58095 AGAGTTTGATCCTGGCTCAGAGCGAACGC.......AT...................C....GTGGTTTCG..CATC-.----..--.............G.............G
    T70854 ...........................--.......AG..C.................G...ATG.CGGG.....GCTCCTTGATTC........C....G............TG
    F62024 AGAGTTTGATCCTGGCTCAGGACGAACGC.......GT.......................A.G..GCCTTT.GGGGTGGATT..--............................
    H59895 ...........................------------------------............G..AGAG.....AGCTCTCTGGATC...........................
    F57728 ...........................--------------------------------TT-----------------.----..--...........................G
    M10734 ...........................GC..A....GT........................GATCCATT...GCTTTTGTGTTTTTGGTGAG......................
    T71041 ..........................CGC.......AG.......................A.G..GTCT.....GCTAGACGGATT..........................TG
    M6161O ...........................--......T-G..C.....................ATCCTTCGG.A..---.----..--.............G..............
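
To read a column number, take the digits above that column from top to bottom: 0, 0, 0, 8 is column 8, and 0, 1, 2, 2 is column 122. A minimal sketch of how such a header could be produced is given here; the function name vertical_header and the fixed four-digit width are assumptions for illustration, not taken from alnvu's source.

```python
def vertical_header(columns, digits=4):
    """Return `digits` strings; reading position i top to bottom spells columns[i]."""
    # Zero-pad every column number to the same width, e.g. 8 -> '0008'.
    labels = [str(c).zfill(digits) for c in columns]
    # Row 0 holds the most significant digit of every label, row 1 the next, etc.
    return [''.join(label[row] for label in labels) for row in range(digits)]


# Columns 8 through 122, the range shown in the default output above.
for line in vertical_header(range(8, 123)):
    print('# ' + line)
```

The four digit rows printed by this sketch match the four '#' rows in the output above.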

The input file can be provided via stdin:

% cat testfiles/10patients_aln.fasta | ./av

Exercising some of the options: show sequence numbers and a consensus, show differences from sequence number 1, and restrict the display to columns 200-300:

% ./av testfiles/10patients_aln.fasta --number-sequences --consensus --compare-to 1 --range 200,300
               # 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
               # 22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222223
               # 00000000001111111111222222222233333333334444444444555555555566666666667777777777888888888899999999990
               # 01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
 1 -ref-> H59735 TGGGGtG-TTGGTgGAAAGCgttatgga------------GTGGTTTTAGATGGGCTCACGGCCTATCAGCTTGTTGGTGAGGTAATGGCTTACCAAGGCG
 2        T70875 G..T---.------.....T.GGGGACCGCAAGGCCTC..AC.CAGCAG..GC...CG.T.T.TG..T....A.......G.....A...CC.........
 3        F58095 G.CC---.------C.....CGA.A.--.............C.CC...G..GC...CTG..T..G..T..G.A.......G.....A...C.......C.T
 4        T70854 G..A---.------......AGGGGACCTTCGGGCCTT...C.C.A.C.....A..CT.G.T.GG..T....A.......G..........C.........
 5        F62024 ....A-C.GG...TA.....TCCG----.............C...GAAG....A..C.G.....................G.........C..........
 6        H59895 .CTTCA..CA.C.......AA..-----............TC...CAGG....A....G................................C.........
 7        F57728 .C.A.-.A.A.A.-.....GTGGCCTCTACATGTAAGCTATCAC.GAAG..G...A.TG..T.TG..T....A.....A.G.....C...CC.........
 8        M10734 .....-T..GTTG......GT..T.T--............C...A..GG.........G....T................G...G...............T
 9        T71041 GA.A---.------.....G.GGC.TTTAGCTC.......TC.C.AA......A..CT.A.T.GG..T....A.......G.....A...C..........
10        M6161O G...---.------.....AT...----............TC.CCA..G..GC...C.G..T.TG..T....A.......G.....A....C.........
11     CONSENSUS X..X.X.A.X.X.......XXXXXXXCXXXXXGXXXXXTAXC.C.XXXG.......CXG..T.XG..T....A.......G.....X...XX.........
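
The comparison display described in the help for --compare-to (positions identical to the reference are shown as '.') can be sketched in a few lines. This is a simplified illustration under stated assumptions, not alnvu's implementation: the function name mask_identical is hypothetical, the comparison is an exact character match, and the toy sequences are made up.

```python
def mask_identical(seq, ref, same='.'):
    """Replace each character of seq that equals the reference with `same`."""
    # Aligned sequences have equal length, so zip pairs them column by column.
    return ''.join(same if s == r else s for s, r in zip(seq, ref))


# Toy aligned sequences (made up for illustration, not from the test file).
ref = 'ACGT-A'
seq = 'ACTT-G'
print(ref)                       # the reference is shown in full
print(mask_identical(seq, ref))  # ..T..G
```

Note that a gap matching a gap in the reference is also masked in this sketch, since it compares characters without treating '-' specially.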

Write a single-page PDF file (five blocks per page, with screen output suppressed):

% ./av testfiles/10patients_aln.fasta --outfile=test.pdf --quiet --blocks-per-page=5

The same command using short options:

% ./av testfiles/10patients_aln.fasta -o test.pdf -q -b 5

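And do you know about seqmagick? If not, run, don’t walk, to https://github.com/fhcrc/seqmagick and check it out, so that you can convert a Stockholm-format alignment to FASTA and pipe it straight into av (here showing the consensus and only the variable columns):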

% seqmagick convert testfiles/ae_like.sto --output-format=fasta - | ./av -cx
               # 000000000000000000000000000000000
               # 445555555555566666666666666667777
               # 990111111155813445566778888991122
               # 791123678914209568907050235891215
  GA05AQR01D2ULR ...............TTGGT.GT..AG...A..
  GA05AQR01DFGSE ........................T.TAAGT..
  GA05AQR01CI0QB ...........A.....................
  GA05AQR01DW22X .TC..G.T.T.......................
  GA05AQR01A5WF4 ....................A........-T..
  GA05AQR01BUV2U ---..............................
  GA05AQR01B1R8I .............T...............CT..
  GA05AQR02JASPX ........A........................
  GCX02B001AYSTJ .............................-TA.
  GCX02B001DP9EQ ............A..........CA.......T
  GCX02B001AFAY1 ..............G..................
  GCX02B002J489C ...-......A......................
  GLKT0ZE01EDLCP AT...ATT.T.......................
  GLKT0ZE02I8LRD ---GA............................
-ref-> CONSENSUS TCTAGCGCGCGGGGACGAACGAGGCGCGCTGGA
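
The -x and --min-subs options described in the help text keep only columns in which at least N sequences differ from the consensus. A rough sketch of that selection follows, assuming a simple most-common-character consensus. alnvu's actual consensus rules, including how it treats ties and gaps, are not reproduced here, and consensus and variable_columns are hypothetical names.

```python
from collections import Counter


def consensus(seqs):
    """Most common character in each column (assumed rule; ties are not handled)."""
    return ''.join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def variable_columns(seqs, min_subs=1):
    """Return 0-based indices of columns with at least min_subs non-consensus characters."""
    cons = consensus(seqs)
    keep = []
    for i, col in enumerate(zip(*seqs)):
        subs = sum(1 for c in col if c != cons[i])
        if subs >= min_subs:
            keep.append(i)
    return keep


# Toy alignment (made up for illustration).
seqs = ['ACGT', 'ACGA', 'ATGT']
print(consensus(seqs))             # ACGT
print(variable_columns(seqs))      # [1, 3]
print(variable_columns(seqs, 2))   # []
```

The indices here are 0-based, whereas the column numbers in the av output above appear to start at 1.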

Download Files

File name                      Size     Version  File type  Upload date
alnvu-0.1.0-py2-none-any.whl   14.0 kB  py2      Wheel      Jul 9, 2014
alnvu-0.1.0.tar.gz             13.8 kB           Source     Jul 9, 2014
