bio: making bioinformatics fun again
The software is currently under development. It is operational but not fully vetted.
bio - command-line utilities to make bioinformatics explorations more enjoyable.
Why do we need this software?
If you've ever done bioinformatics you know how even seemingly straightforward tasks require multiple steps, arcane incantations, reading documentation and numerous other preparations that slow down your progress.
Time and again I found myself not pursuing an idea because getting to the fun part was too tedious. The bio package is meant to solve that tedium. With bio you can write things like this:
# Fetch the data from NCBI.
bio NC_045512 --fetch --rename ncov
bio MN996532 --fetch --rename ratg13

# Align the DNA for the S protein.
bio ncov:S ratg13:S --end 90 --align
to align the first 90 base pairs of the DNA sequence of the S protein from the SARS-CoV-2 novel coronavirus to its closest (known) relative, the bat coronavirus RaTG13. The command above will print:
### 1: YP_009724390 vs QHR63300.2 ###

Length: 90 (semiglobal)
Query: 90 [1, 90]
Target: 90 [1, 90]
Score: 387
Ident: 83/90 (92.2%)
Simil: 83/90 (92.2%)
Gaps: 0/90 (0.0%)
Matrix: nuc44(-11, -1)

YP_009724390 ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAGTGTGTTAATCTTACAACCAGAACTCAATTACCCCCTGCATACACTAAT
           1 ||||||||||||||||||||||||||||||||.||||||||||||||||||||.|||||.||||||||.|||||.|||||||||||.||. 90
QHR63300.2   ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTTTCTAGTCAGTGTGTTAATCTAACAACTAGAACTCAGTTACCTCCTGCATACACCAAC
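The Ident line in the report can be checked by hand. The following is a minimal Python sketch, not bio's own implementation: it takes the two aligned sequences exactly as printed above, assumes a gap-free alignment (true here, since Gaps is 0/90), and counts matching positions. The function names identity and match_line are illustrative only.

```python
# Sequences copied from the alignment report (first 90 bases of the S gene).
ncov = (
    "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAGTGTGTTAA"
    "TCTTACAACCAGAACTCAATTACCCCCTGCATACACTAAT"
)
ratg13 = (
    "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTTTCTAGTCAGTGTGTTAA"
    "TCTAACAACTAGAACTCAGTTACCTCCTGCATACACCAAC"
)


def identity(a, b):
    """Return (matches, length) for two gap-free sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, len(a)


def match_line(a, b):
    """Build the middle row of the report: '|' for a match, '.' for a mismatch."""
    return "".join("|" if x == y else "." for x, y in zip(a, b))


matches, length = identity(ncov, ratg13)
print(match_line(ncov, ratg13))
print(f"Ident: {matches}/{length} ({100 * matches / length:.1f}%)")
```

The script prints the same match row and the same 83/90 (92.2%) figure as the report.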
If you wanted to align the same sequences as translated proteins, bio lets you write:
bio ncov:S ratg13:S --end 90 --translate --align
### 1: YP_009724390 vs QHR63300.2 ###

Length: 30 (semiglobal)
Query: 30 [1, 30]
Target: 30 [1, 30]
Score: 153
Ident: 30/30 (100.0%)
Simil: 30/30 (100.0%)
Gaps: 0/30 (0.0%)
Matrix: blosum62(-11, -1)

YP_009724390 MFVFLVLLPLVSSQCVNLTTRTQLPPAYTN
           1 |||||||||||||||||||||||||||||| 30
QHR63300.2   MFVFLVLLPLVSSQCVNLTTRTQLPPAYTN
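The DNA alignment shows mismatches while the protein alignment is identical, which happens when every nucleotide change is synonymous. The sketch below illustrates this in plain Python under stated assumptions: it uses the standard genetic code (NCBI translation table 1), takes the two DNA sequences from the nucleotide alignment shown earlier, and is not how bio performs translation internally. The helper names codons and translate are illustrative.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First 90 bases of the S gene, copied from the DNA alignment report.
ncov = (
    "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAGTGTGTTAA"
    "TCTTACAACCAGAACTCAATTACCCCCTGCATACACTAAT"
)
ratg13 = (
    "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTTTCTAGTCAGTGTGTTAA"
    "TCTAACAACTAGAACTCAGTTACCTCCTGCATACACCAAC"
)


def codons(dna):
    """Split a DNA string into complete codons, dropping any trailing bases."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Translate DNA to a protein string using the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


# Codon pairs that differ between the two sequences.
changed = [(a, b) for a, b in zip(codons(ncov), codons(ratg13)) if a != b]
all_synonymous = all(CODON_TABLE[a] == CODON_TABLE[b] for a, b in changed)

print(translate(ncov))
print(translate(ratg13))
print(f"{len(changed)} codons differ; all synonymous: {all_synonymous}")
```

Run as written, this reports seven differing codons, each encoding the same amino acid in both sequences, which is consistent with 83/90 identity at the DNA level and 30/30 at the protein level.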
Beyond alignments there is a lot more to bio, and we recommend looking at the documentation.
Who is bio designed for?
The software was written to teach bioinformatics and is the companion software to the Biostar Handbook textbook. The targeted audience comprises:
- Students learning about bioinformatics.
- Bioinformatics educators who need a platform to demonstrate bioinformatics concepts.
- Scientists working with large numbers of similar genomes (bacterial/viral strains).
- Scientists who need to closely investigate and understand particular details of a genomic region.
The ideas and motivations fueling the creation of bio came to us while educating the many cohorts of students that used the handbook in the classroom.
You see, in bioinformatics, many tasks that should be straightforward are, instead, needlessly complicated. bio is an opinionated take on how bioinformatics, particularly data presentation and access, should be simplified.
The documentation is maintained at
bio works on Linux and Mac computers, and on Windows when using the Windows Subsystem for Linux. Install the package with:
# We recommend installing prerequisites with conda.
conda install -c bioconda biopython parasail-python

# Install the bio package.
pip install bio --upgrade
See more details in the documentation.
If you clone the repository, we recommend installing it as a development package with:
python setup.py develop
Testing uses the pytest framework:
pip install pytest
To run all tests use:
Tests are automatically built from a test script that mimics real-life usage scenarios.
To add a new test first run the command you wish to test, for example:
bio foo --gff > output.gff
Store the resulting output file in the test/data directory. After that, add the same command to the master script:
The latter command will automatically generate a Python test for each line in the master script.
The automatically generated test will verify that the command is operational and that the output matches the expectations.
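As a rough illustration of that idea, a test of this kind might run a command and compare its standard output with a stored file. The sketch below is an assumption about the general shape, not the code that bio generates: the helper names run and check_output are hypothetical, and a stand-in Python command replaces a real bio invocation so that the example runs without bio installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run(args):
    """Run a command (list of arguments) and return its standard output.

    check=True raises an error on a non-zero exit status, which covers the
    "command is operational" part of the check.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


def check_output(args, expected_file):
    """Fail if the command's output differs from the stored expectation."""
    expected = Path(expected_file).read_text()
    observed = run(args)
    assert observed == expected, f"output differs from {expected_file}"


def test_stand_in_command():
    # Stand-in for a real command line; a generated test would call bio here
    # and read the expected output from the test data directory.
    args = [sys.executable, "-c", "print('hello')"]
    with tempfile.TemporaryDirectory() as tmp:
        expected_file = Path(tmp) / "output.txt"
        expected_file.write_text("hello\n")
        check_output(args, expected_file)


if __name__ == "__main__":
    test_stand_in_command()
    print("ok")
```

Because the function name starts with test_, pytest would collect it automatically when the file is placed in a test directory.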