This repo is no longer active.
pyblast
This is a wrapper that lets other applications run BLAST searches.
Installation
You can install BLAST to the pyblast directory using the following command:
pyblast install
This installs BLAST to pyblast/blast_bin. If you want BLAST installed somewhere else, move the ncbi-blast-X.X.X+ folder to your desired location and add path/to/ncbi-blast-X.X.X+/bin to your $PATH. PyBlast prefers the BLAST executable found on your executable path. If it cannot find one there, it looks in the paths listed in the pyblast/blast_bin/_paths.txt file. _paths.txt is updated automatically when you run install_blast.py, so there's no need to manage the paths manually.
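The lookup order described above can be sketched roughly as follows. This is an illustration only, not PyBlast's actual code: the function name find_blast_executable is hypothetical, and the sketch assumes _paths.txt holds one directory per line, which is not stated here.

```python
import os
import shutil


def find_blast_executable(name="blastn", paths_file="pyblast/blast_bin/_paths.txt"):
    """Return the path to a BLAST executable, or None if none is found.

    Hypothetical helper: checks $PATH first, then the directories listed
    in paths_file (assumed format: one directory per line).
    """
    # 1. Prefer an executable that is already on the user's $PATH.
    found = shutil.which(name)
    if found:
        return found

    # 2. Fall back to the directories recorded in the paths file.
    if not os.path.isfile(paths_file):
        return None
    with open(paths_file) as handle:
        directories = [line.strip() for line in handle if line.strip()]
    for directory in directories:
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None
```

Calling find_blast_executable() with the defaults returns the first blastn it can locate, or None when BLAST is neither on $PATH nor listed in the paths file.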
After installing BLAST and verifying that the blastn command works from the command line, install the package with:
pip install pyblastbio
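One way to check that blastn works before installing the package is to ask it for its version. The sketch below is a suggestion and is not part of pyblast: the helper name blastn_version is hypothetical, and it relies only on the -version flag of the BLAST+ command-line tools.

```python
import shutil
import subprocess


def blastn_version(executable="blastn"):
    """Return the first line printed by `<executable> -version`, or None."""
    # Bail out early if the executable is not on $PATH.
    if shutil.which(executable) is None:
        return None
    try:
        completed = subprocess.run(
            [executable, "-version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a failed launch, a non-zero exit code, and a timeout.
        return None
    lines = completed.stdout.splitlines()
    return lines[0] if lines else None
```

If blastn_version() returns None, BLAST is either not installed or not on your $PATH.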
Status
In alpha, but let me know if you are interested in this repo.
Usage
This package is a Python wrapper for the BLAST command line, intended to be run alongside a microservice (e.g. Flask). It also includes a basic Python-based installation script, which is used in unit testing.
Input options
To use blast...
import json
from pyblast import JSONBlast
# load your DNA from wherever you keep it
query = json.load(...)
subjects = json.load(...)
# BLAST cmd object
myblast = JSONBlast(subjects, query)
# build the db, run the search, and parse the results
config = {...}
myblast.quick_blastn(**config)
# get the results object
results = myblast.results
Input Format
Input sequences have the following format:
{
"name": "attB-mCh-Dnmt3b1-Poly(A",
"sequence": "gatGCCAGCTCATTCCTCCCACTCATGATCTATAGATCCCCCGGGCTGCAGGAATTCTACCGGGTAGGGGAGGCGCTTTTCCCAAGGCAGTCTGGAGCATGCGCTTTAGCAGCCCCGCTGGGCACTTGGCGCTACACAAGTGGCCTCTGGCCTCGCACACATTCCACATCCACCGGTAGGCGCCAACCGGCTCCGTTCTTTGGTGGCCCCTTCGCGCCACCTTCTACTCCTCCCCTAGTCAGGAAGTTCCCCCCCGCCCCGCAGCTCGCGTCGTGCAGGACGTGACAAATGGAAGTAGCACGTCTCACTAGTCTCGTGCAGATGGACAGCACCGCTGAGCAATGGAAGCGGGTAGGCCTTTGGGGCAGCGGCCAATAGCAGCTTTGCTCCTTCGCTTTCTGGGCTCAGAGGCTGGGAAGGGGTGGGTCCGGGGGCGGGCTCAGGGGCGGGCTCAGGGGCGGGGCGGGCGCCCGAAGGTCCTCCGGAGGCCCGGCATTCTGCACGCTTCAAAAGCGCACGTCTGCCGCGCTGTTCTCCTCTTCCTCATCTCCGGGCCTTTCGACCGATCATCAAGCTTAATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAAGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGAGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCTAGGGGGAGGCTAACTGAAACACGGAAGGAGACAATACCGGAAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATAAAACGCACGGTGTTGGGTCGTTTGTTCATAAACGCGGGGTTCGGTCCCAGGGCTGGCACTCTGTCGATACCCCACCGAGACCCCATTGGGGCCAATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAGTTCGGGTGAAGGCCCAGGGCTCGCAGCCAACGTCGGGGCGGCAAGCTTACATCGAGatcCCGGCTTGTCGACGACGGCGgtCTCCGTCGTCAGGATCATCCGTGAGCAAGGGCGAGGAGGATAACATGGCCATCATCAAGGAGTTCATGCGCTTCAAGGTGCACATGGAGGGCTCCGTGAACGGCCACGAGTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTA
CGAGGGCACCCAGACCGCCAAGCTGAAGGTGACCAAGGGTGGCCCCCTGCCCTTCGCCTGGGACATCCTGTCCCCTCAGTTCATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCGACTACTTGAAGCTGTCCTTCCCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGTGACCGTGACCCAGGACTCCTCCCTGCAGGACGGCGAGTTCATCTACAAGGTGAAGCTGCGCGGCACCAACTTCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACCATGGGCTGGGAGGCCTCCTCCGAGCGGATGTACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAGCTGAAGGACGGCGGCCACTACGACGCTGAGGTCAAGACCACCTACAAGGCCAAGAAGCCCGTGCAGCTGCCCGGCGCCTACAACGTCAACATCAAGTTGGACATCACCTCCCACAACGAGGACTACACCATCGTGGAACAGTACGAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTACAAGAGCGGCCTGAGGAGCAGAGCCCAGGCGAGCAACAGCGCCGTGGACGCCACCATGGGCGatCGCATGAAGGGAGACAGCAGACATCTGAATGAAGAAGAGGGTGCCAGCGGGTATGAGGAGTGCATTATCGTTAATGGGAACTTCAGTGACCAGTCCTCAGACACGAAGGATGCTCCCTCACCCCCAGTCTTGGAGGCAATCTGCACAGAGCCAGTCTGCACACCAGAGACCAGAGGCCGCAGGTCAAGCTCCCGGCTGTCTAAGAGGGAGGTCTCCAGCCTTCTGAATTACACGCAGGACATGACAGGAGATGGAGACAGAGATGATGAAGTAGATGATGGGAATGGCTCTGATATTCTAATGCCAAAGCTCACCCGTGAGACCAAGGACACCAGGACGCGCTCTGAAAGCCCGGCTGTCCGAACCCGACATAGCAATGGGACCTCCAGCTTGGAGAGGCAAAGAGCCTCCCCCAGAATCACCCGAGGTCGGCAGGGCCGCCACCATGTGCAGGAGTACCCTGTGGAGTTTCCGGCTACCAGGTCTCGGAGACGTCGAGCATCGTCTTCAGCAAGCACGCCATGGTCATCCCCTGCCAGCGTCGACTTCATGGAAGAAGTGACACCTAAGAGCGTCAGTACCCCATCAGTTGACTTGAGCCAGGATGGAGATCAGGAGGGTATGGATACCACACAGGTGGATGCAGAGAGCAGAGATGGAGACAGCACAGAGTATCAGGATGATAAAGAGTTTGGAATAGGTGACCTCGTGTGGGGAAAGATCAAGGGCTTCTCCTGGTGGCCTGCCATGGTGGTGTCCTGGAAAGCCACCTCCAAGCGACAGGCCATGCCCGGAATGCGCTGGGTACAGTGGTTTGGTGATGGCAAGTTTTCTGAGATCTCTGCTGACAAACTGGTGGCTCTGGGGCTGTTCAGCCAGCACTTTAATCTGGCTACCTTCAATAAGCTGGTTTCTTATAGGAAGGCCATGTACCACACTCTGGAGAAAGCCAGGGTTCGAGCTGGCAAGACCTTCTCCAGCAGTCCTGGAGAGTCACTGGAGGACCAGCTGAAGCCCATGCTGGAGTGGGCCCACGGTGGCTTCAAGCCTACTGGGATCGAGGGCCTCAAACCCAACAAGAAGCAACCAGTGGTTAATAAGTCGAAGGTGCGTCGTTCAGACAGTAGGAACTTAGAACCCAGGAGACGCGAGAACAAAAGTCGAAGACGCACAACCAATGACTCTGCTGCTTCTGAGTCCCCCCCACCCAAGCGCCTCAAGACAAATAGCTATGGCGGGAAGGACCGAGGGGAGGATGAGGAGAGCCGAGAACGGATGGCTTCTGAAGTCACCAACAACAAGGGCAATCTGGAAGACCGCTGTTTGTCCTGTGGAAAGAAGAACCCTGTGTCCTTCCACC
CCCTCTTTGAGGGTGGGCTCTGTCAGAGTTGCCGGGATCGCTTCCTAGAGCTCTTCTACATGTATGATGAGGACGGCTATCAGTCCTACTGCACCGTGTGCTGTGAGGGCCGTGAACTGCTGCTGTGCAGTAACACAAGCTGCTGCAGATGCTTCTGTGTGGAGTGTCTGGAGGTGCTGGTGGGCGCAGGCACAGCTGAGGATGCCAAGCTGCAGGAACCCTGGAGCTGCTATATGTGCCTCCCTCAGCGCTGCCATGGGGTCCTCCGACGCAGGAAAGATTGGAACATGCGCCTGCAAGACTTCTTCACTACTGATCCTGACCTGGAAGAATTTGAGCCACCCAAGTTGTACCCAGCAATTCCTGCAGCCAAAAGGAGGCCCATTAGAGTCCTGTCTCTGTTTGATGGAATTGCAACGGGGTACTTGGTGCTCAAGGAGTTGGGTATTAAAGTGGAAAAGTACATTGCCTCCGAAGTCTGTGCAGAGTCCATCGCTGTGGGAACTGTTAAGCATGAAGGCCAGATCAAATATGTCAATGACGTCCGGAAAATCACCAAGAAAAATATTGAAGAGTGGGGCCCGTTCGACTTGGTGATTGGTGGAAGCCCATGCAATGATCTCTCTAACGTCAATCCTGCCCGCAAAGGTTTATATGAGGGCACAGGAAGGCTCTTCTTCGAGTTTTACCACTTGCTGAATTATACCCGCCCCAAGGAGGGCGACAACCGTCCATTCTTCTGGATGTTCGAGAATGTTGTGGCCATGAAAGTGAATGACAAGAAAGACATCTCAAGATTCCTGGCATGTAACCCAGTGATGATCGATGCCATCAAGGTGTCTGCTGCTCACAGGGCCCGGTACTTCTGGGGTAACCTACCCGGAATGAACAGGCCCGTGATGGCTTCAAAGAATGATAAGCTCGAGCTGCAGGACTGCCTGGAGTTCAGTAGGACAGCAAAGTTAAAGAAAGTGCAGACAATAACCACCAAGTCGAACTCCATCAGACAGGGCAAAAACCAGCTTTTCCCTGTAGTCATGAATGGCAAGGACGACGTTTTGTGGTGCACTGAGCTCGAAAGGATCTTCGGCTTCCCTGCTCACTACACGGACGTGTCCAACATGGGCCGCGGCGCCCGTCAGAAGCTGCTGGGCAGGTCCTGGAGTGTACCGGTCATCAGACACCTGTTTGCCCCCTTGAAGGACTACTTTGCCTGTGAATAGGCggccGCAGTTAACGAATTCtctagaggatccagacatgataagatacattgatgagtttggacaaaccacaactagaatgcagtgaaaaaaatgctttatttgtgaaatttgtgatgctattgctttatttgtaaccattataagctgcaataaacaagttaacaacaacaattgcattcattttatgtttcaggttcagggggaggtgtgggaggttttttaaagcaagtaaaacctctacaaatgtggtatggctgattatgatcctgcaagcctcgtcgccgcggtttATTCTGTTGACAATTAATCATCGGCATAGTATATCGGCATAGTATAATACGACAAGGTGAGGAACTAAACCATGGGATCGGCCATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACGATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCA
CCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCCCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAATAAAGACCGACCAAGCGACGTCTGAGAGCTCCCTGGCGAATTCGGTACCAATAAAAGAGCTTTATTTTCATGATCTGTGTGTTGGTTTTTGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTAGTGTGCG",
"circular": true,
"description": null,
"features": [
{
"name": "END",
"type": "misc_feature",
"id": null,
"start": 5194,
"end": 5209,
"strand": 1
},
...
]
}
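Before handing records to the wrapper, it can help to check that they match the shape shown above. The following is a minimal sketch, not something pyblast provides: check_record is a hypothetical helper, and the set of required keys is an assumption inferred from the example record.

```python
# Assumed from the example record above; pyblast may require more or fewer keys.
REQUIRED_KEYS = ("name", "sequence", "circular")


def check_record(record):
    """Return a list of problems found in a sequence record (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in record:
            problems.append(f"missing key: {key}")
    sequence = record.get("sequence")
    if sequence is not None and not isinstance(sequence, str):
        problems.append("sequence must be a string")
    if "circular" in record and not isinstance(record["circular"], bool):
        problems.append("circular must be true or false")
    # "features" is optional here; each feature needs integer coordinates.
    for index, feature in enumerate(record.get("features") or []):
        for key in ("start", "end"):
            if not isinstance(feature.get(key), int):
                problems.append(f"feature {index}: {key} must be an integer")
    return problems


# A short made-up record in the same shape as the example above.
record = {
    "name": "example",
    "sequence": "ATGCATGC",
    "circular": True,
    "description": None,
    "features": [
        {"name": "END", "type": "misc_feature", "id": None,
         "start": 3, "end": 6, "strand": 1},
    ],
}
print(check_record(record))  # → []
```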
Output Format
A BLAST search returns an AlignmentResults object, which holds a list of alignments in .alignments.
Alignments have the following format:
[
{
"meta": {
"gaps_open": 0,
"bit_score": 7792,
"gaps": 0,
"identical": 4219,
"score": 4219.0,
"evalue": 0.0,
"alignment_length": 4219
},
"query": {
"length": 18816,
"acc": "pMODKan-HO-pACT1-Z4-",
"sequence": "GGAGCAG...",
"start": 1,
"circular": null,
"end": 4219,
"name": null
},
"subject": {
"length": 7883,
"acc": "pMODKan-HO-pACT1-ZEV4",
"sequence": "TCAGT...",
"start": 1,
"circular": null,
"strand": "plus",
"end": 4219,
"name": null
}
}
]
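As an illustration of how the fields above might be used, the sketch below derives percent identity and query coverage from a single alignment dictionary. The summarize function is hypothetical and not part of pyblast, and it assumes start and end are 1-based and inclusive, which is consistent with the example (start 1, end 4219, alignment length 4219) but not stated explicitly.

```python
def summarize(alignment):
    """Compute percent identity and query coverage for one alignment dict."""
    meta = alignment["meta"]
    query = alignment["query"]
    # Fraction of aligned columns that are identical.
    identity = 100.0 * meta["identical"] / meta["alignment_length"]
    # Assumes 1-based inclusive coordinates on the query.
    aligned_bases = abs(query["end"] - query["start"]) + 1
    coverage = 100.0 * aligned_bases / query["length"]
    return {
        "query": query["acc"],
        "subject": alignment["subject"]["acc"],
        "identity": identity,
        "query_coverage": coverage,
        "evalue": meta["evalue"],
    }


# The example alignment from above, with only the fields the sketch needs.
alignment = {
    "meta": {"identical": 4219, "alignment_length": 4219, "evalue": 0.0},
    "query": {"length": 18816, "acc": "pMODKan-HO-pACT1-Z4-", "start": 1, "end": 4219},
    "subject": {"length": 7883, "acc": "pMODKan-HO-pACT1-ZEV4", "start": 1, "end": 4219},
}
summary = summarize(alignment)
print(summary["identity"], round(summary["query_coverage"], 2))
```

A list comprehension over results.alignments with a threshold on identity or evalue would then give a simple filter.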