pyblast

pyblast is a wrapper that lets other applications run BLAST searches on SeqRecord objects and JSON objects. It is intended for use in small Python applications.

Features include:

Automatic BLAST parsing to JSON
Alignment to circular queries, using either linear or circular subjects
BLAST self-installation

Installation

You can install BLAST to the pyblast directory using the following command:

pyblast install

This installs BLAST to pyblast/blast_bin in your Python install location. If you want BLAST installed somewhere else, move the ncbi-blast-X.X.X+ folder to your desired location and add path/to/ncbi-blast-X.X.X+/bin to your $PATH. pyblast prefers the BLAST executable found on your executable path. If it cannot find one there, it looks in the paths listed in the pyblast/blast_bin/_paths.txt file. _paths.txt is updated automatically when you run install_blast.py, so there's no need to manage the paths manually.
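The lookup order described above (executable path first, then a recorded list of directories) can be sketched roughly as follows. This is an illustration only, not pyblast's actual code: the function name `find_blast_executable` is hypothetical, and the one-directory-per-line layout of the paths file is an assumption. The demo uses a stand-in executable on a POSIX system rather than a real BLAST install.

```python
import os
import shutil
import tempfile


def find_blast_executable(name, paths_file):
    """Return a path to `name`, preferring PATH over directories listed in `paths_file`.

    Hypothetical helper; assumes `paths_file` holds one directory per line.
    """
    # 1. Prefer whatever is already on the executable path.
    found = shutil.which(name)
    if found:
        return found
    # 2. Otherwise try each directory recorded in the paths file.
    if os.path.isfile(paths_file):
        with open(paths_file) as handle:
            for line in handle:
                directory = line.strip()
                if not directory:
                    continue
                found = shutil.which(name, path=directory)
                if found:
                    return found
    return None


# Demo with a stand-in executable in a temporary directory (POSIX permissions assumed).
with tempfile.TemporaryDirectory() as tmp:
    fake_bin = os.path.join(tmp, "bin")
    os.makedirs(fake_bin)
    fake_exe = os.path.join(fake_bin, "fake_blastn_for_demo")
    with open(fake_exe, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(fake_exe, 0o755)

    paths_file = os.path.join(tmp, "_paths.txt")
    with open(paths_file, "w") as handle:
        handle.write(fake_bin + "\n")

    resolved = find_blast_executable("fake_blastn_for_demo", paths_file)
    missing = find_blast_executable("no_such_tool_for_demo", paths_file)
```

Here `resolved` points into the directory listed in the paths file because the stand-in name is not on `PATH`, while `missing` is `None`.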

After installing BLAST and verifying that the blastn command works from the command line, install pyblast with:

pip install pyblastbio

Usage

This package is a Python wrapper for the BLAST command line, intended to be run alongside a microservice (e.g. Flask) or for a quick alignment in a Jupyter notebook or small Python script/app.

This package also includes a basic Python-based installation script, which is used in unit testing.

Running a blast query on a Bio.SeqRecord object

We can do a quick alignment to some sequences using the following, which returns the results as a list of dictionaries:

from pyblast import BioBlast
from pyblast.utils import make_linear, make_circular
from Bio.SeqRecord import SeqRecord
from Bio.Seq import Seq

queries = [
  SeqRecord(Seq("ACGTGATTCGTCGTGTAGTTGAGTGTTACGTTGCATGTCGTACGTGTGTAGTGTCGTGTAGTGCTGATGCTACGTGATCG"))
]
subjects = [
  SeqRecord(Seq("ACGTGATTCGTCGTGTAGTTGAGTGTTACGTTGCATGTCGTTACGTGATCG"))
]

# pyblast requires a 'topology' annotation on the SeqRecords.
# we can make records circular or linear using the `make_linear` or `make_circular` functions
subjects = make_linear(subjects)
queries = make_linear(queries)

blast = BioBlast(subjects, queries)
results = blast.quick_blastn()
print(results)

[
  {
    "query": {
      "start": 1,
      "end": 46,
      "bases": "ACGTGATTCGTCGTGTAGTTGAGTGTTACGTTGCATGTCGT-ACGTG",
      "strand": 1,
      "length": 80,
      "sequence_id": "11e17df2-579f-4234-a1e6-f4e3fadfe277",
      "circular": false,
      "name": "<unknown name>",
      "origin_key": "bbadd55c-9413-4394-a23c-0da983630b98",
      "origin_record_id": "<unknown id>",
      "origin_sequence_length": 80
    },
    "subject": {
      "start": 1,
      "end": 47,
      "bases": "ACGTGATTCGTCGTGTAGTTGAGTGTTACGTTGCATGTCGTTACGTG",
      "strand": 1,
      "length": 51,
      "sequence_id": "69248d23-1044-4a75-80c9-53b999796d48",
      "circular": false,
      "name": "<unknown name>",
      "origin_key": "1f627d51-93df-458b-ba36-9b5a7b483a4d",
      "origin_record_id": "<unknown id>",
      "origin_sequence_length": 51
    },
    "meta": {
      "query acc.": "11e17df2-579f-4234-a1e6-f4e3fadfe277",
      "subject acc.": "69248d23-1044-4a75-80c9-53b999796d48",
      "score": 43,
      "evalue": 0,
      "bit score": 80,
      "alignment length": 47,
      "identical": 46,
      "gap opens": 1,
      "gaps": 1,
      "query length": 80,
      "q. start": 1,
      "q. end": 46,
      "subject length": 51,
      "s. start": 1,
      "s. end": 47,
      "subject strand": "plus",
      "query seq": "ACGTGATTCGTCGTGTAGTTGAGTGTTACGTTGCATGTCGT-ACGTG",
      "subject seq": "ACGTGATTCGTCGTGTAGTTGAGTGTTACGTTGCATGTCGTTACGTG",
      "span_origin": true
    }
  }
]
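As a small illustration of working with this output, the sketch below derives percent identity and query coverage from the `meta` fields shown above. The `summarize` helper is hypothetical and not part of pyblast. The coverage calculation assumes a linear hit where `q. end` is not smaller than `q. start`.

```python
# A trimmed copy of the "meta" block from the result above.
result = {
    "meta": {
        "alignment length": 47,
        "identical": 46,
        "gaps": 1,
        "query length": 80,
        "q. start": 1,
        "q. end": 46,
    }
}


def summarize(result):
    """Hypothetical helper: compute simple statistics from one result's meta block."""
    meta = result["meta"]
    # Fraction of aligned columns (including gap columns) that are identical.
    identity = 100.0 * meta["identical"] / meta["alignment length"]
    # Fraction of the query covered by the hit (linear hits only).
    query_span = meta["q. end"] - meta["q. start"] + 1
    coverage = 100.0 * query_span / meta["query length"]
    return {
        "percent_identity": round(identity, 2),
        "query_coverage": round(coverage, 2),
    }


summary = summarize(result)
print(summary)  # {'percent_identity': 97.87, 'query_coverage': 57.5}
```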

Running blast on circular subjects and queries

pyblast also handles alignments to circular subjects and queries. In the example below, we get a complete alignment of the subject (1 to 50) to the circular query (82, over the origin, to 30). Circular subjects and circular queries can be mixed together, and multiple queries can be used.

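from pyblast import BioBlast
from pyblast.utils import make_circular
from Bio.SeqRecord import SeqRecord
from Bio.Seq import Seq

seq = "ACGTTGTAGTGTAGTTGATGATGATGTCTGTGTCGTGTGATGTGCTGTAGTGTTTAGGGGCGGCGCGGAGTATGCTG"
queries = [
  SeqRecord(Seq(seq))
]

# the subject spans the origin of the query: its last 20 bases followed by its first 30
subjects = [
  SeqRecord(Seq(seq[-20:] + seq[:30]))
]

# pyblast requires a 'topology' annotation on the SeqRecords.
# we can make records circular or linear using the `make_linear` or `make_circular` functions
subjects = make_circular(subjects)
queries = make_circular(queries)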

blast = BioBlast(subjects, queries)
results = blast.quick_blastn()
print(results)

[
  {
    "query": {
      "start": 82,
      "end": 30,
      "strand": 1,
      "...": "..."
    },
    "subject": {
      "start": 1,
      "end": 50,
      "strand": 1,
      "...": "..."
    },
    "meta": {
      "...": "..."
    }
  }
]
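A result whose `start` is larger than its `end` denotes a span that wraps over the origin. One common way to find such spans, shown here as a rough sketch, is to search a doubled copy of the circular sequence and map the coordinates back with a modulo. Whether pyblast works this way internally is not stated here. The helper names are hypothetical, exact string matching stands in for a real BLAST alignment, and the toy sequence is not the one from the example above.

```python
def locate_on_circular(query, subject):
    """Find `subject` on circular `query`.

    Returns 1-based inclusive (start, end), or None if there is no exact match.
    Exact matching is a stand-in for a real alignment.
    """
    n = len(query)
    if not subject or len(subject) > n:
        return None
    # Doubling the sequence makes any window that crosses the origin contiguous.
    doubled = query + query
    idx = doubled.find(subject)
    if idx == -1:
        return None
    start = idx + 1
    # Map the end position back onto the original circular coordinates.
    end = (idx + len(subject) - 1) % n + 1
    return start, end


def span_length(start, end, n):
    """Length of a 1-based inclusive span on a circular sequence of length n."""
    if end >= start:
        return end - start + 1
    return (n - start + 1) + end


toy_query = "AAACCCGGGTTT"                    # 12 bases, treated as circular
toy_subject = toy_query[-3:] + toy_query[:4]  # "TTTAAAC", spans the origin

wrapped = locate_on_circular(toy_query, toy_subject)
print(wrapped)                                 # (10, 4)
print(span_length(*wrapped, len(toy_query)))   # 7

inner = locate_on_circular(toy_query, "CCCGGG")
print(inner)                                   # (4, 9)
```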

BioBlastFactory

In some cases, we will want to share the same sequences across different types of alignments. For example, we may want to align a set of primers and a set of templates to the same query records. In such cases, we can use the BioBlastFactory:

from pyblast import BioBlastFactory

# initialize a new factory
factory = BioBlastFactory()

# add records accessible by keyword
factory.add_records(records1, "primers")
factory.add_records(records2, "templates")
factory.add_records(records3, "queries")

# we spawn new BioBlast aligners from the keywords above
primer_alignment = factory("primers", "queries")
template_alignment = factory("templates", "queries")

# we can then run alignments, ensuring the queries in both results
# refer to the exact same query
primer_results = primer_alignment.quick_short_blastn()
template_results = template_alignment.quick_blastn()
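The idea of sharing records by keyword can be illustrated with a minimal registry. This is a conceptual sketch only, not how BioBlastFactory is implemented: the `RecordRegistry` class and its return value are hypothetical, and plain strings stand in for SeqRecords.

```python
class RecordRegistry:
    """Hypothetical keyed store that hands the same record lists to several consumers."""

    def __init__(self):
        self._records = {}

    def add_records(self, records, key):
        # Records are stored once per keyword and reused afterwards.
        self._records.setdefault(key, []).extend(records)

    def __call__(self, subject_key, query_key):
        # Each "aligner" here is just a pair of references into the shared store.
        return {
            "subjects": self._records[subject_key],
            "queries": self._records[query_key],
        }


registry = RecordRegistry()
registry.add_records(["primer_a", "primer_b"], "primers")
registry.add_records(["template_a"], "templates")
registry.add_records(["query_a"], "queries")

primer_pair = registry("primers", "queries")
template_pair = registry("templates", "queries")

# Both pairs refer to the very same query list, not to copies of it.
same_queries = primer_pair["queries"] is template_pair["queries"]
```

Because both pairs hold a reference to one shared list, anything keyed to a query in one result can be matched to the same query in the other.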

Utilities for reading files

pyblast includes utilities for reading in FASTA and GenBank files.

from pyblast.utils import load_glob, load_genbank_glob, load_fasta_glob, make_linear

# load many genbank files into a list of SeqRecords
# 'topology' is automatically detected here
# we enforce all record_ids to be unique (a requirement for pyblast)
records1 = load_genbank_glob("~/mydesigns/*.gb", force_unique_ids=True)

# load many fasta files into a list of SeqRecords
# 'topology' is NOT detected, so we mark the records as linear ourselves
# we enforce all record_ids to be unique (a requirement for pyblast)
records2 = make_linear(load_fasta_glob("~/mydesigns/*.fasta", force_unique_ids=True))
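To make the uniqueness requirement concrete, here is one possible way to de-duplicate record ids, sketched on plain strings. How `force_unique_ids` actually rewrites ids is not described here, so the suffixing scheme and the `make_ids_unique` helper are assumptions for illustration.

```python
def make_ids_unique(ids):
    """Hypothetical helper: append _1, _2, ... to repeated ids so every id is distinct."""
    seen = set()
    unique = []
    for record_id in ids:
        candidate = record_id
        suffix = 1
        # Keep trying new suffixes until the candidate has not been used yet.
        while candidate in seen:
            candidate = f"{record_id}_{suffix}"
            suffix += 1
        seen.add(candidate)
        unique.append(candidate)
    return unique


ids = ["primer", "primer", "template", "primer"]
unique_ids = make_ids_unique(ids)
print(unique_ids)  # ['primer', 'primer_1', 'template', 'primer_2']
```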
