
A tool for handling repetitive insertions into sequence alignments


brawn

A Python port of MUSCLE's profile-profile mode for aligning sequences.

Brawn was specifically designed with repetitive insertions of query sequences into reference alignments in mind.

Brawn takes advantage of that repetitive nature by using cache files to avoid calculations that would otherwise be repeated for the reference alignment on every run. This counters the speed loss of moving from C to Python. Building a cache file for the first time takes about 10 times longer than an equivalent single run.
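
The general idea of trading one slow build for many fast reuses can be sketched in plain Python. This is only an illustration of the caching pattern: the per-column residue counts, the JSON format, and the helper names `column_counts` and `load_or_build` are assumptions made for the example, and do not describe what brawn actually stores in its cache files.

```python
import json
import os
import tempfile
from collections import Counter


def column_counts(rows: list[str]) -> list[dict[str, int]]:
    """Count the residues in each column (a stand-in for the expensive step)."""
    return [dict(Counter(column)) for column in zip(*rows)]


def load_or_build(rows: list[str], cache_path: str) -> list[dict[str, int]]:
    """Reuse cached counts when present, otherwise compute and store them."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as handle:
            return json.load(handle)
    counts = column_counts(rows)
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(counts, handle)
    return counts


reference_rows = ["GT-DVG", "GTK-VG"]
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "reference.json")
    first = load_or_build(reference_rows, path)   # computes and writes the cache
    second = load_or_build(reference_rows, path)  # reads the cache back
```

The second call skips the computation entirely, which is where the saving comes from when the same reference is used for many queries.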

While results are often identical to those from MUSCLE, there will occasionally be some minor variations. The variation is more pronounced when even the best possible alignment would be very poor.

Only protein sequences are explicitly supported at the moment; full DNA/RNA support may follow later.

Installation

from PyPI

pip install brawn

from source

pip install .

for development

pip install .[testing]

Use

For speed, using a cache file is extremely important. Cache files can be built and used both programmatically and from the command line. They remain optional, though.

Basic profile-profile mode

To load an alignment from a FASTA file, use:

from brawn import Alignment

with open("some.fasta") as handle:
    alignment = Alignment.from_file(handle)

Or, to build from an existing dictionary (even if it's just one element), use:

name_to_sequence = {
    "A": "GT-DVG",
    "B": "GTK-VG",
}
alignment = Alignment(name_to_sequence)
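
Every row of an alignment spans the same number of columns, so it can be worth checking a dictionary before building an alignment from it. The following is a minimal sketch in plain Python; `check_aligned` is a hypothetical helper and not part of brawn, and whether brawn performs a similar check itself is not covered here.

```python
def check_aligned(name_to_sequence: dict[str, str]) -> int:
    """Return the shared column count, or raise ValueError if rows differ.

    Hypothetical helper for illustration; not part of brawn's API.
    """
    if not name_to_sequence:
        raise ValueError("no sequences supplied")
    lengths = {name: len(seq) for name, seq in name_to_sequence.items()}
    unique = set(lengths.values())
    if len(unique) != 1:
        raise ValueError(f"sequences differ in length: {lengths}")
    return unique.pop()


name_to_sequence = {
    "A": "GT-DVG",
    "B": "GTK-VG",
}
columns = check_aligned(name_to_sequence)  # both rows have six columns
```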

Or, to load cached calculations directly (this is the preferable option for large alignments):

with open("reference.cache") as handle:
    alignment = Alignment.from_cache_file(handle)

As a side note, to build a cache for the reference alignment:

with open("reference.cache", "w") as handle:
    alignment.to_cache_file(handle)

This will add any needed computational steps prior to writing.

Once two alignments have been created/loaded, they can be combined with:

from brawn import combine_alignments

result = combine_alignments(alignment1, alignment2)

And the resulting alignment can be output to file:

with open("output.fasta", "w") as handle:
    result.to_file(handle=handle)

# if handle is not supplied, the output will be sent to sys.stdout
result.to_file()

Simpler insertion of a sequence into an alignment

For the use case that brawn was designed for, only a single sequence is being inserted into a reference alignment at a time. This is usually paired with fetching a single reference sequence, and finding the matching sites.

# note: this still aligns with the full reference alignment
# only the reference sequence mentioned by name will be built
aligned_query, aligned_reference = get_aligned_pair(query_sequence, reference_name,
                                                    reference_alignment)

Or, if you prefer to have the full set of newly aligned reference sequences as a dictionary:

aligned_query, aligned_refs_by_name = insert_into_alignment(query_sequence, alignment)
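
Once an aligned pair is available, the matching sites can be read off by walking both strings column by column. The sketch below assumes the aligned sequences are plain strings of equal length that use "-" for gaps; `residues_at_reference_positions` is a hypothetical helper written for this example, not a brawn function.

```python
def residues_at_reference_positions(aligned_query: str, aligned_reference: str,
                                    positions: list[int]) -> str:
    """Return the query characters aligned to the given reference positions.

    `positions` are 0-based indices into the ungapped reference sequence.
    A "-" in the result means the query has a gap at that site.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(positions)
    found = {}
    ref_index = 0  # position within the ungapped reference
    for query_char, ref_char in zip(aligned_query, aligned_reference):
        if ref_char == "-":
            continue  # insertion relative to the reference, so no position to record
        if ref_index in wanted:
            found[ref_index] = query_char
        ref_index += 1
    missing = wanted - set(found)
    if missing:
        raise IndexError(f"positions beyond the reference: {sorted(missing)}")
    return "".join(found[position] for position in positions)


aligned_reference = "GT-DVG"
aligned_query = "GTK-VG"
sites = residues_at_reference_positions(aligned_query, aligned_reference, [0, 2, 4])
```

Here the ungapped reference is `GTDVG`, so positions 0, 2 and 4 are `G`, `D` and `G`; the query has a gap opposite the `D`, giving `G-G`.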

From the command line

Brawn can function as a drop-in replacement for MUSCLE's -profile mode, complete with command line argument conversion.

To take advantage of cached alignment calculations, first build a cache of your alignment, with:

brawn input.fasta --build-cache desired_cache_path

From that point, you can use cached files and plain FASTA files interchangeably. E.g.

brawn query.fasta --reference-alignment some_fasta_file

and

brawn query.fasta --reference-alignment some_cache_file

will both work as expected.

For the command line, the resulting FASTA output will be written to stdout.
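
Because the output arrives on stdout, the command line tool can also be driven from a script by capturing that stream. The following is a sketch using only the `--reference-alignment` flag shown above; `build_command` and `align_with_cli` are hypothetical wrappers, and the sketch assumes the `brawn` executable is available on the PATH.

```python
import subprocess


def build_command(query_path: str, reference_path: str) -> list[str]:
    """Assemble the argument list for aligning a query against a reference.

    The reference may be a FASTA file or a cache file.
    """
    return ["brawn", query_path, "--reference-alignment", reference_path]


def align_with_cli(query_path: str, reference_path: str) -> str:
    """Run brawn and return the FASTA text it writes to stdout.

    Assumes the brawn executable is installed and on the PATH.
    """
    completed = subprocess.run(
        build_command(query_path, reference_path),
        check=True,           # raise CalledProcessError on a non-zero exit status
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as text rather than bytes
    )
    return completed.stdout


command = build_command("query.fasta", "reference.cache")
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.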
