Python bindings for microBioRust: microbiology-friendly bioinformatics functions

microbiorust 🦀

Python bindings for microBioRust — a high-performance, modular bioinformatics toolkit written in Rust.

microbiorust provides fast and memory-efficient bioinformatics functionality to Python users by leveraging the power of Rust, exposed through PyO3. This package aims to offer an alternative to libraries like Biopython, with a focus on speed, correctness, and extensibility.


Installation

pip install microbiorust

Wheels are available for Linux, macOS and Windows (Python 3.10+), so no Rust toolchain is required.

Build from source

If you prefer to build from source using maturin:

pip install maturin
git clone https://github.com/microBioRust/microBioRust
cd microBioRust/microbiorust-py
maturin develop --features extension-module

To verify the Python module functions are correctly exposed from Rust:

cargo test
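
As a complementary check from the Python side, a short smoke test can confirm that the submodules named in this README are reachable after installation. This is a minimal sketch: `check_modules` is a hypothetical helper, not part of microbiorust, and the submodule names are simply the ones listed under Modules.

```python
import importlib

# Submodule names as they appear in this README's Modules section.
EXPECTED = ("gbk", "embl", "seqmetrics", "align")


def _has_submodule(pkg, name):
    """True if `name` is an attribute of pkg or an importable submodule."""
    if hasattr(pkg, name):
        return True
    try:
        importlib.import_module(f"{pkg.__name__}.{name}")
        return True
    except ImportError:
        return False


def check_modules(package="microbiorust", names=EXPECTED):
    """Return {name: bool} for each expected submodule (hypothetical helper)."""
    try:
        pkg = importlib.import_module(package)
    except ImportError:
        # Package not installed or failed to load: nothing is available.
        return {name: False for name in names}
    return {name: _has_submodule(pkg, name) for name in names}


print(check_modules())
```

Every value should be `True` once the package is installed; a `False` entry points at a submodule that was not built or exposed.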

Features

  • Fast parsers for GenBank and EMBL formats
  • Fast parsers for BLAST XML and tabular formats
  • Fast parser for multiple sequence alignments (MSAs), with subsetting and consensus (get_consensus)
  • Output to GFF3, FAA and FFN formats
  • Accurate feature extraction and translation
  • Sequence metrics: hydrophobicity, amino acid counts and percentages
  • Seamless Python API for easy integration into existing pipelines
  • Built with Rust for memory safety and performance

Modules

microbiorust gbk — GenBank format

from microbiorust import gbk

# Extract protein sequences to FASTA
gbk.gbk_to_faa("input.gbk", "output.faa")

# Extract nucleotide sequences to FASTA
gbk.gbk_to_fna("input.gbk", "output.fna")

# Count protein sequences
count = gbk.gbk_to_faa_count("input.gbk")

# Convert annotations to GFF3
gbk.gbk_to_gff("input.gbk", "output.gff")
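
One way to sanity-check the FASTA output is to read it back and compare the number of records with `gbk_to_faa_count`. The sketch below uses only the standard library; `read_fasta` is a hypothetical helper, and it assumes the output is conventional FASTA with one `>` header line per sequence.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)
```

After `gbk.gbk_to_faa("input.gbk", "output.faa")`, the value of `sum(1 for _ in read_fasta("output.faa"))` would be expected to match `gbk.gbk_to_faa_count("input.gbk")`, provided both calls consider the same set of features.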

microbiorust embl — EMBL format

from microbiorust import embl

# Extract protein sequences to FASTA
embl.embl_to_faa("input.embl", "output.faa")

# Extract nucleotide sequences to FASTA
embl.embl_to_fna("input.embl", "output.fna")

# Convert annotations to GFF3
embl.embl_to_gff("input.embl", "output.gff")
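
For a directory of annotation files, the converters shown above can be applied in a loop. The sketch below is one way to do that. `convert_directory` is a hypothetical helper, and the converter is passed in as a callable taking `(input_path, output_path)`, which is the call shape used in the examples above.

```python
from pathlib import Path


def convert_directory(input_dir, output_dir, converter, in_suffix, out_suffix):
    """Apply converter(input_path, output_path) to every matching file.

    Returns the list of output paths that were requested, in sorted order.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    requested = []
    for src in sorted(Path(input_dir).glob(f"*{in_suffix}")):
        dst = out_dir / (src.stem + out_suffix)
        # Paths are passed as plain strings, as in the README examples.
        converter(str(src), str(dst))
        requested.append(dst)
    return requested
```

A call such as `convert_directory("genomes", "proteins", embl.embl_to_faa, ".embl", ".faa")` would then write one `.faa` file per `.embl` input; the directory names here are placeholders.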

microbiorust seqmetrics — Sequence metrics

from microbiorust import seqmetrics

sequence = "MKTLLLTLVVVTIVCLDLGAVGNGSSLSEDKDNVHK"

# Hydrophobicity score
window_size = 5
score = seqmetrics.hydrophobicity(sequence, window_size)

# Amino acid counts
counts = seqmetrics.amino_counts(sequence)

# Amino acid percentages
percentages = seqmetrics.amino_percentage(sequence)
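
To make these three metrics concrete, here is a pure-Python sketch of what they conventionally compute. It is not the library's implementation: the Kyte-Doolittle scale, the averaging over full windows, and the return types are assumptions chosen for illustration, and `seqmetrics` may differ on each point. All function names below are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale, for illustration only).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_hydrophobicity(sequence, window_size):
    """Mean hydropathy of each full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    if window_size < 1 or window_size > len(sequence):
        return []
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window_size]) / window_size
        for i in range(len(values) - window_size + 1)
    ]


def count_residues(sequence):
    """Number of occurrences of each residue."""
    return Counter(sequence)


def residue_percentages(sequence):
    """Share of each residue as a percentage of the sequence length."""
    total = len(sequence)
    return {aa: 100.0 * n / total for aa, n in Counter(sequence).items()}
```

Under these assumptions, a sequence of length n and a window of size w yield n - w + 1 window scores, and the percentages sum to 100 up to floating-point rounding.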

microbiorust align — Multiple sequence alignment

from microbiorust import align

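# Subset a FASTA-format MSA by row and column, e.g.
align.subset_msa_alignment("input.fasta", "ids.txt", "output.fasta")
# where the first tuple, e.g. (0, 10), is a row-wise subset and
# the second tuple, e.g. (0, 100), is a column-wise subset
# (the tuple arguments are not shown in this call; see the documentation for the exact signature)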
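
To illustrate what a row-wise and a column-wise subset mean, the following pure-Python sketch slices an alignment held in memory. It is not the library's implementation: `subset_alignment` is a hypothetical helper, and treating each tuple as a half-open `(start, end)` range is an assumption.

```python
def subset_alignment(records, rows, cols):
    """Keep records rows[0]:rows[1] and columns cols[0]:cols[1] of each one.

    records: list of (identifier, aligned_sequence) pairs of equal length.
    rows, cols: (start, end) tuples, treated as half-open ranges (assumption).
    """
    row_start, row_end = rows
    col_start, col_end = cols
    return [
        (name, seq[col_start:col_end])
        for name, seq in records[row_start:row_end]
    ]


msa = [("s1", "MK-TL"), ("s2", "MKATL"), ("s3", "MR-TL")]
print(subset_alignment(msa, (0, 2), (1, 4)))
# [('s1', 'K-T'), ('s2', 'KAT')]
```

Rows select which sequences are kept and columns select which alignment positions are kept, so gap characters inside the chosen columns are preserved.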

Why Rust?

Rust gives microbiorust C-level performance with memory safety — no segfaults, no GIL limitations, and no need for NumPy or Pandas for core parsing operations. Large GenBank or EMBL files are parsed significantly faster than equivalent pure-Python implementations.
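
Speed-ups depend on file size and hardware, so it is worth measuring on your own data. The sketch below times any callable with `time.perf_counter` and reports the best of several runs. `best_time` is a hypothetical helper and not part of microbiorust.

```python
import time


def best_time(func, *args, repeats=3):
    """Return the shortest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)
```

For example, `best_time(gbk.gbk_to_faa, "input.gbk", "output.faa")` can be compared with the same measurement for the pure-Python tool currently in your pipeline, run on the same input file.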


Documentation

Full documentation: https://microbiorust.github.io/docs/

Source: https://github.com/microBioRust/microBioRust


License

MIT


Download files

Download the file for your platform.

Source Distribution

microbiorust-0.1.5.tar.gz (9.5 MB)

Uploaded Source

Built Distributions

microbiorust-0.1.5-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)

Uploaded PyPy, manylinux: glibc 2.17+ x86-64

microbiorust-0.1.5-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.1 MB)

Uploaded PyPy, manylinux: glibc 2.17+ ARM64

microbiorust-0.1.5-cp310-abi3-win_amd64.whl (857.2 kB)

Uploaded CPython 3.10+, Windows x86-64

microbiorust-0.1.5-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ x86-64

microbiorust-0.1.5-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.1 MB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ ARM64

microbiorust-0.1.5-cp310-abi3-macosx_11_0_arm64.whl (967.0 kB)

Uploaded CPython 3.10+, macOS 11.0+ ARM64

microbiorust-0.1.5-cp310-abi3-macosx_10_12_x86_64.whl (1.0 MB)

Uploaded CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file microbiorust-0.1.5.tar.gz.

File metadata

  • Download URL: microbiorust-0.1.5.tar.gz
  • Size: 9.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.12.0

File hashes

Hashes for microbiorust-0.1.5.tar.gz
Algorithm Hash digest
SHA256 609f44c35cc5d1c38b4b3919a5ad12906584fcbde4cfbfb8f05e0c110452fb65
MD5 96005abc6fc4a68e87a052f47477f10a
BLAKE2b-256 c593d18d3466431e0148fd0fda379917a4e78a5c1bdfc6f4425bd1867507d33d
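
A downloaded file can be checked against the SHA256 digest listed above using the standard library. This is a minimal sketch: `sha256_of` and `matches_digest` are hypothetical helper names, and the local file path in the usage note is an assumption.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

With the source distribution saved in the current directory, `matches_digest("microbiorust-0.1.5.tar.gz", "609f44c35cc5d1c38b4b3919a5ad12906584fcbde4cfbfb8f05e0c110452fb65")` should return `True` for an intact download.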

File details

Details for the file microbiorust-0.1.5-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for microbiorust-0.1.5-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 85ab01e34a2e06e2101874bf21350966f80408ceb9cd236cbb313909b1754e14
MD5 b99ef1649400c00d63f7f980ea3d13ae
BLAKE2b-256 c29e0f0746f0d0f3984d6a7db134ef631673a458fc8da486a3f6cd37259c4fcb

File details

Details for the file microbiorust-0.1.5-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for microbiorust-0.1.5-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 ece572743f52ed9d496162ae916307fcfee4af9dcd75b346f89429e17707c47a
MD5 984a8f1fdc0e52e29f7773d074c01ad2
BLAKE2b-256 bc185a2164012e7785d15fb9de1db415c2afc626e630279f00516dfe7b7818c9

File details

Details for the file microbiorust-0.1.5-cp310-abi3-win_amd64.whl.

File hashes

Hashes for microbiorust-0.1.5-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 647c9a3d97da6917e7a2b8b49eb64a61f629f3093049410f41b2a9b35e51cb6a
MD5 c786b2c097581951aeb96e522d5f9be0
BLAKE2b-256 8ca68537467ae3c879333c47a2a834d9c91968b720c8964764f540c77381159b

File details

Details for the file microbiorust-0.1.5-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for microbiorust-0.1.5-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d36b695cacc791e4249ad258a29e74698ea9f2737b65f91a4664aae5cc043467
MD5 b648e935503ab06362398e646a933446
BLAKE2b-256 1a99c9c8e6b1634f380d14ee9a058f9e903d6461b033c12c927983d4672936ef

File details

Details for the file microbiorust-0.1.5-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for microbiorust-0.1.5-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6e9a2ba36f7963a0d7d1a541bd35fffc1bf4e57e71985d60225f7e246ea48538
MD5 858d3f58732fc42f80b0506e5e119337
BLAKE2b-256 199bf93a82c944c0e523c8f041bfecda9ebc25e08e0f3dddf1c313deb0ef630a

File details

Details for the file microbiorust-0.1.5-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for microbiorust-0.1.5-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 dd2c723f335067bfcf369e45034ce68e56249fe4b17811ed8c9539881715c623
MD5 e949545228d1b032c9604f405b5e3b82
BLAKE2b-256 ad336072d7c0e9d99e4d2415f5652429624360656e02ee02f36e4177273ac914

File details

Details for the file microbiorust-0.1.5-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for microbiorust-0.1.5-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 72a094f11303d7b8ffef09f1225517d2f2c088928adb7cc0b8cc2d8fd7ffc40d
MD5 fb58fe3299dd9fc88ed04d39c1a30f37
BLAKE2b-256 691d117b39ec646dcbd652d1b98afacd4f024f005a3e65d43bcdb18817ed1a46
