Python bindings for microbiorust: microbiology-friendly bioinformatics functions
microbiorust 🦀
Python bindings for microBioRust — a high-performance, modular bioinformatics toolkit written in Rust.
microbiorust provides fast and memory-efficient bioinformatics functionality to Python users by leveraging the power of Rust, exposed through PyO3. This package aims to offer an alternative to libraries like Biopython, with a focus on speed, correctness, and extensibility.
Installation
pip install microbiorust
Prebuilt wheels are available for Linux, macOS and Windows (Python 3.10+), so no Rust toolchain is required.
Build from source
If you prefer to build from source using maturin:
pip install maturin
git clone https://github.com/microBioRust/microBioRust
cd microBioRust/microbiorust-py
maturin develop --features extension-module
To verify the Python module functions are correctly exposed from Rust:
cargo test
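From the Python side, a quick smoke test can complement `cargo test`. The sketch below is not part of microbiorust: `check_install` is a hypothetical helper, and the submodule names are simply the ones imported in the examples in this README. It reports which of them can be reached, and returns all `False` rather than raising if the package is not installed.

```python
import importlib
import importlib.util

# Submodule names taken from the import examples in this README.
EXPECTED = ("gbk", "embl", "seqmetrics", "align")


def _has_submodule(module, package, name):
    """True if `name` is an attribute of the package or an importable submodule."""
    if hasattr(module, name):
        return True
    try:
        importlib.import_module(f"{package}.{name}")
        return True
    except ImportError:
        return False


def check_install(package="microbiorust", expected=EXPECTED):
    """Return {submodule_name: bool}; all False if the package is absent."""
    if importlib.util.find_spec(package) is None:
        return {name: False for name in expected}
    module = importlib.import_module(package)
    return {name: _has_submodule(module, package, name) for name in expected}


if __name__ == "__main__":
    for name, ok in check_install().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```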
Features
- Fast parsers for GenBank and EMBL formats
- Fast parsers for BLAST XML and tabular formats
- Fast parser for multiple sequence alignments (MSA), with subset and get_consensus operations
- Output to GFF3, FAA and FFN formats
- Accurate feature extraction and translation
- Sequence metrics: hydrophobicity, amino acid counts and percentages
- Seamless Python API for easy integration into existing pipelines
- Built with Rust for memory safety and performance
Modules
microbiorust gbk — GenBank format
from microbiorust import gbk
# Extract protein sequences to FASTA
gbk.gbk_to_faa("input.gbk", "output.faa")
# Extract nucleotide sequences to FASTA
gbk.gbk_to_fna("input.gbk", "output.fna")
# Count protein sequences
count = gbk.gbk_to_faa_count("input.gbk")
# Convert annotations to GFF3
gbk.gbk_to_gff("input.gbk", "output.gff")
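One way to sanity-check a FASTA file written by the calls above is to count its header lines in plain Python. This is a minimal sketch: `count_fasta_records` is a hypothetical helper, not a microbiorust function, and the expectation that its result matches `gbk.gbk_to_faa_count` for the same input is an assumption rather than documented behaviour.

```python
def count_fasta_records(lines):
    """Count FASTA records; each record begins with a '>' header line."""
    return sum(1 for line in lines if line.startswith(">"))


# Small in-memory example with two protein records.
example = [
    ">protein_1\n",
    "MKTLLLTLVV\n",
    ">protein_2\n",
    "MSSLSEDKDN\n",
    "VHK\n",
]
print(count_fasta_records(example))  # 2
```

Because the function accepts any iterable of lines, an open file handle works as well, for example `with open("output.faa") as handle: count_fasta_records(handle)`.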
microbiorust embl — EMBL format
from microbiorust import embl
# Extract protein sequences to FASTA
embl.embl_to_faa("input.embl", "output.faa")
# Extract nucleotide sequences to FASTA
embl.embl_to_fna("input.embl", "output.fna")
# Convert annotations to GFF3
embl.embl_to_gff("input.embl", "output.gff")
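The GFF3 files produced by the `*_to_gff` functions can be inspected with a few lines of standard-library Python. The sketch below relies only on the GFF3 format itself (nine tab-separated columns, with `key=value` attributes separated by `;`). `GffFeature` and `parse_gff3` are illustrative names, and the example line is made up rather than real microbiorust output.

```python
from typing import NamedTuple
from urllib.parse import unquote


class GffFeature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: dict


def parse_gff3(lines):
    """Parse GFF3 feature lines, skipping comments and any trailing FASTA section."""
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence data follows; no more features
        if not line or line.startswith("#"):
            continue  # blank line, directive, or comment
        cols = line.split("\t")
        if len(cols) != 9:
            raise ValueError(f"expected 9 columns, got {len(cols)}: {line!r}")
        attrs = {}
        for pair in cols[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attrs[key] = unquote(value)  # GFF3 percent-encodes reserved characters
        features.append(
            GffFeature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                       cols[5], cols[6], cols[7], attrs)
        )
    return features


# Made-up example line for illustration.
example = [
    "##gff-version 3\n",
    "contig_1\texample\tCDS\t1\t300\t.\t+\t0\tID=gene_1;product=hypothetical%20protein\n",
]
features = parse_gff3(example)
print(features[0].type, features[0].start, features[0].end)  # CDS 1 300
print(features[0].attributes["product"])  # hypothetical protein
```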
microbiorust seqmetrics — Sequence metrics
from microbiorust import seqmetrics
sequence = "MKTLLLTLVVVTIVCLDLGAVGNGSSLSEDKDNVHK"
# Hydrophobicity score
window_size = 5
score = seqmetrics.hydrophobicity(sequence, window_size)
# Amino acid counts
counts = seqmetrics.amino_counts(sequence)
# Amino acid percentages
percentages = seqmetrics.amino_percentage(sequence)
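To show what these metrics compute, here is a pure-Python sketch of the same three quantities. It is not the library's implementation: the function names are hypothetical, and the use of the Kyte-Doolittle hydropathy scale with a sliding-window mean is an assumption about how a windowed hydrophobicity score is typically defined, so values may differ from what `seqmetrics` returns.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def count_residues(sequence):
    """Number of occurrences of each residue."""
    return dict(Counter(sequence.upper()))


def residue_percentages(sequence):
    """Percentage of the sequence made up by each residue."""
    total = len(sequence)
    if total == 0:
        return {}
    return {aa: 100.0 * n / total for aa, n in count_residues(sequence).items()}


def windowed_hydropathy(sequence, window_size):
    """Mean hydropathy of every window of `window_size` consecutive residues.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window_size]) / window_size
        for i in range(len(scores) - window_size + 1)
    ]


sequence = "MKTLLLTLVVVTIVCLDLGAVGNGSSLSEDKDNVHK"
counts = count_residues(sequence)
percentages = residue_percentages(sequence)
windows = windowed_hydropathy(sequence, 5)
print(counts["L"], len(windows))  # 7 32
```

A sequence of length n yields n - window_size + 1 windows, so the 36-residue example gives 32 scores for a window of 5.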
microbiorust align — Multiple sequence alignment
from microbiorust import align
# Subset a FASTA-format MSA by row and column, e.g.
align.subset_msa_alignment("input.fasta", "ids.txt", "output.fasta")
# where the first tuple, e.g. (0, 10), is a row-wise subset and
# the second tuple, e.g. (0, 100), is a column-wise subset
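The row and column subsetting idea, along with a simple consensus, can be sketched in plain Python. This illustrates the concept only: `subset_msa` and `consensus` are hypothetical helpers, the half-open `(start, end)` interpretation of the tuples is an assumption, and the majority-vote consensus may differ from the rule microbiorust's `get_consensus` applies.

```python
from collections import Counter


def subset_msa(records, rows, cols):
    """Keep rows[0]:rows[1] of the sequences and cols[0]:cols[1] of the columns.

    `records` is a list of (identifier, aligned_sequence) pairs; both ranges
    are treated as half-open, like Python slices.
    """
    row_start, row_end = rows
    col_start, col_end = cols
    return [(name, seq[col_start:col_end]) for name, seq in records[row_start:row_end]]


def consensus(records, gap="-"):
    """Most common non-gap residue per column; a gap if the column is all gaps."""
    if not records:
        return ""
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*(seq for _, seq in records)):
        residues = [c for c in column if c != gap]
        result.append(Counter(residues).most_common(1)[0][0] if residues else gap)
    return "".join(result)


msa = [("s1", "MKT-LV"), ("s2", "MKTALV"), ("s3", "MRT-LI")]
print(subset_msa(msa, (0, 2), (1, 4)))  # [('s1', 'KT-'), ('s2', 'KTA')]
print(consensus(msa))  # MKTALV
```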
Why Rust?
Rust gives microbiorust C-level performance with memory safety: no segfaults, no GIL limitations, and no need for NumPy or Pandas for core parsing operations. Because parsing runs in compiled Rust code rather than in the Python interpreter, large GenBank or EMBL files are parsed significantly faster than with equivalent pure-Python implementations.
Documentation
Full documentation: https://microbiorust.github.io/docs/
Source: https://github.com/microBioRust/microBioRust
License
MIT