raidx
High-performance FASTA file reader with Python bindings
raidx is a drop-in replacement for pyfaidx, implemented in Rust. In the benchmarks below it runs FASTA file operations roughly 3 to 4 times faster (2.90x to 4.13x) while maintaining full API compatibility.
⚡ Performance
raidx is fast:
| Operation | pyfaidx (ms) | raidx (ms) | Speedup |
|---|---|---|---|
| 🚀 File Opening | 0.254 | 0.068 | 3.72x faster |
| 🧬 Sequence Access | 0.252 | 0.061 | 4.13x faster |
| ✂️ Sequence Slicing | 0.259 | 0.077 | 3.35x faster |
| 🔍 get_seq Method | 0.268 | 0.071 | 3.76x faster |
| 🔄 Reverse Complement | 0.287 | 0.071 | 4.03x faster |
| 🔁 Sequence Iteration | 0.299 | 0.097 | 3.08x faster |
| 🎯 Random Access | 3.403 | 1.172 | 2.90x faster |
📊 Benchmarked on the hg38 human genome assembly with 1000 iterations per test
Installation
pip install .
Quick Start
raidx provides the same API as pyfaidx:
>>> from raidx import Fasta
>>> genome = Fasta('genome.fasta')
>>> genome
Fasta("genome.fasta")
# Access sequences like a dictionary
>>> genome['chr1'][1000:1100]
>chr1:1001-1100
ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC...
# Get sequence metadata
>>> seq = genome['chr1'][1000:1100]
>>> seq.name
'chr1'
>>> seq.start # 1-based
1001
>>> seq.end # 1-based, inclusive
1100
# String-like operations
>>> genome['chr1'][1000:1100].complement
>chr1 (complement):1001-1100
TACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACG...
>>> -genome['chr1'][1000:1100] # reverse complement
>chr1 (complement):1100-1001
GCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCAT...
# Method-based access
>>> genome.get_seq('chr1', 1001, 1100)
>chr1:1001-1100
ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGC...
# Iteration
>>> for record in genome:
...     print(f"{record.name}: {len(record)} bp")
chr1: 248956422 bp
chr2: 242193529 bp
...
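The slice and `get_seq` examples above describe the same 100 bases in two conventions: Python slices are 0-based and half-open, while the reported `start`/`end` values and the `get_seq` arguments are 1-based and inclusive. A minimal sketch of the conversion follows. The helper names `slice_to_one_based` and `one_based_to_slice` are hypothetical and are not part of raidx or pyfaidx.

```python
def slice_to_one_based(start, stop):
    """Convert a 0-based, half-open slice [start:stop] to 1-based inclusive (start, end)."""
    if not 0 <= start < stop:
        raise ValueError("expected 0 <= start < stop")
    # Only the start shifts by one; the exclusive stop equals the inclusive end.
    return start + 1, stop


def one_based_to_slice(start, end):
    """Convert 1-based inclusive (start, end) to a Python slice object."""
    if not 1 <= start <= end:
        raise ValueError("expected 1 <= start <= end")
    return slice(start - 1, end)


print(slice_to_one_based(1000, 1100))          # (1001, 1100)
print("ACGTACGT"[one_based_to_slice(2, 4)])    # CGT
```

Both conventions give the same length: `stop - start` for the slice and `end - start + 1` for the 1-based pair.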
Key Features
- Drop-in replacement for pyfaidx - same API, same behavior
- Memory-mapped I/O for efficient file access
- Rust performance with Python convenience
- Full compatibility with existing pyfaidx code
- Comprehensive indexing (.fai files compatible with samtools)
- Rich sequence objects with metadata and methods
- String-like operations (slicing, reverse, complement)
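Random access through a `.fai` index works because each index row records where a sequence starts in the file and how its lines are laid out. The sketch below illustrates that arithmetic using the five columns of the samtools faidx format (name, length, offset, bases per line, bytes per line). It is a simplified illustration on in-memory bytes, not raidx's Rust implementation; `FaiEntry`, `parse_fai_line`, `byte_offset`, and `fetch` are hypothetical names.

```python
from typing import NamedTuple


class FaiEntry(NamedTuple):
    name: str
    length: int     # total number of bases in the sequence
    offset: int     # byte offset of the first base in the FASTA file
    linebases: int  # bases per line
    linewidth: int  # bytes per line, including the line terminator


def parse_fai_line(line: str) -> FaiEntry:
    """Parse one tab-separated .fai row into a FaiEntry."""
    name, length, offset, linebases, linewidth = line.rstrip("\n").split("\t")[:5]
    return FaiEntry(name, int(length), int(offset), int(linebases), int(linewidth))


def byte_offset(entry: FaiEntry, pos: int) -> int:
    """Byte offset of the 0-based base position `pos` within the file."""
    full_lines, column = divmod(pos, entry.linebases)
    return entry.offset + full_lines * entry.linewidth + column


def fetch(data: bytes, entry: FaiEntry, start: int, stop: int) -> str:
    """Return bases [start:stop) (0-based, half-open) from raw FASTA bytes."""
    if not 0 <= start <= stop <= entry.length:
        raise ValueError("coordinates out of range")
    if start == stop:
        return ""
    first = byte_offset(entry, start)
    last = byte_offset(entry, stop - 1) + 1
    # The byte range may span line breaks, so strip them after reading.
    raw = data[first:last].replace(b"\n", b"").replace(b"\r", b"")
    return raw.decode("ascii")


# Toy FASTA: a 6-byte header line, then 14 bases wrapped at 6 bases per line.
data = b">chr1\nACGTAC\nGTACGT\nAC\n"
entry = parse_fai_line("chr1\t14\t6\t6\t7")
print(fetch(data, entry, 4, 9))  # ACGTA
```

With a memory-mapped file in place of `data`, the same offsets allow a reader to touch only the bytes that cover the requested range.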
API Compatibility
raidx implements the complete pyfaidx API:
# All pyfaidx features work identically
from raidx import Fasta
# Indexing and slicing
genome = Fasta('genome.fasta')
genome['chr1'][1000:2000]
genome[0][:100] # First sequence, first 100 bp
# Sequence operations
seq = genome['chr1'][1000:1100]
seq.complement
seq.reverse
-seq # reverse complement
# Method calls
genome.get_seq('chr1', 1000, 2000) # 1-based, inclusive coordinates
genome.keys()
len(genome)
# Iteration
for record in genome:
    print(record.name, len(record))
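For readers unfamiliar with the sequence operations listed above, the following plain-Python sketch shows what complement, reverse, and reverse complement compute on a DNA string. It is an illustration only, not the code raidx or pyfaidx runs, and the function names are hypothetical. Characters outside the translation table (for example IUPAC ambiguity codes other than N) are passed through unchanged.

```python
# Translation table for the four bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def complement(seq: str) -> str:
    """Swap each base for its pairing partner (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)


def reverse(seq: str) -> str:
    """Return the sequence read from the other end."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return reverse(complement(seq))


print(complement("ATGCATGC"))          # TACGTACG
print(reverse_complement("ATGCATGC"))  # GCATGCAT
```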
Benchmarking
raidx includes two benchmarking approaches for different use cases:
pytest-benchmark
Use the organized benchmarks/ directory with pytest-benchmark for development, CI/CD, and detailed performance analysis:
# Install benchmark dependencies
pip install -e ".[benchmark]"
# Run all benchmarks
pytest benchmarks/
# Run specific benchmark categories
pytest benchmarks/benchmark_file_ops.py # File operations
pytest benchmarks/benchmark_sequence_ops.py # Sequence operations
# Save and compare results
pytest benchmarks/ --benchmark-save=baseline
pytest benchmarks/ --benchmark-compare=baseline
Standalone Benchmarks
Use the small benchmark tool for quick performance comparisons on your own files:
# Benchmark your files
python benchmark_raidx.py your_genome.fasta
# Adjust benchmarking details
python benchmark_raidx.py genome.fasta --iterations 1000 --random-access 500
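The figures in the performance table are mean times per operation and their ratio. A minimal sketch of that kind of measurement is shown below, assuming a simple wall-clock loop. It is not the implementation of `benchmark_raidx.py`, and `mean_ms` and `speedup` are hypothetical helper names.

```python
import time


def mean_ms(fn, iterations=1000):
    """Mean wall-clock time per call of `fn`, in milliseconds."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Ratio for the "Sequence Access" row of the table above.
print(f"{speedup(0.252, 0.061):.2f}x")  # 4.13x
```

To compare the two libraries, pass `mean_ms` a zero-argument callable for each, for example `lambda: genome['chr1'][1000:1100]` with `genome` opened by each library in turn.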
Why raidx?
raidx provides the familiar pyfaidx interface with the performance of Rust underneath, which suits pipelines that need to scale.
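Because both packages expose a `Fasta` class, existing code can adopt raidx gradually by preferring it when installed and falling back to pyfaidx otherwise. The sketch below shows one way to do that; `load_fasta_class` is a hypothetical helper, not something either package provides.

```python
import importlib


def load_fasta_class(preferred=("raidx", "pyfaidx")):
    """Return (Fasta class, module name) from the first importable module."""
    for module_name in preferred:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # Not installed; try the next candidate.
        return getattr(module, "Fasta"), module_name
    raise ImportError(f"none of {preferred!r} could be imported")
```

A caller would then write `Fasta, backend = load_fasta_class()` once and use `Fasta('genome.fasta')` as in the examples above.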
File details
Details for the file raidx-0.1.0.tar.gz.
File metadata
- Download URL: raidx-0.1.0.tar.gz
- Upload date:
- Size: 103.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 445604d658ec38c802dd021e1e033fd99c9837cfe12c980dee93ad444249d25b |
| MD5 | dbc58e8c24c89ca1d543759363e8a301 |
| BLAKE2b-256 | 5600f5292eb270ec399a282bf8c4d00b3db15950f8f3a15c9094ee0f63799faa |
File details
Details for the file raidx-0.1.0-cp311-cp311-macosx_11_0_arm64.whl.
File metadata
- Download URL: raidx-0.1.0-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 255.1 kB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba8c25fc375d03100e5529daab1bc7325225fa5f435054ad3900e431942827ea |
| MD5 | 0a30ff18542b5180fea9fa75daf34c6f |
| BLAKE2b-256 | 92c2dbfe652de07638455f2bc4d21a149bfc11ed7ba5e4804b5057282feca97f |