A high-performance Rust library for weighted finite-state transducers with Python bindings
ArcWeight
A high-performance Rust library for weighted finite-state transducers with comprehensive semiring support.
ArcWeight provides efficient algorithms for constructing, combining, and optimizing weighted finite-state transducers (WFSTs), making it suitable for natural language processing, speech recognition, and computational linguistics applications.
Features
- Core FST Operations: Composition, determinization, minimization, closure, union, concatenation
- Advanced Algorithms: Shortest path, weight pushing, epsilon removal, pruning, synchronization
- Rich Semiring Support: Tropical, log, probability, boolean, integer, product, and Gallic weights
- Multiple FST Implementations: Vector-based, constant, compact, lazily evaluated, and cached
- Type-Safe Design: Zero-cost abstractions with trait-based polymorphism
- OpenFST Compatible: Read and write OpenFST format files
- Python Bindings: Full-featured Python API via PyO3 for easy integration
- Pure Rust: Memory-safe implementation with no C++ dependencies
- Parallel Processing: Optional Rayon-based parallelization for large FSTs
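To make the semiring idea in the feature list concrete, here is a minimal, self-contained sketch of two of the listed weight types. It does not use ArcWeight; the names `Semiring`, `Tropical`, `Boolean`, `path_weight`, and `combine_paths` are hypothetical and chosen only for illustration. The operations themselves are the standard definitions: the tropical semiring uses min as addition and + as multiplication, and the boolean semiring uses OR and AND.

```rust
// Illustrative sketch only: these names are hypothetical, not ArcWeight's API.

/// A semiring supplies two operations and their identities.
trait Semiring: Copy + PartialEq + std::fmt::Debug {
    fn zero() -> Self; // identity for `plus`, annihilator for `times`
    fn one() -> Self; // identity for `times`
    fn plus(self, other: Self) -> Self; // combines alternative paths
    fn times(self, other: Self) -> Self; // extends a path by one arc
}

/// Tropical semiring: (min, +) over non-negative costs, with infinity as zero.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Tropical(f64);

impl Semiring for Tropical {
    fn zero() -> Self { Tropical(f64::INFINITY) }
    fn one() -> Self { Tropical(0.0) }
    fn plus(self, other: Self) -> Self { Tropical(self.0.min(other.0)) }
    fn times(self, other: Self) -> Self { Tropical(self.0 + other.0) }
}

/// Boolean semiring: (OR, AND), which models plain acceptance.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Boolean(bool);

impl Semiring for Boolean {
    fn zero() -> Self { Boolean(false) }
    fn one() -> Self { Boolean(true) }
    fn plus(self, other: Self) -> Self { Boolean(self.0 || other.0) }
    fn times(self, other: Self) -> Self { Boolean(self.0 && other.0) }
}

/// Weight of a single path: the `times`-product of its arc weights.
fn path_weight<W: Semiring>(arc_weights: &[W]) -> W {
    arc_weights.iter().fold(W::one(), |acc, &w| acc.times(w))
}

/// Weight of a set of alternative paths: the `plus`-sum of their weights.
fn combine_paths<W: Semiring>(path_weights: &[W]) -> W {
    path_weights.iter().fold(W::zero(), |acc, &w| acc.plus(w))
}
```

With these definitions, `path_weight(&[Tropical(1.0), Tropical(0.5)])` adds the costs along a path, while `combine_paths` keeps the cheaper of two alternatives. Writing algorithms against a trait like this is what lets one implementation serve several weight types.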
Quick Start
Add ArcWeight to your Cargo.toml:
[dependencies]
arcweight = "0.2"
Basic Example
use arcweight::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a simple FST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set start and final states
    fst.set_start(s0);
    fst.set_final(s2, TropicalWeight::one());

    // Add arcs
    fst.add_arc(s0, Arc::new(1, 1, TropicalWeight::one(), s1));
    fst.add_arc(s1, Arc::new(2, 2, TropicalWeight::one(), s2));

    // Perform operations
    let minimized = minimize(&fst)?;
    println!("Original states: {}", fst.num_states());
    println!("Minimized states: {}", minimized.num_states());

    Ok(())
}
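Every arc in the example above carries the tropical one, which is 0 in the (min, +) semiring, so the single path costs nothing. To show what tropical weights do once arcs have real costs, the following sketch computes single-source shortest distances over a plain adjacency list. It is independent of ArcWeight: `PlainArc`, `shortest_distance`, and `demo_fst` are hypothetical names, labels are omitted because shortest distance ignores them, and the relaxation loop is a simple Bellman-Ford pass rather than whatever strategy the library itself uses.

```rust
// Illustrative sketch only: plain data structures, not ArcWeight's types.

/// An arc reduced to what shortest distance needs: a cost and a target state.
struct PlainArc {
    weight: f64,
    next: usize,
}

/// Tropical shortest distance from `start` to every state.
/// Relaxes all arcs at most n-1 times, so it assumes there are no negative cycles.
fn shortest_distance(arcs: &[Vec<PlainArc>], start: usize) -> Vec<f64> {
    let n = arcs.len();
    let mut dist = vec![f64::INFINITY; n]; // tropical zero: unreachable
    dist[start] = 0.0; // tropical one: the empty path
    for _ in 1..n {
        let mut changed = false;
        for s in 0..n {
            if dist[s].is_infinite() {
                continue; // nothing to propagate from an unreached state
            }
            for a in &arcs[s] {
                let candidate = dist[s] + a.weight; // tropical times
                if candidate < dist[a.next] {
                    dist[a.next] = candidate; // tropical plus keeps the minimum
                    changed = true;
                }
            }
        }
        if !changed {
            break; // distances have converged
        }
    }
    dist
}

/// Three states: 0 -> 1 (cost 1), 0 -> 2 (cost 5), 1 -> 2 (cost 2).
fn demo_fst() -> Vec<Vec<PlainArc>> {
    vec![
        vec![PlainArc { weight: 1.0, next: 1 }, PlainArc { weight: 5.0, next: 2 }],
        vec![PlainArc { weight: 2.0, next: 2 }],
        vec![],
    ]
}
```

Calling `shortest_distance(&demo_fst(), 0)` reaches state 2 at cost 3.0 through state 1, which beats the direct arc of cost 5.0.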
Python Bindings
ArcWeight also provides Python bindings for easy integration into Python projects:
pip install arcweight
import arcweight
# Create a new FST
fst = arcweight.VectorFst()
# Add states
s0 = fst.add_state()
s1 = fst.add_state()
# Set start state
fst.set_start(s0)
# Add an arc: from s0 to s1, input=1, output=1, weight=1.0
fst.add_arc(s0, 1, 1, 1.0, s1)
# Set final state
fst.set_final(s1, 0.5)
# Perform operations
minimized = arcweight.minimize(fst)
# Composition takes two FSTs; fst1 and fst2 stand for any two FSTs built as above
composed = arcweight.compose(fst1, fst2)
The Python API provides full access to all FST operations and algorithms. See the Python bindings documentation for more details.
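Composition, the last operation in the example above, chains two transducers by matching the output labels of the first against the input labels of the second. The sketch below shows the textbook construction for the epsilon-free case with tropical weights, written in Rust over plain data structures. It is not ArcWeight's implementation, and `PlainArc`, `PlainFst`, `compose`, and `single_arc_fst` are hypothetical names introduced for illustration; handling epsilon labels needs an additional filter that is left out here.

```rust
// Illustrative sketch only: epsilon-free composition, not ArcWeight's code.
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct PlainArc {
    ilabel: u32,
    olabel: u32,
    weight: f64,
    next: usize,
}

struct PlainFst {
    start: usize,
    finals: Vec<Option<f64>>, // Some(w) if the state is final with weight w
    arcs: Vec<Vec<PlainArc>>, // outgoing arcs, indexed by state
}

/// Compose `a` and `b`: states of the result are reachable pairs (p, q).
fn compose(a: &PlainFst, b: &PlainFst) -> PlainFst {
    let mut out = PlainFst { start: 0, finals: vec![None], arcs: vec![Vec::new()] };
    let mut ids: HashMap<(usize, usize), usize> = HashMap::new();
    ids.insert((a.start, b.start), 0);
    let mut pending = vec![(a.start, b.start)];

    while let Some((p, q)) = pending.pop() {
        let src = ids[&(p, q)];
        // A pair is final only if both components are; weights multiply (tropical: add).
        if let (Some(wa), Some(wb)) = (a.finals[p], b.finals[q]) {
            out.finals[src] = Some(wa + wb);
        }
        for x in &a.arcs[p] {
            for y in b.arcs[q].iter().filter(|y| y.ilabel == x.olabel) {
                let pair = (x.next, y.next);
                let existing = ids.get(&pair).copied();
                let dst = match existing {
                    Some(id) => id,
                    None => {
                        // First visit: allocate a new result state for this pair.
                        let id = out.arcs.len();
                        ids.insert(pair, id);
                        out.arcs.push(Vec::new());
                        out.finals.push(None);
                        pending.push(pair);
                        id
                    }
                };
                out.arcs[src].push(PlainArc {
                    ilabel: x.ilabel,
                    olabel: y.olabel,
                    weight: x.weight + y.weight,
                    next: dst,
                });
            }
        }
    }
    out
}

/// Two-state helper: 0 --ilabel:olabel/weight--> 1, with state 1 final.
fn single_arc_fst(ilabel: u32, olabel: u32, weight: f64, final_weight: f64) -> PlainFst {
    PlainFst {
        start: 0,
        finals: vec![None, Some(final_weight)],
        arcs: vec![vec![PlainArc { ilabel, olabel, weight, next: 1 }], Vec::new()],
    }
}
```

Composing `single_arc_fst(1, 2, 1.0, 0.0)` with `single_arc_fst(2, 3, 0.5, 0.25)` yields one arc that maps label 1 to label 3 at cost 1.5. If the middle labels do not match, the result contains only the non-final start state.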
Examples
ArcWeight includes comprehensive examples demonstrating real-world applications:
# String edit distance
cargo run --example edit_distance
# Spell checking and correction
cargo run --example spell_checking
# Morphological analysis
cargo run --example morphological_analyzer
# Phonological rules
cargo run --example phonological_rules
# Text normalization
cargo run --example number_date_normalizer
See the examples/ directory for complete implementations with detailed explanations.
Documentation
- API Documentation - Complete API reference with examples
- Examples - Real-world applications and usage patterns
Minimum Supported Rust Version (MSRV)
ArcWeight requires Rust 1.85.0 or later.
The MSRV is explicitly tested in CI and will only be increased in minor version updates. When the MSRV is increased, the previous two stable releases will still be supported for six months.
Performance
ArcWeight is designed for high performance:
- Zero-copy arc iteration minimizes allocations
- Cache-friendly data structures optimize memory access
- Optional parallel algorithms leverage multi-core processors
- Automatic algorithm selection based on FST properties
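As one way to picture the first two points, the sketch below packs all arcs into a single contiguous vector and hands out borrowed slices per state, so iterating a state's arcs neither allocates nor copies. This layout is an assumption made for illustration and is not taken from ArcWeight's source; `PlainArc`, `PackedFst`, and `demo_packed` are hypothetical names.

```rust
// Illustrative sketch only: one possible cache-friendly layout, not ArcWeight's.

#[derive(Clone, Copy, Debug, PartialEq)]
struct PlainArc {
    ilabel: u32,
    olabel: u32,
    weight: f64,
    next: usize,
}

/// All arcs live in one Vec; `offsets[s]..offsets[s + 1]` delimits state s.
struct PackedFst {
    offsets: Vec<usize>,
    arcs: Vec<PlainArc>,
}

impl PackedFst {
    /// Flatten a per-state adjacency list into the packed layout.
    fn from_adjacency(adjacency: &[Vec<PlainArc>]) -> Self {
        let mut offsets = Vec::with_capacity(adjacency.len() + 1);
        let mut arcs = Vec::new();
        offsets.push(0);
        for state_arcs in adjacency {
            arcs.extend_from_slice(state_arcs);
            offsets.push(arcs.len());
        }
        PackedFst { offsets, arcs }
    }

    fn num_states(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Borrow the arcs leaving `state` as a slice into the shared storage.
    fn arcs(&self, state: usize) -> &[PlainArc] {
        &self.arcs[self.offsets[state]..self.offsets[state + 1]]
    }
}

/// Three states: state 0 has two arcs, state 1 has none, state 2 has one.
fn demo_packed() -> PackedFst {
    PackedFst::from_adjacency(&[
        vec![
            PlainArc { ilabel: 1, olabel: 1, weight: 0.5, next: 1 },
            PlainArc { ilabel: 2, olabel: 2, weight: 1.0, next: 2 },
        ],
        vec![],
        vec![PlainArc { ilabel: 3, olabel: 3, weight: 0.0, next: 0 }],
    ])
}
```

Because `arcs` returns `&[PlainArc]`, a loop such as `for arc in fst.arcs(0)` walks memory sequentially and the borrow checker guarantees the storage is not mutated during iteration. The trade-off is that adding arcs after construction is expensive, which is why a layout like this suits read-only FSTs.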
Run benchmarks on your system:
cargo bench
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Quick checklist:
- Follow existing code style (run cargo fmt)
- Add tests for new functionality (run cargo test)
- Update documentation for public APIs (run cargo doc)
- Ensure all CI checks pass (run cargo clippy)
Getting Help
- Documentation - API reference and guides
- Issues - Bug reports and feature requests
- Discussions - Questions and community support
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.
Citation
If you use ArcWeight in your research, please cite:
@software{arcweight,
author = {White, Aaron Steven},
title = {ArcWeight: A Rust Library for Weighted Finite-State Transducers},
url = {https://github.com/aaronstevenwhite/arcweight},
doi = {10.5281/zenodo.17371992},
year = {2025}
}
References
ArcWeight implements algorithms based on:
- Mehryar Mohri. 1997. Finite-State Transducers in Language and Speech Processing. Computational Linguistics 23(2):269-311.
- Mehryar Mohri. 2002. Semiring Frameworks and Algorithms for Shortest-Distance Problems. Journal of Automata, Languages and Combinatorics 7(3):321-350.
- Mehryar Mohri. 2009. Weighted Automata Algorithms. In Handbook of Weighted Automata, pages 213-254. Springer.
Acknowledgments
This library was architected and implemented with the help of Claude Code.
File details
Details for the file arcweight-0.2.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: arcweight-0.2.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 630.3 kB
- Tags: CPython 3.8+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2bc2d9ab733244ee8650bbdba104fea16cac1238236f99ae5469102757b88e36 |
| MD5 | 4b9c55b94d0a6b6184896d7789ad1ee2 |
| BLAKE2b-256 | 0dbc21f2fe8de03e64e8736003334e485e90f352b2c786c075afad7e4aed5771 |
File details
Details for the file arcweight-0.2.0-cp38-abi3-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.
File metadata
- Download URL: arcweight-0.2.0-cp38-abi3-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.8+, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4e2b300a81e1dfd2fefa38707cb5ac8c715b82437b233b3c74d20e96d6e1b294 |
| MD5 | 2c2216becba74b9467153f50132db5e6 |
| BLAKE2b-256 | bdc10de09dd40b572eebd5b44f23e949335183f849b0cb5ccf72415e7e326815 |