Suffix smoothing classifier v0.3.0: 1.8x faster, KN memory fix, model merging, and feature importance.
suffix-smoother
A high-performance, production-ready sequence classifier using recursive suffix smoothing.
Zero neural networks. Zero model files. Zero corpus downloads. Handles OOV via progressive backoff.
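
To make "progressive backoff" concrete, here is a minimal sketch of the general idea. It is not the suffix-smoother implementation. It assumes class counts are stored per suffix and that each longer suffix's estimate is smoothed toward the estimate from the next shorter suffix. The class name `ToySuffixClassifier` and the parameters `max_order` and `alpha` are hypothetical and used only for illustration.

```python
from collections import defaultdict


class ToySuffixClassifier:
    """Toy illustration of suffix backoff; not the library's code."""

    def __init__(self, n_classes, max_order=3, alpha=1.0):
        self.n_classes = n_classes
        self.max_order = max_order  # longest suffix length stored
        self.alpha = alpha          # weight given to the shorter-suffix estimate
        # counts[suffix][label] -> number of training sequences ending in suffix
        self.counts = defaultdict(lambda: [0] * n_classes)

    def train(self, data):
        for seq, label in data:
            seq = tuple(seq)
            # Record the label under every suffix, including the empty one.
            for k in range(min(self.max_order, len(seq)) + 1):
                self.counts[seq[len(seq) - k:]][label] += 1

    def predict_proba(self, seq):
        seq = tuple(seq)
        # Start from a uniform distribution and refine with longer suffixes.
        probs = [1.0 / self.n_classes] * self.n_classes
        for k in range(min(self.max_order, len(seq)) + 1):
            suffix = seq[len(seq) - k:]
            if suffix not in self.counts:
                break  # unseen suffix: keep the shorter-suffix estimate
            c = self.counts[suffix]
            total = sum(c)
            probs = [(c[i] + self.alpha * probs[i]) / (total + self.alpha)
                     for i in range(self.n_classes)]
        return probs


clf = ToySuffixClassifier(n_classes=2)
clf.train([(("a", "b"), 0), (("c", "b"), 0), (("a", "d"), 1)])
# "z" never appeared in training, but the suffix ("b",) did.
probs = clf.predict_proba(("z", "b"))
```

In this toy version an out-of-vocabulary token only stops the refinement early: the prediction falls back to whatever the longest previously seen suffix supports.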
What's New in v0.3.0
- 1.8x - 2x Speedup: Vectorized core loops and optimized backoff weight caching.
- Advanced Conformal Prediction:
  - APS (Adaptive Prediction Sets): State-of-the-art coverage guarantees.
  - Online Calibration: `update_calibration()` for incremental refinement.
  - Drift Detection: `detect_calibration_drift()` to monitor reliability in production.
- Industrial Memory Management:
  - KN Optimization: 44% memory reduction for Kneser-Ney models.
  - Budget Pruning: `prune_to_budget()` ensures the model fits within strict RAM constraints.
- Collaborative Learning:
  - Weighted Merging: `merge_weighted()` for domain adaptation and ensemble fusion.
  - Sharded Fusion: `merge_all()` for large-scale distributed training.
- Deep Interpretability:
  - Feature Importance: Rank suffixes by discriminative power (KL divergence).
  - Label Insight: `label_importance()` finds motifs for specific classes.
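
The feature-importance item ranks suffixes by KL divergence. One plausible reading, sketched below under stated assumptions, scores each suffix by the KL divergence between its class distribution and the global class distribution. The helper names `kl_divergence` and `rank_suffixes`, the add-one smoothing, and the pooled prior are illustrative choices, not the library's API or its exact formula.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def rank_suffixes(suffix_counts, smoothing=1.0):
    """Rank suffixes by how far their class distribution is from the global one.

    suffix_counts: dict mapping a suffix tuple to a list of per-class counts.
    """
    n_classes = len(next(iter(suffix_counts.values())))
    # Global class distribution, pooled over all suffixes (an assumption).
    totals = [sum(c[i] for c in suffix_counts.values()) for i in range(n_classes)]
    grand = sum(totals) + smoothing * n_classes
    prior = [(t + smoothing) / grand for t in totals]

    scores = {}
    for suffix, c in suffix_counts.items():
        denom = sum(c) + smoothing * n_classes
        cond = [(ci + smoothing) / denom for ci in c]
        scores[suffix] = kl_divergence(cond, prior)
    # Highest divergence first: the most class-discriminative suffixes.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


counts = {
    ("i", "n", "g"): [40, 2],  # mostly class 0
    ("l", "y"): [1, 30],       # mostly class 1
    ("e",): [20, 20],          # evenly split, so uninformative
}
ranking = rank_suffixes(counts)
```

A suffix whose class distribution matches the global one scores near zero, while a suffix that almost always predicts a single class scores high.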
Install
pip install suffix-smoother
Quick Start
from suffix_smoother import SuffixSmoother, SuffixConfig
# 1. Train and Predict
cfg = SuffixConfig(n_classes=10, max_nodes=5000) # Budgeted memory
model = SuffixSmoother(cfg)
model.train(data) # list of (seq_tuple, label_int)
# 2. Vectorized High-Throughput Inference
results = model.predict_batch(sequences)
# 3. Model Merging (Domain Adaptation)
general_model = SuffixSmoother.load("general.pkl")
medical_model = SuffixSmoother.load("medical.pkl")
# Fuse knowledge: medical knowledge is 5x more important for this deployment
fused = SuffixSmoother.merge_weighted(general_model, medical_model, w_a=1.0, w_b=5.0)
# 4. Conformal Reliability
fused.calibrate(val_data, score_type="aps")
prediction_set = fused.predict_set(seq, coverage=0.95)
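
For readers unfamiliar with APS, the sketch below shows how adaptive prediction sets are commonly built from class probabilities. It is independent of this package and does not show what `calibrate()` or `predict_set()` do internally. It assumes the non-randomized APS variant and always keeps the top class so the set is never empty. The function names `aps_score`, `aps_threshold`, and `aps_predict_set` are hypothetical.

```python
import math


def aps_score(probs, label):
    """Cumulative probability of all classes ranked at or above `label`."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    total = 0.0
    for i in order:
        total += probs[i]
        if i == label:
            return total
    raise ValueError("label out of range")


def aps_threshold(cal_probs, cal_labels, coverage=0.95):
    """Conformal quantile of calibration scores with finite-sample correction."""
    scores = sorted(aps_score(p, y) for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * coverage)
    # Too few calibration points for this coverage: fall back to all classes.
    return scores[k - 1] if k <= n else float("inf")


def aps_predict_set(probs, threshold):
    """Classes whose APS score is within the threshold; top class always kept."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, total = [order[0]], probs[order[0]]
    for i in order[1:]:
        total += probs[i]
        if total > threshold:
            break
        chosen.append(i)
    return chosen


cal_probs = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.4, 0.4, 0.2]]
cal_labels = [0, 1, 0, 2]
threshold = aps_threshold(cal_probs, cal_labels, coverage=0.5)
pred_set = aps_predict_set([0.5, 0.3, 0.2], threshold)
```

Confident predictions yield small sets and ambiguous ones yield larger sets, which is the "adaptive" part. A real calibration set needs far more than four examples. With only four, a 95% target cannot be met and the threshold falls back to including every class.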
Performance (v0.3.0)
| Operation | v0.2.1 | v0.3.0 | Improvement |
|---|---|---|---|
| Inference (Top-1) | 14.1 μs | 7.0 μs | 2.0x |
| Batch Throughput | 6,000/s | 140,000/s | 23x |
| KN Memory | 100% | 56% | -44% |
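
The Improvement column follows directly from the other two columns. The short check below recomputes it from the values in the table; the variable names are only for illustration.

```python
# Values copied from the performance table.
latency_old_us, latency_new_us = 14.1, 7.0
throughput_old, throughput_new = 6_000, 140_000
kn_memory_new_pct = 56

latency_speedup = latency_old_us / latency_new_us
throughput_gain = throughput_new / throughput_old
memory_reduction_pct = 100 - kn_memory_new_pct

print(f"latency speedup: {latency_speedup:.1f}x")
print(f"throughput gain: {throughput_gain:.0f}x")
print(f"KN memory reduction: {memory_reduction_pct}%")
```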
License
MIT