Hardware-adaptive multilingual ASR library for the Lattice family. Pluggable engines (Parakeet, Whisper, Distil-Whisper), optional diarization, streaming.
lattice-asr
Status: v0.1 implementation in progress. W1 (foundation), W2 (Whisper + LID + Transcriber), W3.1 (ParakeetMlxEngine), W3.2 (ParakeetTdtEngine), W4 (RemoteEngine + lattice-asr-server), W5 (diarization), and W6.1 (perf-gate skeleton) have landed; W3-future (WhisperCppEngine for Apple Silicon multilingual) is deferred past v0.1; W6.2 (CI workflow), W6.3 (this rewrite), and v0.1.0 (tag + PyPI) are pending. All three v0.1 perf gates clear via the wrappers on canonical hosts: C1 RTF 45.43× on Switch (Apple M4 Pro, parakeet-mlx), C2 RTF 115.12× on Cypher (RTX 2070 Turing, parakeet-tdt), C3 RTF 3.35× on Switch (faster-whisper distil-large-v3 int8). 81 r_tier tests passing. See CHANGELOG.md for the unreleased detail and docs/performance-baseline.md for the canonical baselines.
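For readers unfamiliar with the "×" figures, the sketch below shows one common way to compute such a number. It assumes the reported RTF is a speed factor (seconds of audio processed per second of wall-clock time); that definition, the function names, and the threshold value are illustrative assumptions, not taken from the perf-gate code.

```python
def speed_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time.

    Assumed definition: the project may define RTF differently.
    """
    if audio_seconds < 0 or wall_seconds <= 0:
        raise ValueError("audio_seconds must be >= 0 and wall_seconds > 0")
    return audio_seconds / wall_seconds


def passes_gate(audio_seconds: float, wall_seconds: float, threshold: float) -> bool:
    """True when the measured speed factor meets a (hypothetical) gate threshold."""
    return speed_factor(audio_seconds, wall_seconds) >= threshold


# Hypothetical measurement: a 600 s clip transcribed in 12 s of wall time.
factor = speed_factor(600.0, 12.0)
print(f"{factor:.2f}x")  # 50.00x
```

Under this reading, a larger number means faster transcription, which matches the ordering of the three gates above.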
What it is
A Lattice-layer library that lifts speech-to-text from per-consumer ad-hoc plumbing into a single pluggable surface. Consumers (lattice-meetbot for meeting transcription, lattice-dictate for push-to-talk dictation, WhatsApp voice-message decryption, future ambient capture) all transcribe through the same Transcriber interface.
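As a rough illustration of what a "single pluggable surface" can look like, the sketch below puts a structural engine interface behind one front-end class. Every name here (`Segment`, `Engine`, `EchoEngine`, `SketchTranscriber`) is hypothetical and chosen for the example; the real Transcriber API may differ.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Segment:
    """One transcribed span of audio (hypothetical shape)."""
    start: float
    end: float
    text: str


class Engine(Protocol):
    """Anything with a matching transcribe() can be plugged in."""
    name: str

    def transcribe(self, samples: list[float], sample_rate: int) -> list[Segment]: ...


class EchoEngine:
    """Stand-in engine: no model, returns one placeholder segment."""
    name = "echo"

    def transcribe(self, samples: list[float], sample_rate: int) -> list[Segment]:
        duration = len(samples) / sample_rate
        return [Segment(0.0, duration, "<placeholder>")]


class SketchTranscriber:
    """Consumers call this one class regardless of which engine is loaded."""

    def __init__(self, engine: Engine) -> None:
        self._engine = engine

    def transcribe(self, samples: list[float], sample_rate: int = 16000) -> list[Segment]:
        return self._engine.transcribe(samples, sample_rate)


transcriber = SketchTranscriber(EchoEngine())
segments = transcriber.transcribe([0.0] * 16000)  # one second of silence
print(segments[0].end)  # 1.0
```

Swapping `EchoEngine` for another object with the same method leaves consumer code unchanged, which is the point of the design.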
What it solves
- Hardware adaptivity: picks Parakeet-TDT (NeMo) on NVIDIA, parakeet-mlx on Apple Silicon, and Distil-Whisper via faster-whisper on CPU-only hosts (arm64 or x86_64; the non-GPU fallback path). Single API across all paths.
- Multilingual: dual-engine load. The English route uses Parakeet (fastest, English-only); the non-English route uses Whisper-large-v3-turbo or Distil-Whisper. Routing is done by Silero LID on the first 1.5 s of audio.
- Optional diarization: `transcribe(..., diarize=True)` returns segments with speaker labels (pyannote.audio CPU/GPU, NVIDIA Sortformer GPU).
- Streaming: VAD-segment-bounded partial results for ambient/meeting use.
- Remote engine: the `RemoteEngine` adapter forwards transcription to a network endpoint speaking the lattice-asr wire protocol; lets a CPU-only laptop offload to a GPU host.
- Telemetry-injected: every call records duration, engine, language, and audio length to a consumer-supplied sink. No hidden coupling.
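The selection and routing rules in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Hardware` type, the function names, the engine name strings, and the injected `detect_language` callable (standing in for the LID model) are all hypothetical, and the sketch reports only engine, language, and audio length to the sink because nothing is actually transcribed or timed.

```python
from dataclasses import dataclass
from typing import Callable

LID_WINDOW_SECONDS = 1.5  # language ID looks at the first 1.5 s of audio


@dataclass(frozen=True)
class Hardware:
    """Hypothetical result of a hardware probe."""
    has_nvidia_gpu: bool
    is_apple_silicon: bool


def pick_english_engine(hw: Hardware) -> str:
    """Choose the English-route engine for the detected hardware."""
    if hw.has_nvidia_gpu:
        return "parakeet-tdt"
    if hw.is_apple_silicon:
        return "parakeet-mlx"
    return "distil-whisper"  # CPU-only fallback


def lid_window(samples: list[float], sample_rate: int) -> list[float]:
    """Return at most the first 1.5 s of samples for language ID."""
    return samples[: int(LID_WINDOW_SECONDS * sample_rate)]


def route(
    samples: list[float],
    sample_rate: int,
    hw: Hardware,
    detect_language: Callable[[list[float]], str],
    sink: Callable[[dict], None],
) -> str:
    """Pick an engine name and report the decision to the caller's sink."""
    language = detect_language(lid_window(samples, sample_rate))
    if language == "en":
        engine = pick_english_engine(hw)
    else:
        engine = "whisper-large-v3-turbo"  # one of the two non-English options
    sink({
        "engine": engine,
        "language": language,
        "audio_seconds": len(samples) / sample_rate,
    })
    return engine


events: list[dict] = []
chosen = route([0.0] * 32000, 16000, Hardware(False, True), lambda _: "en", events.append)
print(chosen)  # parakeet-mlx
```

Passing the sink in as a plain callable keeps the library free of any dependency on a particular telemetry backend, which is one way to read "no hidden coupling".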
Non-goals (v0.1)
Speaker identification (deferred to a future lattice-voiceprint), on-device fine-tuning, sub-100 ms streaming partials, audio enhancement, translation, browser/mobile bindings.
Status
| Phase | State |
|---|---|
| Spec | Locked — vault 02_Projects/Lattice/lattice-asr/Specifications/2026-04-27 lattice-asr v1 - Design Spec.md |
| Plan | Ratified and in execution — vault 02_Projects/Lattice/lattice-asr/Plans/2026-05-08 lattice-asr v0.1 - Implementation Plan.md |
| W1 Foundation | Landed (hardware probe, types, telemetry, config) |
| W2 Whisper + LID + Transcriber MVP | Landed |
| W3.1 ParakeetMlxEngine (Apple Silicon EN, MLX runtime) | Landed — verified Switch RTF 45.43× via wrapper |
| W3.2 ParakeetTdtEngine (NVIDIA EN, NeMo runtime) | Landed — verified Cypher RTX 2070 RTF 115.12× via wrapper |
| W3-future WhisperCppEngine (Apple Silicon multilingual) | Deferred past v0.1 |
| W4 RemoteEngine + lattice-asr-server | Landed |
| W5 Diarization | Landed (pyannote + sortformer adapters + Transcriber wire-in; real-model exercise gated on HF_TOKEN / NeMo) |
| W6 Ship (perf, CI, release) | Partial — W6.1 perf-gate skeleton landed and all three C1/C2/C3 gates pass via wrappers on canonical hosts; W6.2 CI workflow + W6.3 README rewrite + v0.1.0 tag pending. Canonical hosts: C1+C3 on Switch (Apple M4 Pro Mac mini), C2 on Cypher (Linux + RTX 2070). |
| First consumer | lattice-dictate (planned) |
| Second consumer | lattice-meetbot (refactored transcription path) |
License
Apache-2.0 (Lattice default). See LICENSE.
Family
Part of the Lattice family. Sibling libraries: lattice-meetbot, lattice-meeting, lattice-watch, lattice-dictate (planned), lattice-recall (planned).