Hardware-adaptive multilingual ASR library for the Lattice family. Pluggable engines (Parakeet, Whisper, Distil-Whisper), optional diarization, streaming.

lattice-asr

Hardware-adaptive multilingual ASR library for the Lattice family.

Status: v0.1 implementation in progress. Landed: W1 (foundation), W2 (Whisper + LID + Transcriber), W3.1 (ParakeetMlxEngine), W3.2 (ParakeetTdtEngine), W4 (RemoteEngine + lattice-asr-server), W5 (diarization), and W6.1 (perf-gate skeleton). Deferred past v0.1: W3-future (WhisperCppEngine for Apple Silicon multilingual). Pending: W6.2 (CI workflow), W6.3 (this README rewrite), and v0.1.0 (tag + PyPI). All three v0.1 perf gates pass via the wrappers on the canonical hosts: C1 RTF 45.43× on Switch (Apple M4 Pro, parakeet-mlx), C2 RTF 115.12× on Cypher (RTX 2070 Turing, parakeet-tdt), C3 RTF 3.35× on Switch (faster-whisper distil-large-v3 int8). 81 r_tier tests pass. See CHANGELOG.md for unreleased changes and docs/performance-baseline.md for the canonical baselines.
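To make the RTF figures in the status paragraph concrete, the sketch below converts them into wall-clock time for one hour of audio. It assumes "RTF N×" means audio duration divided by processing time (a speed-up factor); the README does not define the term, and the function names are illustrative rather than part of lattice-asr.

```python
def speedup_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Audio duration divided by processing time (assumed meaning of 'RTF N×')."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def processing_time(audio_seconds: float, factor: float) -> float:
    """Wall-clock seconds needed to transcribe audio at a given speed-up factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_seconds / factor


# Gate figures quoted in the status paragraph.
GATES = {"C1": 45.43, "C2": 115.12, "C3": 3.35}

for label, factor in GATES.items():
    seconds = processing_time(3600, factor)
    print(f"{label}: one hour of audio in about {seconds:.0f} s")
```

Under that assumption, the C1 figure corresponds to roughly 79 seconds per hour of audio, C2 to roughly 31 seconds, and C3 to roughly 18 minutes.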

What it is

A Lattice-layer library that lifts speech-to-text from per-consumer ad-hoc plumbing into a single pluggable surface. Consumers (lattice-meetbot for meeting transcription, lattice-dictate for push-to-talk dictation, WhatsApp voice-message decryption, future ambient capture) all transcribe through the same Transcriber interface.
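As a rough illustration of what "a single pluggable surface" can look like, here is a minimal sketch of a transcriber that delegates to interchangeable engines. Every name in it (Segment, Engine, EchoEngine, the Transcriber constructor and method signature) is hypothetical and chosen for the example; it is not the actual lattice-asr API.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Segment:
    """One transcribed span of audio (hypothetical shape)."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None


class Engine(Protocol):
    """Anything that can turn audio bytes into segments."""
    name: str

    def transcribe(self, audio: bytes, language: str) -> list[Segment]: ...


class EchoEngine:
    """Stand-in engine for the sketch; a real engine would run a model."""
    name = "echo"

    def transcribe(self, audio: bytes, language: str) -> list[Segment]:
        # Assumes 16 kHz, 16-bit mono PCM: 32000 bytes per second.
        duration = len(audio) / 32000
        return [Segment(0.0, duration, f"<{len(audio)} bytes, {language}>")]


@dataclass
class Transcriber:
    """Single entry point that every consumer calls."""
    engine: Engine

    def transcribe(self, audio: bytes, language: str = "en") -> list[Segment]:
        return self.engine.transcribe(audio, language)


# A dictation tool and a meeting bot would share the same call.
transcriber = Transcriber(EchoEngine())
segments = transcriber.transcribe(b"\x00" * 32000)
print(segments[0].end, segments[0].text)
```

The point of the design is that consumers depend only on the transcriber's call signature, so swapping the engine does not touch consumer code.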

What it solves

  • Hardware adaptivity: picks Parakeet-TDT (NeMo) on NVIDIA, parakeet-mlx on Apple Silicon, and Distil-Whisper via faster-whisper on CPU-only hosts (arm64 or x86_64; the non-GPU fallback path). One API across all paths.
  • Multilingual: dual-engine load. The English route uses Parakeet (fastest, English-only); the non-English route uses Whisper-large-v3-turbo or Distil-Whisper. Routing is decided by Silero LID on the first 1.5 s of audio.
  • Optional diarization: transcribe(..., diarize=True) returns segments with speaker labels (pyannote.audio on CPU/GPU, NVIDIA Sortformer on GPU).
  • Streaming: VAD-segment-bounded partial results for ambient/meeting use.
  • Remote engine: the RemoteEngine adapter forwards transcription to a network endpoint speaking the lattice-asr wire protocol, which lets a CPU-only laptop offload to a GPU host.
  • Telemetry-injected: every call records duration, engine, language, and audio length to a consumer-supplied sink. No hidden coupling.
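The routing and telemetry behaviour described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the Hardware dataclass, function names, and telemetry field names are hypothetical, language detection and the engine itself are passed in as stubs, and the rule for choosing between Whisper-large-v3-turbo and Distil-Whisper on the non-English route is a guess, since the README does not say how that choice is made.

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Hardware:
    """Hypothetical result of a hardware probe."""
    has_nvidia_gpu: bool = False
    is_apple_silicon: bool = False


def lid_window(samples: Sequence[float], sample_rate: int,
               seconds: float = 1.5) -> Sequence[float]:
    """Return the leading slice of audio that language ID would inspect."""
    return samples[: int(seconds * sample_rate)]


def pick_engine(hw: Hardware, language: str) -> str:
    """Map hardware and detected language to an engine name."""
    if language == "en":
        if hw.has_nvidia_gpu:
            return "parakeet-tdt"
        if hw.is_apple_silicon:
            return "parakeet-mlx"
        return "distil-whisper"  # CPU-only fallback
    # Assumption: the larger Whisper model only when an NVIDIA GPU is present.
    return "whisper-large-v3-turbo" if hw.has_nvidia_gpu else "distil-whisper"


def transcribe(samples: Sequence[float], sample_rate: int, hw: Hardware,
               detect_language: Callable[[Sequence[float]], str],
               run_engine: Callable[[str, Sequence[float]], str],
               sink: Callable[[dict], None]) -> str:
    """Detect language, route to an engine, and report telemetry to the sink."""
    language = detect_language(lid_window(samples, sample_rate))
    engine = pick_engine(hw, language)
    started = time.perf_counter()
    text = run_engine(engine, samples)
    sink({
        "engine": engine,
        "language": language,
        "audio_seconds": len(samples) / sample_rate,
        "duration_seconds": time.perf_counter() - started,
    })
    return text


events: list[dict] = []
text = transcribe(
    samples=[0.0] * 32000,
    sample_rate=16000,
    hw=Hardware(is_apple_silicon=True),
    detect_language=lambda window: "en",  # stub for a real LID model
    run_engine=lambda engine, samples: f"[{engine}] transcript",  # stub engine
    sink=events.append,
)
print(text, events[0]["audio_seconds"])
```

Passing the sink in as a plain callable keeps the telemetry destination entirely in the consumer's hands, which is one way to read "no hidden coupling".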

Non-goals (v0.1)

Speaker identification (deferred to a future lattice-voiceprint), on-device fine-tuning, sub-100 ms streaming partials, audio enhancement, translation, browser/mobile bindings.

Status

| Phase | State |
| --- | --- |
| Spec | Locked; vault 02_Projects/Lattice/lattice-asr/Specifications/2026-04-27 lattice-asr v1 - Design Spec.md |
| Plan | Ratified and in execution; vault 02_Projects/Lattice/lattice-asr/Plans/2026-05-08 lattice-asr v0.1 - Implementation Plan.md |
| W1 Foundation | Landed (hardware probe, types, telemetry, config) |
| W2 Whisper + LID + Transcriber MVP | Landed |
| W3.1 ParakeetMlxEngine (Apple Silicon EN, MLX runtime) | Landed; verified Switch RTF 45.43× via wrapper |
| W3.2 ParakeetTdtEngine (NVIDIA EN, NeMo runtime) | Landed; verified Cypher RTX 2070 RTF 115.12× via wrapper |
| W3-future WhisperCppEngine (Apple Silicon multilingual) | Deferred past v0.1 |
| W4 RemoteEngine + lattice-asr-server | Landed |
| W5 Diarization | Landed (pyannote + sortformer adapters + Transcriber wire-in; real-model exercise gated on HF_TOKEN / NeMo) |
| W6 Ship (perf, CI, release) | Partial: W6.1 perf-gate skeleton landed and all three C1/C2/C3 gates pass via wrappers on canonical hosts; W6.2 CI workflow + W6.3 README rewrite + v0.1.0 tag pending. Canonical hosts: C1+C3 on Switch (Apple M4 Pro Mac mini), C2 on Cypher (Linux + RTX 2070). |
| First consumer | lattice-dictate (planned) |
| Second consumer | lattice-meetbot (refactored transcription path) |

License

Apache-2.0 (Lattice default). See LICENSE.

Family

Part of the Lattice family. Sibling libraries: lattice-meetbot, lattice-meeting, lattice-watch, lattice-dictate (planned), lattice-recall (planned).

Download files

Source distribution: lattice_asr-0.1.0.tar.gz (823.8 kB)

  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11
  • SHA256: 24c07408148155967a0717faf9bc0108e6b9c1f5f74275b38b72e57398517a00
  • MD5: 67a68a88a5d3de2a9c4f9e81277eb89e
  • BLAKE2b-256: 82fa022219a693ec2732f3ffdd67ff85c53eba78b44b075d9347ba5e64176221

Built distribution: lattice_asr-0.1.0-py3-none-any.whl (32.1 kB, Python 3)

  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11
  • SHA256: cad44e5b54c753b88492d53c3aa2430ca48ffbd4252b4ca9ccbb21e7ec965e0e
  • MD5: 43ecc2a841d4d6ff6531caecd9e5bd73
  • BLAKE2b-256: 2ad687118018c3dffe3c1373388c457ef3f907f4f9a763e3a478f200e182bf82