Local inference server for RIFT Transcription — streaming and batch speech recognition, LLM transforms, and CLI transcription backed by local models.

rift-local

Local inference server for RIFT Transcription. Serves streaming speech recognition over WebSocket, backed by local models with automatic download.

Install

pip install rift-local

Backend extras

rift-local supports multiple ASR backends, each installed as an optional extra:

# Quote the extras so shells such as zsh do not treat the brackets as a glob pattern
pip install "rift-local[sherpa]"      # sherpa-onnx (Nemotron, Kroko)
pip install "rift-local[moonshine]"   # Moonshine Gen 2 (via moonshine-voice)
pip install "rift-local[sherpa,moonshine]"  # both

On Apple Silicon, add MLX support for future GPU-accelerated batch transcription:

pip install "rift-local[mlx]"

For development (includes pytest):

pip install "rift-local[dev]"

Models

List all available models and see which are installed:

rift-local list
rift-local list --installed

sherpa-onnx models

Model               Params  Languages  Download  Notes
nemotron-en         0.6B    EN         447 MB    Best accuracy.
zipformer-en-kroko  ~30M    EN         55 MB     Lightweight, fast. Only ~68 MB on disk.

Requires: pip install "rift-local[sherpa]"

Moonshine models

Model                Params  Languages  Size    Notes
moonshine-en-tiny    34M     EN         26 MB   Fastest. Good for low-resource.
moonshine-en-small   123M    EN         95 MB   Balanced speed/accuracy.
moonshine-en-medium  245M    EN         190 MB  Default. Best Moonshine accuracy.

Requires: pip install "rift-local[moonshine]"

Moonshine models are downloaded automatically by the moonshine-voice library on first use.

Usage

Server mode (for RIFT app)

Start the WebSocket server with any model:

# Start server and open RIFT Transcription in your browser
rift-local serve --open

# Moonshine (default model)
rift-local serve

# sherpa-onnx
rift-local serve --model nemotron-en

# Custom host/port
rift-local serve --model moonshine-en-tiny --host 0.0.0.0 --port 8080

The --open flag launches RIFT Transcription in your browser, pre-configured to connect to the local server. The voice source is set to "Local" automatically — just click to start the mic.

For local development of the RIFT Transcription client:

rift-local serve --open dev          # opens http://localhost:5173
rift-local serve --open dev:3000     # custom port

The server auto-downloads the model on first run, then listens on:

  • WebSocket: ws://127.0.0.1:2177/ws (streaming ASR)
  • HTTP: http://127.0.0.1:2177/info (model metadata)
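
To check that a server is running and see which model it loaded, the /info endpoint can be queried from Python. This is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which is not confirmed here, and the helper names info_url and fetch_info are illustrative.

```python
import json
from urllib.request import urlopen


def info_url(host="127.0.0.1", port=2177):
    """Build the metadata URL from the server's bind address and port."""
    return f"http://{host}:{port}/info"


def fetch_info(host="127.0.0.1", port=2177, timeout=5.0):
    """Fetch model metadata; assumes the response body is JSON."""
    with urlopen(info_url(host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running on the defaults, print(fetch_info()) should show the metadata. A connection error usually means the server has not finished starting.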

Server options

Flag       Default              Description
--model    moonshine-en-medium  Model name from registry
--host     127.0.0.1            Bind address
--port     2177                 Server port
--threads  2                    Inference threads
--open     off                  Open browser to RIFT Transcription client
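
When another program needs to start the server, the flags above map directly onto a subprocess call. The sketch below is one possible wrapper, not part of rift-local. The names serve_command, wait_for_port, and start_server are hypothetical, and the 60-second startup timeout is an arbitrary allowance for a first-run model download.

```python
import socket
import subprocess
import time


def serve_command(model="moonshine-en-medium", host="127.0.0.1", port=2177, threads=2):
    """Build the argv for `rift-local serve` using the documented defaults."""
    return [
        "rift-local", "serve",
        "--model", model,
        "--host", host,
        "--port", str(port),
        "--threads", str(threads),
    ]


def wait_for_port(host, port, timeout=60.0):
    """Poll until a TCP connection succeeds; return False if the timeout passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False


def start_server(**options):
    """Launch the server and return the process once its port accepts connections."""
    cmd = serve_command(**options)
    proc = subprocess.Popen(cmd)
    host, port = cmd[cmd.index("--host") + 1], int(cmd[cmd.index("--port") + 1])
    if not wait_for_port(host, port):
        proc.terminate()
        raise RuntimeError("server did not start listening in time")
    return proc
```

For example, start_server(model="nemotron-en") would run the sherpa-onnx model on the default address. Call proc.terminate() on the returned process to stop it.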

WebSocket protocol

  1. Client connects to /ws
  2. Server sends info JSON (model name, features, sample rate)
  3. Client sends binary frames of Float32 PCM audio at 16 kHz
  4. Server sends result JSON messages with partial/final transcriptions
  5. Client sends text "Done" to end the session
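
The five steps above can be sketched as a small Python client. This is a minimal sketch under several assumptions, not the project's own client. It assumes the third-party websockets package, mono samples given as floats in [-1.0, 1.0], little-endian Float32 frames, and a server that closes the connection after it receives "Done". The 1600-sample frame length (100 ms of audio) and the names pcm_frames and transcribe are illustrative. Messages are printed as parsed JSON because their field names are not listed in this document.

```python
import array
import asyncio
import json
import sys

SAMPLE_RATE = 16_000  # Hz, as stated in the protocol description


def pcm_frames(samples, frame_len=1600):
    """Yield binary frames of 32-bit float PCM, frame_len samples per frame."""
    for start in range(0, len(samples), frame_len):
        chunk = array.array("f", samples[start:start + frame_len])
        if sys.byteorder == "big":
            chunk.byteswap()  # assumption: the server expects little-endian floats
        yield chunk.tobytes()


async def transcribe(samples, url="ws://127.0.0.1:2177/ws"):
    """Stream samples to the server and print every message it sends back."""
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as ws:       # step 1: connect to /ws
        info = json.loads(await ws.recv())          # step 2: info JSON
        print("info:", info)

        async def send_audio():
            for frame in pcm_frames(samples):       # step 3: binary audio frames
                await ws.send(frame)
            await ws.send("Done")                   # step 5: end the session

        sender = asyncio.create_task(send_audio())
        async for message in ws:                    # step 4: result JSON messages
            print("result:", json.loads(message))
        await sender
```

To try it against a running server, call asyncio.run(transcribe(samples)) with a list of 16 kHz samples. Sending and receiving run concurrently so that partial results can arrive while audio is still being streamed.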

Running tests

# Install dev + backend dependencies
pip install -e ".[dev,sherpa,moonshine]"

# Run fast tests (mocked backends, no model download)
pytest

# Run all tests including slow integration tests (downloads models)
pytest --slow

Tests are in the tests/ directory:

  • test_server.py — WebSocket server tests using a mock backend
  • test_moonshine.py — Moonshine adapter unit tests (mocked) + integration tests (slow)
  • conftest.py — Shared MockBackend fixture and --slow flag
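
For readers unfamiliar with opt-in test flags, the sketch below shows one common way a conftest.py can provide a --slow option in pytest. It illustrates the general pattern only, and the project's actual conftest.py may differ. The marker name slow and the skip message are assumptions.

```python
def pytest_addoption(parser):
    """Register the --slow command-line flag."""
    parser.addoption(
        "--slow",
        action="store_true",
        default=False,
        help="also run slow integration tests",
    )


def pytest_configure(config):
    """Declare the marker so pytest does not warn about an unknown mark."""
    config.addinivalue_line("markers", "slow: integration test that downloads models")


def pytest_collection_modifyitems(config, items):
    """Skip tests marked slow unless --slow was passed."""
    if config.getoption("--slow"):
        return  # run everything
    import pytest  # only needed when tests have to be skipped

    skip_slow = pytest.mark.skip(reason="needs --slow to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

Under this pattern, an integration test decorated with @pytest.mark.slow is skipped by a plain pytest run and runs only with pytest --slow.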

Spec

See specs/rift-local.md for the full design document.

License

MIT
