Local inference server for RIFT Transcription — streaming and batch speech recognition, LLM transforms, and CLI transcription backed by local models.

rift-local

Local inference server for RIFT Transcription. Serves streaming speech recognition over WebSocket, backed by local models with automatic download.

Install

pip install rift-local

Backend extras

rift-local supports multiple ASR backends, each installed as an optional extra:

pip install rift-local[sherpa]      # sherpa-onnx (Nemotron, Kroko)
pip install rift-local[moonshine]   # Moonshine Gen 2 (via moonshine-voice)
pip install rift-local[sherpa,moonshine]  # both

On Apple Silicon, add the MLX extra in preparation for planned GPU-accelerated batch transcription:

pip install rift-local[mlx]

For development (includes pytest):

pip install rift-local[dev]

Models

List all available models and see which are installed:

rift-local list
rift-local list --installed

sherpa-onnx models

Model               Params  Languages  Download  Notes
nemotron-en         0.6B    EN         447 MB    Best accuracy.
zipformer-en-kroko  ~30M    EN         55 MB     Lightweight, fast. Only ~68 MB on disk.

Requires: pip install rift-local[sherpa]

Moonshine models

Model                Params  Languages  Size    Notes
moonshine-en-tiny    34M     EN         26 MB   Fastest. Good for low-resource.
moonshine-en-small   123M    EN         95 MB   Balanced speed/accuracy.
moonshine-en-medium  245M    EN         190 MB  Default. Best Moonshine accuracy.

Requires: pip install rift-local[moonshine]

Moonshine models are downloaded automatically by the moonshine-voice library on first use.

Usage

Server mode (for RIFT app)

Start the WebSocket server with any model:

# Start server and open RIFT Transcription in your browser
rift-local serve --open

# Moonshine (default model)
rift-local serve

# sherpa-onnx
rift-local serve --model nemotron-en

# Custom host/port
rift-local serve --model moonshine-en-tiny --host 0.0.0.0 --port 8080

The --open flag launches RIFT Transcription in your browser, pre-configured to connect to the local server. The voice source is set to "Local" automatically — just click to start the mic.

For local development of the RIFT Transcription client:

rift-local serve --open dev          # opens http://localhost:5173
rift-local serve --open dev:3000     # custom port

The server auto-downloads the model on first run, then listens on:

  • WebSocket: ws://127.0.0.1:2177/ws (streaming ASR)
  • HTTP: http://127.0.0.1:2177/info (model metadata)
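The /info endpoint can be queried with the standard library alone; a minimal sketch (the response schema is an assumption here, beyond it being JSON model metadata):

```python
import json
import urllib.request


def fetch_info(base_url: str = "http://127.0.0.1:2177") -> dict:
    """Fetch model metadata from a running rift-local server as a dict."""
    with urllib.request.urlopen(f"{base_url}/info") as resp:
        return json.load(resp)
```

Calling `fetch_info()` while the server is up is a quick way to confirm which model is loaded before connecting a streaming client.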

Server options

Flag       Default              Description
--model    moonshine-en-medium  Model name from registry
--host     127.0.0.1            Bind address
--port     2177                 Server port
--threads  2                    Inference threads
--open     off                  Open browser to RIFT Transcription client

WebSocket protocol

  1. Client connects to /ws
  2. Server sends info JSON (model name, features, sample rate)
  3. Client sends binary frames of Float32 PCM audio at 16 kHz
  4. Server sends result JSON messages with partial/final transcriptions
  5. Client sends text "Done" to end the session

Running tests

# Install dev + backend dependencies
pip install -e ".[dev,sherpa,moonshine]"

# Run fast tests (mocked backends, no model download)
pytest

# Run all tests including slow integration tests (downloads models)
pytest --slow

Tests are in the tests/ directory:

  • test_server.py — WebSocket server tests using a mock backend
  • test_moonshine.py — Moonshine adapter unit tests (mocked) + integration tests (slow)
  • conftest.py — Shared MockBackend fixture and --slow flag
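The --slow flag is not built into pytest; it is added by conftest.py. A typical wiring looks like the following sketch (illustrative, not the repository's actual code), where tests marked `slow` are skipped unless the flag is passed:

```python
# conftest.py sketch: register a --slow option and skip slow-marked tests by default.
import pytest


def pytest_addoption(parser):
    parser.addoption("--slow", action="store_true", default=False,
                     help="also run slow integration tests")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--slow"):
        return  # flag given: run everything, including slow tests
    skip_slow = pytest.mark.skip(reason="needs --slow to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```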

Spec

See specs/rift-local.md for the full design document.

License

MIT
