Local inference server for RIFT Transcription — streaming and batch speech recognition, LLM transforms, and CLI transcription backed by local models.
rift-local
Local inference server for RIFT Transcription. Serves streaming speech recognition over WebSocket, backed by local models with automatic download.
Install
pip install rift-local
Backend extras
rift-local supports multiple ASR backends, each installed as an optional extra:
pip install rift-local[sherpa] # sherpa-onnx (Nemotron, Kroko)
pip install rift-local[moonshine] # Moonshine Gen 2 (via moonshine-voice)
pip install rift-local[sherpa,moonshine] # both
On Apple Silicon, add MLX support for future GPU-accelerated batch transcription:
pip install rift-local[mlx]
For development (includes pytest):
pip install rift-local[dev]
Models
List all available models and see which are installed:
rift-local list
rift-local list --installed
sherpa-onnx models
| Model | Params | Languages | Download | Notes |
|---|---|---|---|---|
| nemotron-en | 0.6B | EN | 447 MB | Best accuracy. |
| zipformer-en-kroko | ~30M | EN | 55 MB | Lightweight, fast. Only ~68 MB on disk. |
Requires: pip install rift-local[sherpa]
Moonshine models
| Model | Params | Languages | Size | Notes |
|---|---|---|---|---|
| moonshine-en-tiny | 34M | EN | 26 MB | Fastest. Good for low-resource. |
| moonshine-en-small | 123M | EN | 95 MB | Balanced speed/accuracy. |
| moonshine-en-medium | 245M | EN | 190 MB | Default. Best Moonshine accuracy. |
Requires: pip install rift-local[moonshine]
Moonshine models are downloaded automatically by the moonshine-voice library on first use.
Usage
Server mode (for RIFT app)
Start the WebSocket server with any model:
# Start server and open RIFT Transcription in your browser
rift-local serve --open
# Moonshine (default model)
rift-local serve
# sherpa-onnx
rift-local serve --model nemotron-en
# Custom host/port
rift-local serve --model moonshine-en-tiny --host 0.0.0.0 --port 8080
The --open flag launches RIFT Transcription in your browser, pre-configured to connect to the local server. The voice source is set to "Local" automatically — just click to start the mic.
For local development of the RIFT Transcription client:
rift-local serve --open dev # opens http://localhost:5173
rift-local serve --open dev:3000 # custom port
The server auto-downloads the model on first run, then listens on:
- WebSocket: ws://127.0.0.1:2177/ws (streaming ASR)
- HTTP: http://127.0.0.1:2177/info (model metadata)
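To check that the server is up and see which model it loaded, you can query the HTTP endpoint from Python. The following is a minimal sketch using only the standard library. It assumes that /info returns a JSON body, which the list above does not state explicitly; the helper names info_url and fetch_info are hypothetical and not part of rift-local.

```python
import json
import urllib.request


def info_url(host="127.0.0.1", port=2177):
    """Build the /info URL from the server's bind address and port."""
    return f"http://{host}:{port}/info"


def fetch_info(host="127.0.0.1", port=2177, timeout=5.0):
    """Fetch model metadata from a running rift-local server.

    Assumption: the response body is JSON. Raises urllib.error.URLError
    if the server is not reachable.
    """
    with urllib.request.urlopen(info_url(host, port), timeout=timeout) as resp:
        return json.load(resp)
```

With a server running on the defaults, `print(fetch_info())` should show the metadata. If you started the server with a custom --host or --port, pass the same values to fetch_info.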
Server options
| Flag | Default | Description |
|---|---|---|
| --model | moonshine-en-medium | Model name from registry |
| --host | 127.0.0.1 | Bind address |
| --port | 2177 | Server port |
| --threads | 2 | Inference threads |
| --open | off | Open browser to RIFT Transcription client |
WebSocket protocol
- Client connects to /ws
- Server sends info JSON (model name, features, sample rate)
- Client sends binary frames of Float32 PCM audio at 16 kHz
- Server sends result JSON messages with partial/final transcriptions
- Client sends text "Done" to end the session
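The exchange above can be sketched as a small Python client. This is an illustration under several assumptions, not the reference client: it uses the third-party websockets package (pip install websockets), sends little-endian Float32 samples in 100 ms frames (1600 samples at 16 kHz), and assumes the server closes the connection after it receives "Done". None of these details are stated in the protocol description, and the JSON fields of the info and result messages are not documented here, so the sketch returns the parsed messages without interpreting them. The names float32_frames and transcribe are hypothetical.

```python
import array
import asyncio
import json
import sys


def float32_frames(samples, frame_len=1600):
    """Yield binary frames of Float32 PCM from a sequence of floats in [-1, 1].

    Assumption: little-endian byte order; 1600 samples is 100 ms at 16 kHz.
    """
    for start in range(0, len(samples), frame_len):
        chunk = array.array("f", samples[start:start + frame_len])
        if sys.byteorder == "big":
            chunk.byteswap()  # normalise to little-endian on big-endian hosts
        yield chunk.tobytes()


async def transcribe(samples, url="ws://127.0.0.1:2177/ws"):
    """Stream audio to the server and collect every JSON message it sends."""
    import websockets  # third-party; imported here so the helper above has no dependencies

    messages = []
    async with websockets.connect(url) as ws:
        # The first message from the server is the info JSON.
        messages.append(json.loads(await ws.recv()))

        async def send_audio():
            for frame in float32_frames(samples):
                await ws.send(frame)
            await ws.send("Done")  # text message that ends the session

        async def receive_results():
            # Assumption: the server closes the connection after "Done",
            # which ends this loop.
            async for raw in ws:
                messages.append(json.loads(raw))

        # Send and receive concurrently so partial results are read as they arrive.
        await asyncio.gather(send_audio(), receive_results())
    return messages
```

To try it against a running server, call `asyncio.run(transcribe(samples))` with a list of 16 kHz mono samples. One second of silence, `[0.0] * 16000`, is enough to exercise the handshake.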
Running tests
# Install dev + backend dependencies
pip install -e ".[dev,sherpa,moonshine]"
# Run fast tests (mocked backends, no model download)
pytest
# Run all tests including slow integration tests (downloads models)
pytest --slow
Tests are in the tests/ directory:
- test_server.py: WebSocket server tests using a mock backend
- test_moonshine.py: Moonshine adapter unit tests (mocked) + integration tests (slow)
- conftest.py: Shared MockBackend fixture and --slow flag
Spec
See specs/rift-local.md for the full design document.
License