RevoS
A unified Python library for speech AI — ASR and TTS using open models.
Installation
# Core (ASR support)
pip install revospeech
# With TTS support (RevoVoice — requires PyTorch)
pip install revospeech[tts]
# With GPU support
pip install revospeech[gpu]
# Everything (GPU + TTS)
pip install revospeech[all]
# Or with uv
uv add revospeech
HuggingFace Login (Required for TTS)
Note: The RevoVoice TTS model is hosted on a private HuggingFace repository. You must log in before using TTS.
pip install huggingface-hub
huggingface-cli login
Get your token at https://huggingface.co/settings/tokens
Important Notes
- revospeech[gpu] and revospeech[all] install onnxruntime-gpu, which conflicts with onnxruntime. If you already have revos installed, uninstall it first before installing the GPU variant.
- Audio formats supported: WAV, FLAC, OGG, and any format supported by libsndfile.
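To see whether an environment already contains both ONNX Runtime distributions, a small standard-library check can help. This is a minimal sketch and not part of RevoS: the helper names `installed_onnxruntime_variants` and `has_onnxruntime_conflict` are made up for illustration, and the check only looks at installed package metadata.

```python
from importlib.metadata import PackageNotFoundError, version

# The two ONNX Runtime distributions that should not be installed together.
ORT_PACKAGES = ("onnxruntime", "onnxruntime-gpu")


def installed_onnxruntime_variants() -> dict[str, str]:
    """Return {distribution name: version} for each ONNX Runtime variant found."""
    found = {}
    for name in ORT_PACKAGES:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            pass  # this variant is not installed
    return found


def has_onnxruntime_conflict() -> bool:
    """True when both the CPU and GPU distributions are installed side by side."""
    return len(installed_onnxruntime_variants()) == len(ORT_PACKAGES)


if __name__ == "__main__":
    variants = installed_onnxruntime_variants()
    print("Installed ONNX Runtime variants:", variants or "none")
    if has_onnxruntime_conflict():
        print("Both variants are installed; uninstall one before continuing.")
```

Running it before switching to the GPU extra shows which variant, if any, needs to be removed first.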
Quick Start
ASR (Automatic Speech Recognition)
from revos.asr import ASR
asr = ASR('zipformer-v2')
result = asr.transcribe('meeting.wav')
print(result.text) # Full transcript
print(result.language) # Detected language
for seg in result.segments:
    print(f"[{seg.start:.1f}s - {seg.end:.1f}s] {seg.text}")
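The segment fields used in the loop (start, end, text) are enough to build subtitles by hand. The sketch below shows one way to turn such segments into SRT text. It is an illustration only and does not claim to match what `revos transcribe --srt` produces; the `Segment` dataclass is a hypothetical stand-in for the items of `result.segments`, assuming `start` and `end` are in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Stand-in for one item of result.segments (start/end in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, "Hello everyone."), Segment(2.5, 5.0, "Let's begin.")]
    print(segments_to_srt(demo))
```

Because `segments_to_srt` only reads the `start`, `end`, and `text` attributes, it should also accept the real `result.segments` if they expose those attributes as shown above.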
TTS (Text-to-Speech)
from revos.tts import TTS
# Basic synthesis
tts = TTS('revovoice')
audio = tts.synthesize('Hello, how are you?')
audio.save('greeting.wav')
# Voice cloning (with reference audio)
audio = tts.synthesize(
    'This will sound like the reference speaker.',
    ref_audio='speaker.wav',
    ref_text='Sample of the speaker talking.',
)
audio.save('cloned.wav')
Model Discovery
import revos
# List all models with status
revos.list_models()
# Filter by task, mode, status
revos.list_models(task="asr", status="ready")
revos.list_models(mode="api")
# Fuzzy search
revos.search_models("english fast")
# Check a specific model
status = revos.check_model("zipformer-v2")
print(status.is_ready)
CLI
# Transcribe audio
revos transcribe -m zipformer-v2 audio.wav
# JSON output
revos transcribe -m zipformer-v2 --json audio.wav
# SRT subtitles
revos transcribe -m zipformer-v2 --srt audio.wav
# Synthesize speech
revos synthesize -m revovoice -t "Hello, world!" -o output.wav
# From text file
revos synthesize -m revovoice -f script.txt -o audiobook.wav
# List available models (with status icons)
revos models
revos models --ready # Only ready-to-use models
revos models --mode api # Only API models
revos models --task asr # Filter by task
# Detailed model info
revos models-info zipformer-v2
# Fuzzy search
revos search "english fast"
# Browse remote catalog
revos catalog list
# Pull a model from the catalog
revos catalog pull revovoice
# API key management
revos config set-api-key
# Show environment info
revos info
Available Models
| Model | Task | Backend | Mode | Languages | Access | Description |
|---|---|---|---|---|---|---|
| zipformer-v2 | ASR | sherpa-onnx | local | English | Open | Zipformer small transducer model |
| revovoice | TTS | RevoVoice | local | 600+ | Gated | Zero-shot multilingual TTS with voice cloning |
Model Directory
revos/models/
├── asr/
│   └── zipformer_v2.yaml   # Open: downloads from GitHub releases
└── tts/
    └── revovoice.yaml      # Gated: requires HF login + approval
Gated Model Access
Some models (like revovoice) are hosted on private HuggingFace repositories
and require approval before use.
- Log in to HuggingFace:
  pip install huggingface-hub
  huggingface-cli login
  Get your token at https://huggingface.co/settings/tokens
- Request access: Visit the model's HuggingFace page and submit an access request. The repo owner will review and approve.
- Use the model: Once approved, the model will download automatically on first use:
  from revos.tts import TTS
  tts = TTS('revovoice')  # Will prompt for HF login if not authenticated
For team members adding models: If your model is gated, set hf_private: true in the YAML manifest. This tells RevoS to check HF authentication before downloading.
Configuration
API Keys
For cloud API backends, set your API key:
# Option 1: Environment variable
export REVOLAB_API_KEY=rv-your-key-here
# Option 2: CLI command (saves to ~/.config/revos/config.yaml)
revos config set-api-key
Resolution order: constructor arg > REVOLAB_API_KEY env var > ~/.config/revos/config.yaml
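The resolution order can be pictured as a first-match lookup. The following is a minimal sketch of that precedence, not the code RevoS uses: the function name `resolve_api_key` and the `api_key` field name inside config.yaml are assumptions, and the config file is passed in as an already-loaded mapping to keep the example free of a YAML dependency.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "REVOLAB_API_KEY"


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    config: Optional[Mapping[str, str]] = None,
    config_field: str = "api_key",  # hypothetical field name in config.yaml
) -> Optional[str]:
    """Return the first non-empty key: explicit arg, then env var, then config."""
    candidates = (
        explicit,                         # 1. constructor argument
        env.get(ENV_VAR),                 # 2. REVOLAB_API_KEY environment variable
        (config or {}).get(config_field), # 3. value loaded from the config file
    )
    for value in candidates:
        if value:
            return value
    return None


if __name__ == "__main__":
    # The explicit argument wins even when the other sources are set.
    print(resolve_api_key("rv-explicit", {ENV_VAR: "rv-env"}, {"api_key": "rv-file"}))
```

Passing `env` and `config` as parameters keeps the precedence easy to test without touching the real environment or home directory.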
Catalog Source
Override the catalog source with:
export REVOS_CATALOG_REPO="myorg/revos" # env var
Or in ~/.config/revos/config.yaml:
catalog_repo: "myorg/revos"
Adding Custom Models
Add a YAML manifest to ~/.config/revos/models/:
# ~/.config/revos/models/asr/my-model.yaml
name: my-custom-model
task: asr
mode: local
backend: sherpa-onnx
model_type: transducer
model_url: "https://example.com/models/my-model.tar.bz2"
sample_rate: 16000
language: en
description: "My custom ASR model"
capabilities: ["word-timestamps"]
languages: ["en"]
files:
encoder: "encoder.onnx"
decoder: "decoder.onnx"
joiner: "joiner.onnx"
tokens: "tokens.txt"
Then use it: from revos.asr import ASR; asr = ASR('my-custom-model')
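Before dropping a manifest into ~/.config/revos/models/, it can be useful to sanity-check it. The sketch below works on a manifest that has already been loaded into a dict. Which fields RevoS actually requires is not stated here, so `REQUIRED_FIELDS` and the rule that API manifests need `api_endpoint` are assumptions drawn from the example manifests; `manifest_problems` is a hypothetical helper, not a RevoS function.

```python
# Assumed minimum set of fields, taken from the example manifests.
REQUIRED_FIELDS = ("name", "task", "mode", "backend")
KNOWN_TASKS = {"asr", "tts"}
KNOWN_MODES = {"local", "api"}


def manifest_problems(manifest: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if not manifest.get(field)
    ]
    task = manifest.get("task")
    if task and task not in KNOWN_TASKS:
        problems.append(f"unknown task: {task}")
    mode = manifest.get("mode")
    if mode and mode not in KNOWN_MODES:
        problems.append(f"unknown mode: {mode}")
    # Assumption: API-mode manifests need an endpoint, as in the API example.
    if mode == "api" and not manifest.get("api_endpoint"):
        problems.append("api mode without api_endpoint")
    return problems


if __name__ == "__main__":
    draft = {"name": "my-custom-model", "task": "asr", "mode": "api", "backend": "my-api"}
    for problem in manifest_problems(draft):
        print(problem)
```

A check like this catches typos in `task` or `mode` early, before the model shows up as broken in the model list.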
API Models
# ~/.config/revos/models/asr/my-api-model.yaml
name: my-api-model
task: asr
mode: api
backend: my-api
api_endpoint: "https://api.example.com/v1"
description: "Cloud ASR"
capabilities: ["streaming"]
languages: ["en"]
Pinning Model Versions
For HuggingFace-hosted models, pin to a specific commit hash or tag using the revision field:
revision: "a1b2c3d" # Pin to specific commit hash
# revision: "v1.0.0" # Or use a git tag
Without revision, the latest version from the default branch is used.
Remote Catalog
The catalog fetches available models directly from this repository on GitHub. Team members add YAML manifests to revos/models/ and users discover them without upgrading.
# Browse all available models from the repo
revos catalog list
# Install a model locally
revos catalog pull revovoice
Documentation
- AGENTS.md — Architecture guide for AI agents and contributors
- CONTRIBUTING.md — How to contribute
- TODO.md — Full backlog and roadmap
Project Structure
revos/
├── revos/
│   ├── asr/            # ASR engines (sherpa-onnx)
│   ├── tts/            # TTS engines (RevoVoice)
│   ├── registry/       # Model manifests, registry, downloader, status
│   ├── cli/            # Click CLI
│   ├── config.py       # API key & configuration management
│   ├── exceptions.py   # Custom exception hierarchy
│   ├── catalog.py      # Remote model catalog (GitHub, cached)
│   ├── device.py       # GPU/CPU auto-detection
│   └── models/         # Bundled YAML manifests
├── tests/
├── pyproject.toml
├── AGENTS.md
└── CONTRIBUTING.md
License
MIT