RevoS

Python 3.11+ | License: MIT

A unified Python library for speech AI — ASR and TTS using open models.

Installation

# Core (ASR support)
pip install revospeech

# With TTS support (RevoVoice — requires PyTorch)
pip install revospeech[tts]

# With GPU support
pip install revospeech[gpu]

# Everything (GPU + TTS)
pip install revospeech[all]

# Or with uv
uv add revospeech

HuggingFace Login (Required for TTS)

Note: The RevoVoice TTS model is hosted on a private HuggingFace repository. You must log in before using TTS.

pip install huggingface-hub
huggingface-cli login

Get your token at https://huggingface.co/settings/tokens

Important Notes

  • revospeech[gpu] and revospeech[all] install onnxruntime-gpu, which conflicts with onnxruntime. If RevoS is already installed, uninstall it first before installing the GPU variant.
  • Audio formats supported: WAV, FLAC, OGG, and any format supported by libsndfile.
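
To see whether an environment already holds both onnxruntime builds, a small check along these lines may help. It is a sketch that uses only the standard library; the helper names are illustrative and are not part of RevoS.

```python
from importlib.metadata import distributions


def installed_onnxruntime_variants(names=None):
    """Return the onnxruntime distributions present, sorted by name."""
    if names is None:
        # Collect the names of every installed distribution.
        names = [dist.metadata["Name"] for dist in distributions()]
    # Normalize case and underscores so "onnxruntime_gpu" matches too.
    normalized = {str(n).lower().replace("_", "-") for n in names if n}
    return sorted(normalized & {"onnxruntime", "onnxruntime-gpu"})


def has_conflict(names=None):
    """True when both the CPU and GPU builds are installed side by side."""
    return len(installed_onnxruntime_variants(names)) == 2
```

Calling has_conflict() with no arguments inspects the current environment; passing a list of names is only there to make the helper easy to test.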

Quick Start

ASR (Automatic Speech Recognition)

from revos.asr import ASR

asr = ASR('zipformer-v2')
result = asr.transcribe('meeting.wav')

print(result.text)        # Full transcript
print(result.language)    # Detected language
for seg in result.segments:
    print(f"[{seg.start:.1f}s - {seg.end:.1f}s] {seg.text}")
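
The CLI can emit SRT directly (see --srt below). When working from the Python API, the segment fields shown above (start, end, text) are enough to build subtitles by hand. The following is a minimal sketch: the Segment dataclass is a hypothetical stand-in for the objects in result.segments, and to_srt is not a RevoS function.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one entry of result.segments."""
    start: float  # seconds
    end: float    # seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

Any object exposing start, end, and text attributes should work in place of Segment.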

TTS (Text-to-Speech)

from revos.tts import TTS

# Basic synthesis
tts = TTS('revovoice')
audio = tts.synthesize('Hello, how are you?')
audio.save('greeting.wav')

# Voice cloning (with reference audio)
audio = tts.synthesize(
    'This will sound like the reference speaker.',
    ref_audio='speaker.wav',
    ref_text='Sample of the speaker talking.',
)
audio.save('cloned.wav')

Model Discovery

import revos

# List all models with status
revos.list_models()

# Filter by task, mode, status
revos.list_models(task="asr", status="ready")
revos.list_models(mode="api")

# Fuzzy search
revos.search_models("english fast")

# Check a specific model
status = revos.check_model("zipformer-v2")
print(status.is_ready)
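
The filters combine as a logical AND: a model is listed only if it matches every filter given. The sketch below illustrates that behavior over plain dictionaries. It is not the library's implementation, and the status strings in the sample data are placeholders.

```python
def filter_models(models, task=None, mode=None, status=None):
    """Keep entries matching every filter that was given (None = no filter)."""
    wanted = {"task": task, "mode": mode, "status": status}
    active = {key: value for key, value in wanted.items() if value is not None}
    return [
        model for model in models
        if all(model.get(key) == value for key, value in active.items())
    ]


# Illustrative entries; the "status" values here are placeholders.
MODELS = [
    {"name": "zipformer-v2", "task": "asr", "mode": "local", "status": "ready"},
    {"name": "revovoice", "task": "tts", "mode": "local", "status": "not-downloaded"},
]
```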

CLI

# Transcribe audio
revos transcribe -m zipformer-v2 audio.wav

# JSON output
revos transcribe -m zipformer-v2 --json audio.wav

# SRT subtitles
revos transcribe -m zipformer-v2 --srt audio.wav

# Synthesize speech
revos synthesize -m revovoice -t "Hello, world!" -o output.wav

# From text file
revos synthesize -m revovoice -f script.txt -o audiobook.wav

# List available models (with status icons)
revos models
revos models --ready           # Only ready-to-use models
revos models --mode api        # Only API models
revos models --task asr        # Filter by task

# Detailed model info
revos models-info zipformer-v2

# Fuzzy search
revos search "english fast"

# Browse remote catalog
revos catalog list

# Pull a model from the catalog
revos catalog pull revovoice

# API key management
revos config set-api-key

# Show environment info
revos info

Available Models

Model         Task  Backend      Mode   Languages  Access  Description
zipformer-v2  ASR   sherpa-onnx  local  English    Open    Zipformer small transducer model
revovoice     TTS   RevoVoice    local  600+       Gated   Zero-shot multilingual TTS with voice cloning

Model Directory

revos/models/
├── asr/
│   └── zipformer_v2.yaml    # Open — downloads from GitHub releases
└── tts/
    └── revovoice.yaml       # Gated — requires HF login + approval

Gated Model Access

Some models (like revovoice) are hosted on private HuggingFace repositories and require approval before use.

  1. Log in to HuggingFace:

    pip install huggingface-hub
    huggingface-cli login
    

    Get your token at https://huggingface.co/settings/tokens

  2. Request access: Visit the model's HuggingFace page and submit an access request. The repo owner will review and approve.

  3. Use the model: Once approved, the model will download automatically on first use:

    from revos.tts import TTS
    tts = TTS('revovoice')  # Will prompt for HF login if not authenticated
    

For team members adding models: If your model is gated, set hf_private: true in the YAML manifest. This tells RevoS to check HF authentication before downloading.
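
A pre-download authentication check could look roughly like the sketch below. It assumes the usual Hugging Face conventions (the HF_TOKEN environment variable, and a token file under HF_HOME, which defaults to ~/.cache/huggingface). How RevoS performs this check internally is not documented here, and the function names are illustrative.

```python
import os
from pathlib import Path


def hf_token_path(env=None) -> Path:
    """Locate the default token file written by `huggingface-cli login`."""
    env = os.environ if env is None else env
    # HF_HOME overrides the default cache root of ~/.cache/huggingface.
    home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    return Path(home) / "token"


def has_hf_credentials(env=None) -> bool:
    """True if a token is available via HF_TOKEN or the saved token file."""
    env = os.environ if env is None else env
    if env.get("HF_TOKEN", "").strip():
        return True
    path = hf_token_path(env)
    return path.is_file() and bool(path.read_text().strip())
```

A token being present does not mean access to a gated repository has been approved; that is only known once the download is attempted.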

Configuration

API Keys

For cloud API backends, set your API key:

# Option 1: Environment variable
export REVOLAB_API_KEY=rv-your-key-here

# Option 2: CLI command (saves to ~/.config/revos/config.yaml)
revos config set-api-key

Resolution order: constructor arg > REVOLAB_API_KEY env var > ~/.config/revos/config.yaml
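
The resolution order can be sketched as a first-match lookup. This is an illustration of the precedence only: the function is hypothetical, and the api_key config key name is an assumption, since the layout of config.yaml is not shown here.

```python
import os


def resolve_api_key(explicit=None, env=None, config=None):
    """Return the first key found, following the documented precedence."""
    env = os.environ if env is None else env
    config = {} if config is None else config
    candidates = (
        explicit,                     # 1. constructor argument
        env.get("REVOLAB_API_KEY"),   # 2. environment variable
        config.get("api_key"),        # 3. config file (key name assumed)
    )
    for value in candidates:
        if value:  # skip None and empty strings
            return value
    return None
```

Passing env and config as plain mappings keeps the lookup easy to test without touching the real environment or the file on disk.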

Catalog Source

Override the catalog source with:

export REVOS_CATALOG_REPO="myorg/revos"    # env var

Or in ~/.config/revos/config.yaml:

catalog_repo: "myorg/revos"

Adding Custom Models

Add a YAML manifest under ~/.config/revos/models/, in a subdirectory named after the task (asr/ in the example below):

# ~/.config/revos/models/asr/my-model.yaml
name: my-custom-model
task: asr
mode: local
backend: sherpa-onnx
model_type: transducer
model_url: "https://example.com/models/my-model.tar.bz2"
sample_rate: 16000
language: en
description: "My custom ASR model"
capabilities: ["word-timestamps"]
languages: ["en"]
files:
  encoder: "encoder.onnx"
  decoder: "decoder.onnx"
  joiner: "joiner.onnx"
  tokens: "tokens.txt"

Then use it: from revos.asr import ASR; asr = ASR('my-custom-model')
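
Before dropping a manifest into place, it can be worth sanity-checking the fields. The sketch below works on an already-parsed manifest (a plain dict) and reports obvious problems. The set of required fields and the api_endpoint rule are assumptions drawn from the examples in this section, not the library's actual schema.

```python
REQUIRED_FIELDS = ("name", "task", "mode", "backend")  # assumed minimum
KNOWN_TASKS = {"asr", "tts"}
KNOWN_MODES = {"local", "api"}


def manifest_problems(manifest: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if not manifest.get(field)
    ]
    if manifest.get("task") and manifest["task"] not in KNOWN_TASKS:
        problems.append(f"unknown task: {manifest['task']}")
    if manifest.get("mode") and manifest["mode"] not in KNOWN_MODES:
        problems.append(f"unknown mode: {manifest['mode']}")
    # Assumption: API-mode manifests need an endpoint to call.
    if manifest.get("mode") == "api" and not manifest.get("api_endpoint"):
        problems.append("api model without api_endpoint")
    return problems
```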

API Models

# ~/.config/revos/models/asr/my-api-model.yaml
name: my-api-model
task: asr
mode: api
backend: my-api
api_endpoint: "https://api.example.com/v1"
description: "Cloud ASR"
capabilities: ["streaming"]
languages: ["en"]

Pinning Model Versions

For HuggingFace-hosted models, pin to a specific commit hash or tag using the revision field:

revision: "a1b2c3d"       # Pin to specific commit hash
# revision: "v1.0.0"      # Or use a git tag

Without revision, the latest version from the default branch is used.
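
For reference, pinning maps naturally onto the revision parameter of huggingface_hub.snapshot_download. The sketch below shows that mapping under the assumption that a manifest's revision is passed through unchanged. Whether RevoS calls snapshot_download internally is not confirmed here, and both helper names are illustrative.

```python
def snapshot_kwargs(repo_id, revision=None):
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    kwargs = {"repo_id": repo_id}
    if revision:
        # A commit hash or tag; when omitted, the default branch is used.
        kwargs["revision"] = revision
    return kwargs


def fetch_snapshot(repo_id, revision=None):
    """Download the repo at the pinned revision and return the local path."""
    # Imported lazily so the helper above works without huggingface-hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(**snapshot_kwargs(repo_id, revision))
```

Pinning to a full commit hash gives the most reproducible result, since tags and branches can be moved by the repo owner.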

Remote Catalog

The catalog fetches available models directly from this repository on GitHub. Team members add YAML manifests to revos/models/ and users discover them without upgrading.

# Browse all available models from the repo
revos catalog list

# Install a model locally
revos catalog pull revovoice

Project Structure

revos/
├── revos/
│   ├── asr/           # ASR engines (sherpa-onnx)
│   ├── tts/           # TTS engines (RevoVoice)
│   ├── registry/      # Model manifests, registry, downloader, status
│   ├── cli/           # Click CLI
│   ├── config.py      # API key & configuration management
│   ├── exceptions.py  # Custom exception hierarchy
│   ├── catalog.py     # Remote model catalog (GitHub, cached)
│   ├── device.py      # GPU/CPU auto-detection
│   └── models/        # Bundled YAML manifests
├── tests/
├── pyproject.toml
├── AGENTS.md
└── CONTRIBUTING.md

License

MIT
