RevoS

A unified Python library for speech AI — ASR and TTS using open models.

Installation

# Core (ASR support)
pip install revospeech

# With TTS support (RevoVoice — requires PyTorch)
pip install revospeech[tts]

# With GPU support
pip install revospeech[gpu]

# Everything (GPU + TTS)
pip install revospeech[all]

# Or with uv
uv add revospeech

HuggingFace Login (Required for TTS)

Note: The RevoVoice TTS model is hosted on a private HuggingFace repository. You must log in before using TTS.

pip install huggingface-hub
huggingface-cli login

Get your token at https://huggingface.co/settings/tokens

Important Notes

revospeech[gpu] and revospeech[all] install onnxruntime-gpu, which conflicts with onnxruntime. If you already have revospeech installed, uninstall it before installing the GPU variant.
Audio formats supported: WAV, FLAC, OGG, and any format supported by libsndfile.
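
Because onnxruntime and onnxruntime-gpu conflict, it can help to check which of the two is present before switching variants. The snippet below is a minimal sketch using only the standard library; the helper name `installed_onnxruntime_variants` is hypothetical and not part of revospeech.

```python
from importlib import metadata


def installed_onnxruntime_variants() -> list[str]:
    """Return which onnxruntime distributions are installed in this environment."""
    found = []
    for dist_name in ("onnxruntime", "onnxruntime-gpu"):
        try:
            metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            continue  # this variant is not installed
        found.append(dist_name)
    return found


variants = installed_onnxruntime_variants()
if len(variants) > 1:
    print("Both CPU and GPU builds are installed; remove one:", variants)
else:
    print("onnxruntime variants found:", variants)
```

If both names are reported, uninstalling before reinstalling the variant you want is the safer path.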

Quick Start

ASR (Automatic Speech Recognition)

from revospeech.asr import ASR

asr = ASR('zipformer-v2')
result = asr.transcribe('meeting.wav')

print(result.text)        # Full transcript
print(result.language)    # Detected language
for seg in result.segments:
    print(f"[{seg.start:.1f}s - {seg.end:.1f}s] {seg.text}")
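
The segments printed above carry a start time, an end time, and text, which is enough to build subtitles by hand. The following is a sketch only: the `Segment` dataclass is a stand-in for the objects in `result.segments`, and `srt_timestamp` and `segments_to_srt` are hypothetical helpers, not revospeech functions. It is not guaranteed to match the output of the CLI's `--srt` flag.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Stand-in for one entry of result.segments (start/end in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{timing}\n{seg.text.strip()}")
    return "\n\n".join(cues) + "\n"


demo = [Segment(0.0, 2.5, "Hello everyone."), Segment(2.5, 61.25, "Let's get started.")]
print(segments_to_srt(demo))
```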

TTS (Text-to-Speech)

from revospeech.tts import TTS

# Basic synthesis
tts = TTS('revovoice')
audio = tts.synthesize('Hello, how are you?')
audio.save('greeting.wav')

# Voice cloning (with reference audio)
audio = tts.synthesize(
    'This will sound like the reference speaker.',
    ref_audio='speaker.wav',
    ref_text='Sample of the speaker talking.',
)
audio.save('cloned.wav')

Model Discovery

import revospeech

# List all models with status
revospeech.list_models()

# Filter by task, mode, status
revospeech.list_models(task="asr", status="ready")
revospeech.list_models(mode="api")

# Fuzzy search
revospeech.search_models("english fast")

# Check a specific model
status = revospeech.check_model("zipformer-v2")
print(status.is_ready)

CLI

# Transcribe audio
revospeech transcribe -m zipformer-v2 audio.wav

# JSON output
revospeech transcribe -m zipformer-v2 --json audio.wav

# SRT subtitles
revospeech transcribe -m zipformer-v2 --srt audio.wav

# Synthesize speech
revospeech synthesize -m revovoice -t "Hello, world!" -o output.wav

# From text file
revospeech synthesize -m revovoice -f script.txt -o audiobook.wav

# List available models (with status icons)
revospeech models
revospeech models --ready           # Only ready-to-use models
revospeech models --mode api        # Only API models
revospeech models --task asr        # Filter by task

# Detailed model info
revospeech models-info zipformer-v2

# Fuzzy search
revospeech search "english fast"

# Browse remote catalog
revospeech catalog list

# Pull a model from the catalog
revospeech catalog pull revovoice

# API key management
revospeech config set-api-key

# Show environment info
revospeech info

Available Models

Model	Task	Backend	Mode	Languages	Access	Description
`zipformer-v2`	ASR	sherpa-onnx	local	English	Open	Zipformer small transducer model
`revovoice`	TTS	RevoVoice	local	600+	Gated	Zero-shot multilingual TTS with voice cloning

Model Directory

revospeech/models/
├── asr/
│   └── zipformer_v2.yaml    # Open — downloads from GitHub releases
└── tts/
    └── revovoice.yaml       # Gated — requires HF login + approval

Gated Model Access

Some models (like revovoice) are hosted on private HuggingFace repositories and require approval before use.

Log in to HuggingFace:
```
pip install huggingface-hub
huggingface-cli login
```
Get your token at https://huggingface.co/settings/tokens

Request access: Visit the model's HuggingFace page and submit an access request. The repo owner will review and approve it.

Use the model: Once approved, the model will download automatically on first use:

from revospeech.tts import TTS
tts = TTS('revovoice')  # Will prompt for HF login if not authenticated

For team members adding models: If your model is gated, set hf_private: true in the YAML manifest. This tells RevoS to check HF authentication before downloading.
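
Before loading a gated model, a script can check whether a HuggingFace token is available locally. This sketch assumes the usual huggingface_hub defaults: the `HF_TOKEN` environment variable, then a `token` file under `HF_HOME` (default `~/.cache/huggingface`). The helper `find_hf_token` is hypothetical and only detects a stored token; it cannot tell whether access to the repository has been approved.

```python
import os
from pathlib import Path


def find_hf_token(env=None):
    """Return a locally stored HuggingFace token, or None if none is found.

    Assumption: looks at HF_TOKEN first, then at the token file under
    HF_HOME (default ~/.cache/huggingface/token).
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token
    hf_home = Path(env.get("HF_HOME") or Path.home() / ".cache" / "huggingface")
    token_file = hf_home / "token"
    if token_file.is_file():
        return token_file.read_text().strip() or None
    return None


if find_hf_token() is None:
    print("No HuggingFace token found; run `huggingface-cli login` first.")
else:
    print("HuggingFace token found.")
```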

Configuration

API Keys

For cloud API backends, set your API key:

# Option 1: Environment variable
export REVOLAB_API_KEY=rv-your-key-here

# Option 2: CLI command (saves to ~/.config/revospeech/config.yaml)
revospeech config set-api-key

Resolution order: constructor arg > REVOLAB_API_KEY env var > ~/.config/revospeech/config.yaml
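
The resolution order above can be illustrated with a small function. This is a sketch of the precedence rule only, not the library's implementation: `resolve_api_key` is a hypothetical name, and the config-file value is passed in already loaded, because the key name used inside config.yaml is not stated here.

```python
import os


def resolve_api_key(explicit=None, env=None, config_value=None):
    """Pick the API key by precedence: explicit arg, then env var, then config."""
    env = os.environ if env is None else env
    candidates = (explicit, env.get("REVOLAB_API_KEY"), config_value)
    for candidate in candidates:
        if candidate:  # skip None and empty strings
            return candidate
    return None


# The environment variable wins over the config file when no argument is given.
print(resolve_api_key(None, {"REVOLAB_API_KEY": "rv-from-env"}, "rv-from-config"))
# rv-from-env
```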

Catalog Source

Override the catalog source with:

export REVOS_CATALOG_REPO="myorg/revospeech"    # env var

Or in ~/.config/revospeech/config.yaml:

catalog_repo: "myorg/revospeech"

Adding Custom Models

Add a YAML manifest to ~/.config/revospeech/models/:

# ~/.config/revospeech/models/asr/my-model.yaml
name: my-custom-model
task: asr
mode: local
backend: sherpa-onnx
model_type: transducer
model_url: "https://example.com/models/my-model.tar.bz2"
sample_rate: 16000
language: en
description: "My custom ASR model"
capabilities: ["word-timestamps"]
languages: ["en"]
files:
  encoder: "encoder.onnx"
  decoder: "decoder.onnx"
  joiner: "joiner.onnx"
  tokens: "tokens.txt"

Then use it:

from revospeech.asr import ASR
asr = ASR('my-custom-model')

API Models

# ~/.config/revospeech/models/asr/my-api-model.yaml
name: my-api-model
task: asr
mode: api
backend: my-api
api_endpoint: "https://api.example.com/v1"
description: "Cloud ASR"
capabilities: ["streaming"]
languages: ["en"]
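
A quick check can catch a manifest that is missing a field before it is dropped into the models directory. The sketch below works on an already-parsed manifest (a plain dict). The set of expected keys is an assumption inferred from the two example manifests above, not a documented schema, and `missing_manifest_keys` is a hypothetical helper.

```python
# Assumed from the example manifests; the real schema may differ.
COMMON_KEYS = {"name", "task", "mode", "backend", "description", "languages"}
MODE_KEYS = {
    "local": {"model_url", "files"},
    "api": {"api_endpoint"},
}


def missing_manifest_keys(manifest: dict) -> list[str]:
    """Return the sorted keys this sketch expects but the manifest lacks."""
    expected = COMMON_KEYS | MODE_KEYS.get(manifest.get("mode"), set())
    return sorted(expected - set(manifest))


api_manifest = {
    "name": "my-api-model",
    "task": "asr",
    "mode": "api",
    "backend": "my-api",
    "description": "Cloud ASR",
    "capabilities": ["streaming"],
    "languages": ["en"],
}
print(missing_manifest_keys(api_manifest))  # ['api_endpoint']
```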

Pinning Model Versions

For HuggingFace-hosted models, pin to a specific commit hash or tag using the revision field:

revision: "a1b2c3d"       # Pin to specific commit hash
# revision: "v1.0.0"      # Or use a git tag

Without revision, the latest version from the default branch is used.

Remote Catalog

The catalog fetches available models directly from this repository on GitHub. Team members add YAML manifests to revospeech/models/ and users discover them without upgrading.

# Browse all available models from the repo
revospeech catalog list

# Install a model locally
revospeech catalog pull revovoice

Documentation

AGENTS.md — Architecture guide for AI agents and contributors
CONTRIBUTING.md — How to contribute
TODO.md — Full backlog and roadmap

Project Structure

revospeech/
├── revospeech/
│   ├── asr/           # ASR engines (sherpa-onnx)
│   ├── tts/           # TTS engines (RevoVoice)
│   ├── registry/      # Model manifests, registry, downloader, status
│   ├── cli/           # Click CLI
│   ├── config.py      # API key & configuration management
│   ├── exceptions.py  # Custom exception hierarchy
│   ├── catalog.py     # Remote model catalog (GitHub, cached)
│   ├── device.py      # GPU/CPU auto-detection
│   └── models/        # Bundled YAML manifests
├── tests/
├── pyproject.toml
├── AGENTS.md
└── CONTRIBUTING.md
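
For readers curious what GPU/CPU auto-detection can look like with an ONNX Runtime backend, here is a minimal sketch. It is not the code in device.py: `pick_device` is a hypothetical name, and the only library call used is `onnxruntime.get_available_providers()`, with a CPU fallback when onnxruntime is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when onnxruntime reports a CUDA provider, else "cpu"."""
    try:
        import onnxruntime as ort
    except ImportError:
        return "cpu"  # no onnxruntime available, assume CPU
    providers = ort.get_available_providers()
    return "cuda" if "CUDAExecutionProvider" in providers else "cpu"


print("Selected device:", pick_device())
```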

License

MIT

Release history

This version

0.1.1

Jun 12, 2026

0.1.0

Jun 12, 2026

Download files

Source Distribution

revospeech-0.1.1.tar.gz (215.5 kB)

Uploaded Jun 12, 2026 Source

Built Distribution

revospeech-0.1.1-py3-none-any.whl (35.1 kB)

Uploaded Jun 12, 2026 Python 3

File details

Details for the file revospeech-0.1.1.tar.gz.

File metadata

Download URL: revospeech-0.1.1.tar.gz
Upload date: Jun 12, 2026
Size: 215.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for revospeech-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`d062a5cf9645c205669fb8d4fa9b881fca78e15d5491a6b90ae511f7f122c77c`
MD5	`8357125493b3a32cd607727558ac0fc7`
BLAKE2b-256	`24376aebeeb67446c812721ae4a0f2d48c6b6c67ff496a6a6108402f69c9f738`


File details

Details for the file revospeech-0.1.1-py3-none-any.whl.

File metadata

Download URL: revospeech-0.1.1-py3-none-any.whl
Upload date: Jun 12, 2026
Size: 35.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for revospeech-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0584e0fb6b9b6d10a058cbe06ed01ecc057dd90e8d233f298315c33cfea16594`
MD5	`53488a8a9209c3389447eddc192c8a21`
BLAKE2b-256	`f2e34ba7da25e110ab756de5821d011375064b97e1983d280f12117d3ddf4187`
