Speaker diarization — who spoke when. Rust + ONNX, no Python runtime overhead.

polyvoice

Speaker diarization for Rust — who spoke when, without Python. Silero VAD + WeSpeaker embeddings + AHC clustering in a single call.

Quick Start

Add the dependency to Cargo.toml:

[dependencies]
polyvoice = { version = "0.6", features = ["onnx"] }

Or add it from the command line:

cargo add polyvoice --features onnx

Features

  • One-call pipeline: Pipeline::run() wires VAD → embeddings → AHC clustering.
  • Online & offline: OnlineDiarizer for streaming, OfflineDiarizer for batch.
  • CPU-only, ~30 MB: ONNX Runtime, no GPU or Python runtime required.
  • Multi-language: Rust library, Python bindings (pip install polyvoice), C FFI, CLI.
  • Lock-free concurrency: crossbeam-queue session pool for parallel inference.
  • Hardened: Miri (memory), Loom (concurrency), cargo-fuzz (4 targets), model signing (Minisign).
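
To make the last pipeline stage concrete, here is a minimal sketch of agglomerative hierarchical clustering (AHC) over speaker embeddings. It is an illustration only, not polyvoice's implementation: the cosine distance, the average linkage, the 0.5 threshold, and the function names are all assumptions chosen for the example.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ahc_cluster(embeddings, threshold=0.5):
    """Average-linkage AHC (illustrative, O(n^3); fine for small inputs).

    Starts with one cluster per embedding and repeatedly merges the two
    closest clusters until the closest pair is farther apart than
    `threshold`. Returns one integer label per embedding.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(
            cosine_distance(embeddings[i], embeddings[j]) for i in c1 for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break  # remaining clusters are treated as distinct speakers
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy 2-D "embeddings" (made up): two segments point one way, two another.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ahc_cluster(toy, threshold=0.5)
print(labels)  # → [0, 0, 1, 1]
```

The number of speakers is not fixed in advance here; it falls out of the distance threshold, which is one common reason to pick AHC for diarization.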

Minimal Example

use polyvoice::{Pipeline, DiarizationConfig, VadConfig, FbankOnnxExtractor, SileroVad};
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ext = FbankOnnxExtractor::new(Path::new("models/wespeaker_resnet34.onnx"), 256, 4)?;
    let mut vad = SileroVad::new(Path::new("models/silero_vad.onnx"), 512)?;
    let (samples, _sr) = polyvoice::wav::read_wav(Path::new("meeting.wav"))?;
    let result = Pipeline::new(DiarizationConfig::default(), VadConfig::default())
        .run(&samples, &ext, &mut vad)?;
    for turn in &result.turns {
        println!("{}: {:.2}s - {:.2}s", turn.speaker, turn.time.start, turn.time.end);
    }
    Ok(())
}

Python / C FFI

import polyvoice
pipeline = polyvoice.Pipeline.balanced("models/")
result = pipeline.run(samples, sample_rate=16000)
for turn in result["turns"]:
    print(f"{turn['speaker']}: {turn['start']:.1f}s - {turn['end']:.1f}s")

// cargo build --features ffi
// See include/polyvoice.h and examples/ffi_usage.c
polyvoice_pipeline_create(BALANCED, "models/", &handle);
polyvoice_pipeline_run(handle, samples, n, 16000, &json, &len);
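
A typical next step is to aggregate the turns. The sketch below sums talk time per speaker, assuming only the result shape used in the Python snippet above (a "turns" list whose items have "speaker", "start", and "end" keys). The helper name and the sample data are hypothetical, and the speaker labels are shown as strings purely for illustration.

```python
from collections import defaultdict


def talk_time_per_speaker(turns):
    """Sum the duration in seconds of every turn, grouped by speaker."""
    totals = defaultdict(float)
    for turn in turns:
        totals[turn["speaker"]] += turn["end"] - turn["start"]
    return dict(totals)


# Made-up result mirroring the keys used in the Python example above.
example_result = {
    "turns": [
        {"speaker": "A", "start": 0.0, "end": 4.5},
        {"speaker": "B", "start": 4.5, "end": 6.0},
        {"speaker": "A", "start": 6.0, "end": 10.0},
    ]
}

totals = talk_time_per_speaker(example_result["turns"])
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f}s")
```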

Benchmarks

Dataset                    DER     Speed
VoxConverse (232 files)    ~14%    10x RT (CPU)
AMI (16 meetings)          ~23%    7x RT (CPU)

~80% of pyannote's accuracy at 10× the speed on CPU — no GPU, no Python.
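
For readers new to these metrics, the sketch below shows the arithmetic behind them. It assumes the standard definition of DER (missed speech plus false alarm plus speaker confusion, divided by total reference speech) and reads "10x RT" as ten times faster than real time. The function names and input durations are illustrative and are not taken from the benchmark runs.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """DER as a fraction; all arguments are durations in seconds."""
    return (missed + false_alarm + confusion) / total_speech


def processing_seconds(audio_seconds, speed_factor):
    """Wall-clock time needed for audio processed at `speed_factor` x real time."""
    return audio_seconds / speed_factor


# One hour of audio at 10x real time takes about six minutes.
print(processing_seconds(3600, 10))  # → 360.0

# Illustrative error durations over 600 s of reference speech.
der = diarization_error_rate(missed=30.0, false_alarm=24.0, confusion=30.0,
                             total_speech=600.0)
print(f"{der:.0%}")  # → 14%
```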

License

MIT

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

File                                                     Size       Python          Platform
polyvoice-0.6.2-cp312-cp312-win_amd64.whl                8.3 MB     CPython 3.12    Windows x86-64
polyvoice-0.6.2-cp312-cp312-manylinux_2_38_x86_64.whl    10.1 MB    CPython 3.12    manylinux: glibc 2.38+ x86-64
polyvoice-0.6.2-cp312-cp312-macosx_11_0_arm64.whl        9.1 MB     CPython 3.12    macOS 11.0+ ARM64

File details

Details for the file polyvoice-0.6.2-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: polyvoice-0.6.2-cp312-cp312-win_amd64.whl
  • Size: 8.3 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for polyvoice-0.6.2-cp312-cp312-win_amd64.whl
Algorithm      Hash digest
SHA256         959ab0bae3661e66a1a1a308ceffd7ea128bd83bf1817df92371bed740305177
MD5            b18ae0f0591bd14ad1d1229a98bd2fa2
BLAKE2b-256    c42a8ae07cebcd02d71b53722b0bf16482a318e5fbe493220e13ef04ee404a72
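
The published digests can be checked against a downloaded wheel. The following is a minimal sketch using only Python's standard hashlib module. The helper names and the command-line usage are assumptions made for illustration and are not part of polyvoice.

```python
import hashlib
import sys


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 with a published digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python verify_wheel.py <path-to-wheel> <expected-sha256>
    ok = matches_published_hash(sys.argv[1], sys.argv[2])
    print("OK" if ok else "MISMATCH")
    sys.exit(0 if ok else 1)
```

For the Windows wheel, the expected value would be the SHA256 entry listed above. pip can enforce the same check at install time through hash-pinned requirements files.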

Provenance

The following attestation bundles were made for polyvoice-0.6.2-cp312-cp312-win_amd64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file polyvoice-0.6.2-cp312-cp312-manylinux_2_38_x86_64.whl.

File hashes

Hashes for polyvoice-0.6.2-cp312-cp312-manylinux_2_38_x86_64.whl
Algorithm      Hash digest
SHA256         0bef470b7de80461909019f2c76073607053263d00411987096f5206b17a62d0
MD5            f78708bf7827532ec6b4fd983b7011ed
BLAKE2b-256    606025b56b68db587103cc6cab9866e982a6f2a8272daa025b5e20faf3c89fb9

Provenance

The following attestation bundles were made for polyvoice-0.6.2-cp312-cp312-manylinux_2_38_x86_64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice

File details

Details for the file polyvoice-0.6.2-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for polyvoice-0.6.2-cp312-cp312-macosx_11_0_arm64.whl
Algorithm      Hash digest
SHA256         763775d71823505f2b693861206bfed6a848173912a0aae025c851e4d9fd3cef
MD5            b2e8c726190ca73c708ee9534cdf7971
BLAKE2b-256    6e47258c2159664f932ee0f918f63152cbdbe3fe1e29e68b7d6f73a3603ef84c

Provenance

The following attestation bundles were made for polyvoice-0.6.2-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice
