Speaker diarization — who spoke when. Rust + ONNX, no Python runtime overhead.

polyvoice

Speaker diarization for Rust — who spoke when, without Python. Silero VAD + WeSpeaker embeddings + AHC clustering in a single call.

Quick Start

Add the crate to Cargo.toml:

[dependencies]
polyvoice = { version = "0.6", features = ["onnx"] }

Or from the command line:

cargo add polyvoice --features onnx

Features

  • One-call pipeline: Pipeline::run() wires VAD → embeddings → AHC clustering.
  • Online & offline: OnlineDiarizer for streaming, OfflineDiarizer for batch.
  • CPU-only, ~30 MB: ONNX Runtime, no GPU or Python runtime required.
  • Multi-language: Rust library, Python bindings (pip install polyvoice), C FFI, CLI.
  • Lock-free concurrency: crossbeam-queue session pool for parallel inference.
  • Hardened: Miri (memory), Loom (concurrency), cargo-fuzz (4 targets), model signing (Minisign).
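
For readers unfamiliar with the last stage of the pipeline, the sketch below shows one common form of AHC (agglomerative hierarchical clustering) over speaker embeddings. It is a standard-library-only illustration, not polyvoice's implementation: the function names, the cosine distance, the average linkage, and the 0.5 stopping threshold are all assumptions chosen for the example.

```rust
/// Cosine distance between two embeddings: 1 - cos(a, b).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 1.0; // treat a zero vector as maximally dissimilar
    }
    1.0 - dot / (na * nb)
}

/// Average-linkage distance between two clusters given as index lists.
fn average_linkage(embs: &[Vec<f32>], a: &[usize], b: &[usize]) -> f32 {
    let mut total = 0.0;
    for &i in a {
        for &j in b {
            total += cosine_distance(&embs[i], &embs[j]);
        }
    }
    total / (a.len() * b.len()) as f32
}

/// Naive AHC: start with one cluster per embedding and repeatedly merge the
/// closest pair until no pair is closer than `threshold` (hypothetical value).
/// Returns one speaker label per input embedding.
fn ahc_cluster(embs: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    let mut clusters: Vec<Vec<usize>> = (0..embs.len()).map(|i| vec![i]).collect();
    loop {
        // Find the closest pair of clusters.
        let mut best: Option<(usize, usize, f32)> = None;
        for i in 0..clusters.len() {
            for j in (i + 1)..clusters.len() {
                let d = average_linkage(embs, &clusters[i], &clusters[j]);
                if best.map_or(true, |(_, _, bd)| d < bd) {
                    best = Some((i, j, d));
                }
            }
        }
        match best {
            Some((i, j, d)) if d < threshold => {
                let merged = clusters.remove(j); // j > i, so index i stays valid
                clusters[i].extend(merged);
            }
            _ => break, // nothing left to merge below the threshold
        }
    }
    let mut labels = vec![0; embs.len()];
    for (label, members) in clusters.iter().enumerate() {
        for &m in members {
            labels[m] = label;
        }
    }
    labels
}

fn main() {
    // Four toy 2-D "embeddings": two near the x axis, two near the y axis.
    let embs = vec![
        vec![1.0, 0.0],
        vec![0.9, 0.1],
        vec![0.0, 1.0],
        vec![0.1, 0.9],
    ];
    println!("{:?}", ahc_cluster(&embs, 0.5));
}
```

The threshold decides how many speakers come out: a lower value keeps more clusters apart, a higher value merges more aggressively. This naive version recomputes every pairwise distance on each pass, which is fine for a handful of segments but not for long recordings.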

Minimal Example

use polyvoice::{Pipeline, DiarizationConfig, VadConfig, FbankOnnxExtractor, SileroVad};
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the WeSpeaker embedding extractor and the Silero VAD from ONNX files.
    let ext = FbankOnnxExtractor::new(Path::new("models/wespeaker_resnet34.onnx"), 256, 4)?;
    let mut vad = SileroVad::new(Path::new("models/silero_vad.onnx"), 512)?;
    // Read the audio samples; the returned sample rate is unused here.
    let (samples, _sr) = polyvoice::wav::read_wav(Path::new("meeting.wav"))?;
    // Run VAD, embedding extraction, and clustering in one call.
    let result = Pipeline::new(DiarizationConfig::default(), VadConfig::default())
        .run(&samples, &ext, &mut vad)?;
    // Print each speaker turn with its start and end time.
    for turn in &result.turns {
        println!("{}: {:.2}s - {:.2}s", turn.speaker, turn.time.start, turn.time.end);
    }
    Ok(())
}

Python / C FFI

Python:

import polyvoice
pipeline = polyvoice.Pipeline.balanced("models/")
result = pipeline.run(samples, sample_rate=16000)
for turn in result["turns"]:
    print(f"{turn['speaker']}: {turn['start']:.1f}s - {turn['end']:.1f}s")

C:

// cargo build --features ffi
// See include/polyvoice.h and examples/ffi_usage.c
polyvoice_pipeline_create(BALANCED, "models/", &handle);
polyvoice_pipeline_run(handle, samples, n, 16000, &json, &len);

Benchmarks

| Dataset                 | DER  | Speed        |
|-------------------------|------|--------------|
| VoxConverse (232 files) | ~14% | 10x RT (CPU) |
| AMI (16 meetings)       | ~23% | 7x RT (CPU)  |

~80% of pyannote's accuracy at 10× the speed on CPU — no GPU, no Python.
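
To make the two benchmark columns concrete, here is a small worked example. It assumes the usual definition of diarization error rate (missed speech plus false alarm plus speaker confusion, divided by total reference speech) and reads "10x RT" as ten times faster than real time. The error breakdown in the example is invented to land on 14% and is not measured data; the scoring settings polyvoice used (collar, overlap handling) are not stated in this document.

```rust
/// Diarization error rate under the usual definition:
/// (missed + false alarm + confusion) / total reference speech.
fn der(missed_s: f64, false_alarm_s: f64, confusion_s: f64, total_speech_s: f64) -> f64 {
    (missed_s + false_alarm_s + confusion_s) / total_speech_s
}

/// Wall-clock processing time implied by a speed given as a multiple of real time.
fn processing_seconds(audio_s: f64, times_real_time: f64) -> f64 {
    audio_s / times_real_time
}

fn main() {
    // Hypothetical breakdown for one hour (3600 s) of reference speech.
    let rate = der(180.0, 144.0, 180.0, 3600.0);
    println!("DER = {:.1}%", rate * 100.0);

    // One hour of audio at 10x real time.
    println!("processing time = {:.0} s", processing_seconds(3600.0, 10.0));
}
```

Under these assumptions, an hour-long recording at 10x real time takes about six minutes to process, and at 7x real time a little under nine.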

License

MIT

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • polyvoice-0.6.0-cp312-cp312-win_amd64.whl (8.3 MB): CPython 3.12, Windows x86-64
  • polyvoice-0.6.0-cp312-cp312-manylinux_2_38_x86_64.whl (10.1 MB): CPython 3.12, manylinux (glibc 2.38+), x86-64
  • polyvoice-0.6.0-cp312-cp312-macosx_11_0_arm64.whl (9.1 MB): CPython 3.12, macOS 11.0+, ARM64

File details

Details for the file polyvoice-0.6.0-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: polyvoice-0.6.0-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 8.3 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for polyvoice-0.6.0-cp312-cp312-win_amd64.whl

| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | 5e96e6f3384cc8b69a462f1d7c6917e2e87f4682ddaa2e4e23090d97c5d3463e |
| MD5         | 27d4d23d9579b2cee9a7af300b8d595c                                 |
| BLAKE2b-256 | 6ec0a89a0b2da96be89008d8e3499684599e56d54daa3ed80e85abc3cb34d9d6 |

Provenance

The following attestation bundles were made for polyvoice-0.6.0-cp312-cp312-win_amd64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file polyvoice-0.6.0-cp312-cp312-manylinux_2_38_x86_64.whl.

File hashes

Hashes for polyvoice-0.6.0-cp312-cp312-manylinux_2_38_x86_64.whl

| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | 5b3823de070de097c7008b02c9880a014d2c3b4adf6b42139413175d347e0bc1 |
| MD5         | f1b27b70385ec8719240d75c082ade3c                                 |
| BLAKE2b-256 | 4fee44ad1d6c28c88cdc6cc54752ae9142aebe9bb17d234333ad800293ceea47 |

Provenance

The following attestation bundles were made for polyvoice-0.6.0-cp312-cp312-manylinux_2_38_x86_64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice

File details

Details for the file polyvoice-0.6.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for polyvoice-0.6.0-cp312-cp312-macosx_11_0_arm64.whl

| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | f5f462f13956a2795753334725ca178b1d9944d1c067159baa41b55cecd1692d |
| MD5         | 415ddddffcb700981ae23e13ddfa3d4c                                 |
| BLAKE2b-256 | 2020ad5e6a7a0ed90632c0380c18caa0591ee978f26ee271684e504f62ddce90 |

Provenance

The following attestation bundles were made for polyvoice-0.6.0-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice
