Speaker diarization — who spoke when. Rust + ONNX, no Python runtime overhead.

polyvoice

Speaker diarization for Rust — who spoke when, without Python. Silero VAD + WeSpeaker embeddings + AHC clustering in a single call.

Quick Start

Add polyvoice to Cargo.toml:

[dependencies]
polyvoice = { version = "0.6", features = ["onnx"] }

Or add it from the command line:

cargo add polyvoice --features onnx

Features

  • One-call pipeline: Pipeline::run() wires VAD → embeddings → AHC clustering.
  • Online & offline: OnlineDiarizer for streaming, OfflineDiarizer for batch.
  • CPU-only, ~30 MB: ONNX Runtime, no GPU or Python runtime required.
  • Multi-language: Rust library, Python bindings (pip install polyvoice), C FFI, CLI.
  • Lock-free concurrency: crossbeam-queue session pool for parallel inference.
  • Hardened: Miri (memory), Loom (concurrency), cargo-fuzz (4 targets), model signing (Minisign).
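
To make the last stage of the VAD → embeddings → AHC chain more concrete, here is a minimal sketch of agglomerative hierarchical clustering over speaker embeddings. It is not polyvoice's implementation: the function names (`cosine_distance`, `average_linkage`, `ahc`), the choice of average linkage with cosine distance, and the threshold stopping rule are all assumptions made for illustration.

```rust
/// Cosine distance between two embeddings: 1 - cos(a, b).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 1.0; // assumption: treat zero vectors as unrelated
    }
    1.0 - dot / (na * nb)
}

/// Average-linkage distance between two clusters, given as lists of
/// indices into `embs`.
fn average_linkage(embs: &[Vec<f32>], a: &[usize], b: &[usize]) -> f32 {
    let mut total = 0.0;
    for &i in a {
        for &j in b {
            total += cosine_distance(&embs[i], &embs[j]);
        }
    }
    total / (a.len() * b.len()) as f32
}

/// Naive AHC: start with one cluster per embedding and repeatedly merge
/// the closest pair until the smallest distance exceeds `threshold`.
/// Returns one cluster label per input embedding.
fn ahc(embs: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    let mut clusters: Vec<Vec<usize>> = (0..embs.len()).map(|i| vec![i]).collect();
    loop {
        // Find the closest pair of clusters.
        let mut best: Option<(usize, usize, f32)> = None;
        for i in 0..clusters.len() {
            for j in (i + 1)..clusters.len() {
                let d = average_linkage(embs, &clusters[i], &clusters[j]);
                if best.map_or(true, |(_, _, bd)| d < bd) {
                    best = Some((i, j, d));
                }
            }
        }
        match best {
            Some((i, j, d)) if d <= threshold => {
                let merged = clusters.remove(j); // j > i, so index i stays valid
                clusters[i].extend(merged);
            }
            _ => break, // fewer than two clusters left, or nothing close enough
        }
    }
    let mut labels = vec![0; embs.len()];
    for (label, members) in clusters.iter().enumerate() {
        for &m in members {
            labels[m] = label;
        }
    }
    labels
}
```

Calling `ahc(&embeddings, 0.5)` on one embedding per speech segment returns a speaker label per segment. The threshold trades over-merging (too few speakers) against over-splitting (too many), so it would normally be tuned on held-out data.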

Minimal Example

use polyvoice::{Pipeline, DiarizationConfig, VadConfig, FbankOnnxExtractor, SileroVad};
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ext = FbankOnnxExtractor::new(Path::new("models/wespeaker_resnet34.onnx"), 256, 4)?;
    let mut vad = SileroVad::new(Path::new("models/silero_vad.onnx"), 512)?;
    let (samples, _sr) = polyvoice::wav::read_wav(Path::new("meeting.wav"))?;
    let result = Pipeline::new(DiarizationConfig::default(), VadConfig::default())
        .run(&samples, &ext, &mut vad)?;
    for turn in &result.turns {
        println!("{}: {:.2}s - {:.2}s", turn.speaker, turn.time.start, turn.time.end);
    }
    Ok(())
}
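
A common next step is to aggregate the turns, for example to get total talk time per speaker. The sketch below uses stand-in types (`Turn`, `TimeRange`) that only mirror the field names printed in the example above. The real polyvoice types, and in particular the type of `speaker` and of the timestamps, may differ, so treat these as assumptions.

```rust
use std::collections::BTreeMap;

// Stand-in types (assumptions): they mirror the `speaker`, `time.start`
// and `time.end` fields used in the example, not polyvoice's definitions.
struct TimeRange {
    start: f64,
    end: f64,
}

struct Turn {
    speaker: u32,
    time: TimeRange,
}

/// Sum the duration of all turns per speaker, in seconds.
fn talk_time(turns: &[Turn]) -> BTreeMap<u32, f64> {
    let mut totals = BTreeMap::new();
    for t in turns {
        *totals.entry(t.speaker).or_insert(0.0) += t.time.end - t.time.start;
    }
    totals
}
```

A `BTreeMap` keeps speakers in sorted order, which makes the summary stable across runs.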

Python / C FFI

Python:

import polyvoice
pipeline = polyvoice.Pipeline.balanced("models/")
result = pipeline.run(samples, sample_rate=16000)
for turn in result["turns"]:
    print(f"{turn['speaker']}: {turn['start']:.1f}s - {turn['end']:.1f}s")

C:

// cargo build --features ffi
// See include/polyvoice.h and examples/ffi_usage.c
polyvoice_pipeline_create(BALANCED, "models/", &handle);
polyvoice_pipeline_run(handle, samples, n, 16000, &json, &len);

Benchmarks

Dataset                    DER     Speed
VoxConverse (232 files)    ~14%    10x RT (CPU)
AMI (16 meetings)          ~23%    7x RT (CPU)

DER = diarization error rate; RT = real time.

~80% of pyannote's accuracy at 10× the speed on CPU — no GPU, no Python.
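
For readers unfamiliar with the two metrics, here is a small sketch of how they are conventionally computed. It uses the standard DER definition (missed speech, false alarm, and speaker confusion over total reference speech) and reads "10x RT" as audio duration divided by processing time. The source does not say how its numbers were scored (for example, collar or overlap handling), and the function names are hypothetical.

```rust
/// Standard DER: (missed + false alarm + confusion) / reference speech,
/// with all arguments in seconds.
fn der(missed: f64, false_alarm: f64, confusion: f64, reference_speech: f64) -> f64 {
    (missed + false_alarm + confusion) / reference_speech
}

/// Speed as a multiple of real time: audio duration / processing time.
fn speed_x_realtime(audio_secs: f64, processing_secs: f64) -> f64 {
    audio_secs / processing_secs
}
```

With made-up numbers: a recording with 600 s of reference speech, 30 s missed, 24 s false alarm and 30 s confused gives `der(30.0, 24.0, 30.0, 600.0)` = 0.14, or 14%. If 600 s of audio is processed in 60 s, `speed_x_realtime(600.0, 60.0)` is 10, or 10x RT.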

License

MIT
