Speaker diarization — who spoke when. Rust + ONNX, no Python runtime overhead.

polyvoice

Speaker diarization for Rust — who spoke when, without Python. Silero VAD + WeSpeaker embeddings + AHC clustering in a single call.

Quick Start

Add the dependency to Cargo.toml:

[dependencies]
polyvoice = { version = "0.6", features = ["onnx"] }

Or add it from the command line:

cargo add polyvoice --features onnx

Features

  • One-call pipeline: Pipeline::run() wires VAD → embeddings → AHC clustering.
  • Online & offline: OnlineDiarizer for streaming, OfflineDiarizer for batch.
  • CPU-only, ~30 MB: ONNX Runtime, no GPU or Python runtime required.
  • Multi-language: Rust library, Python bindings (pip install polyvoice), C FFI, CLI.
  • Lock-free concurrency: crossbeam-queue session pool for parallel inference.
  • Hardened: Miri (memory), Loom (concurrency), cargo-fuzz (4 targets), model signing (Minisign).
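To illustrate the clustering stage named in the feature list, here is a minimal sketch of agglomerative hierarchical clustering (AHC) over speaker embeddings. It is not polyvoice's implementation: the function names, the choice of average linkage with cosine distance, and the stopping threshold are all assumptions made for the example.

```rust
/// Cosine distance between two embedding vectors: 1 - cos(a, b).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 1.0; // treat a zero vector as fully dissimilar
    }
    1.0 - dot / (na * nb)
}

/// Average-linkage distance between two clusters of segment indices.
fn average_linkage(embs: &[Vec<f32>], a: &[usize], b: &[usize]) -> f32 {
    let mut total = 0.0;
    for &i in a {
        for &j in b {
            total += cosine_distance(&embs[i], &embs[j]);
        }
    }
    total / (a.len() * b.len()) as f32
}

/// Start with one cluster per segment and repeatedly merge the closest pair
/// until no pair is closer than `threshold` (an assumed tuning parameter).
/// Returns one speaker label per input segment.
fn ahc_labels(embs: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    let mut clusters: Vec<Vec<usize>> = (0..embs.len()).map(|i| vec![i]).collect();
    loop {
        // Find the closest pair of clusters.
        let mut best: Option<(usize, usize, f32)> = None;
        for i in 0..clusters.len() {
            for j in (i + 1)..clusters.len() {
                let d = average_linkage(embs, &clusters[i], &clusters[j]);
                if best.map_or(true, |(_, _, bd)| d < bd) {
                    best = Some((i, j, d));
                }
            }
        }
        match best {
            Some((i, j, d)) if d < threshold => {
                let merged = clusters.remove(j); // j > i, so index i stays valid
                clusters[i].extend(merged);
            }
            _ => break, // nothing left to merge
        }
    }
    // Convert the cluster membership lists into per-segment labels.
    let mut labels = vec![0; embs.len()];
    for (label, members) in clusters.iter().enumerate() {
        for &idx in members {
            labels[idx] = label;
        }
    }
    labels
}
```

Calling ahc_labels with four two-dimensional embeddings, two near [1, 0] and two near [0, 1], and a threshold of 0.5 groups the first two segments under one label and the last two under another. This naive search rescans every cluster pair on each merge, which is fine for a sketch but slow on long recordings.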

Minimal Example

use polyvoice::{Pipeline, DiarizationConfig, VadConfig, FbankOnnxExtractor, SileroVad};
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the WeSpeaker embedding model and the Silero VAD model from disk.
    let ext = FbankOnnxExtractor::new(Path::new("models/wespeaker_resnet34.onnx"), 256, 4)?;
    let mut vad = SileroVad::new(Path::new("models/silero_vad.onnx"), 512)?;
    // Read the audio samples; the sample rate is returned but unused here.
    let (samples, _sr) = polyvoice::wav::read_wav(Path::new("meeting.wav"))?;
    // Run VAD, embedding extraction, and clustering in one call.
    let result = Pipeline::new(DiarizationConfig::default(), VadConfig::default())
        .run(&samples, &ext, &mut vad)?;
    // Print each speaker turn with its start and end time in seconds.
    for turn in &result.turns {
        println!("{}: {:.2}s - {:.2}s", turn.speaker, turn.time.start, turn.time.end);
    }
    Ok(())
}
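A common next step after printing the turns is to total the speaking time per speaker. The sketch below shows one way to do that. The Turn struct here is a hypothetical stand-in with flat start and end fields, not the polyvoice type, so adapt the field access to whatever result.turns actually contains.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a diarization turn; the real polyvoice
/// types may be shaped differently.
struct Turn {
    speaker: String,
    start: f64, // seconds
    end: f64,   // seconds
}

/// Sum the speaking time of each speaker, in seconds.
/// A BTreeMap keeps the speakers in sorted order for stable output.
fn talk_time(turns: &[Turn]) -> BTreeMap<String, f64> {
    let mut totals = BTreeMap::new();
    for t in turns {
        // Clamp at zero so a malformed turn (end before start) adds nothing.
        *totals.entry(t.speaker.clone()).or_insert(0.0) += (t.end - t.start).max(0.0);
    }
    totals
}
```

Overlapping turns from the same speaker would be double counted by this sketch; whether such overlaps can occur depends on the pipeline's output.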

Python / C FFI

Python:

import polyvoice
pipeline = polyvoice.Pipeline.balanced("models/")
result = pipeline.run(samples, sample_rate=16000)
for turn in result["turns"]:
    print(f"{turn['speaker']}: {turn['start']:.1f}s - {turn['end']:.1f}s")

C:

// cargo build --features ffi
// See include/polyvoice.h and examples/ffi_usage.c
polyvoice_pipeline_create(BALANCED, "models/", &handle);
polyvoice_pipeline_run(handle, samples, n, 16000, &json, &len);

Benchmarks

Dataset                   DER    Speed
VoxConverse (232 files)   ~14%   10x RT (CPU)
AMI (16 meetings)         ~23%   7x RT (CPU)

~80% of pyannote's accuracy at 10× the speed on CPU — no GPU, no Python.
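For readers unfamiliar with the two columns, the sketch below spells out the usual definitions. DER (diarization error rate) is the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The speed column is read here as audio duration divided by processing time, which is an assumption about how "10x RT" was measured. The function names are illustrative, not part of polyvoice.

```rust
/// Diarization error rate:
/// (missed + false alarm + confusion) / total reference speech.
/// All arguments are durations in seconds.
fn der(missed: f64, false_alarm: f64, confusion: f64, reference_speech: f64) -> f64 {
    (missed + false_alarm + confusion) / reference_speech
}

/// Speed as a multiple of real time: audio duration / processing time.
/// A value of 10.0 means one minute of audio is processed in six seconds.
fn speed_factor(audio_secs: f64, processing_secs: f64) -> f64 {
    audio_secs / processing_secs
}
```

With made-up numbers, 600 s of reference speech containing 30 s missed, 20 s false alarm, and 34 s confusion gives a DER of 84 / 600 = 14%, and processing 600 s of audio in 60 s gives a speed factor of 10.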

License

MIT

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • polyvoice-0.6.3-cp312-cp312-win_amd64.whl (8.3 MB): CPython 3.12, Windows x86-64
  • polyvoice-0.6.3-cp312-cp312-manylinux_2_38_x86_64.whl (10.1 MB): CPython 3.12, manylinux (glibc 2.38+), x86-64
  • polyvoice-0.6.3-cp312-cp312-macosx_11_0_arm64.whl (9.1 MB): CPython 3.12, macOS 11.0+, ARM64

File details

Details for the file polyvoice-0.6.3-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: polyvoice-0.6.3-cp312-cp312-win_amd64.whl
  • Size: 8.3 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for polyvoice-0.6.3-cp312-cp312-win_amd64.whl
Algorithm     Hash digest
SHA256        672df13bfff1bca080f160c02c3cc5126c2901c32d01aca74a0abfdebb451789
MD5           26b624152cb2594ff76fbe50491ca365
BLAKE2b-256   aa9a7c83f2207111da9cfb850883c614bf79101ca65e7e3963c4793b788eb179

Provenance

The following attestation bundles were made for polyvoice-0.6.3-cp312-cp312-win_amd64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file polyvoice-0.6.3-cp312-cp312-manylinux_2_38_x86_64.whl.

File hashes

Hashes for polyvoice-0.6.3-cp312-cp312-manylinux_2_38_x86_64.whl
Algorithm     Hash digest
SHA256        3d26498d1de51f1846606f025472b9ff6b89ee8c8843055be76445c0aa1fc94a
MD5           541f1a51e8bffe2b5202cde9328fea4b
BLAKE2b-256   c19145632ed7b1a3f954466cdaac8172e29b4a91073d4a500aa3e15c0e592aa5

Provenance

The following attestation bundles were made for polyvoice-0.6.3-cp312-cp312-manylinux_2_38_x86_64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice

File details

Details for the file polyvoice-0.6.3-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for polyvoice-0.6.3-cp312-cp312-macosx_11_0_arm64.whl
Algorithm     Hash digest
SHA256        f37d24858509a426d106d2759d8c259ea269c4a8ebc9caff6d2c16fe9c3d6dc5
MD5           5639dbb0948fd6af97a99d89f91c916f
BLAKE2b-256   93041fff20b09ce13b019bbaa1ab41d6d5933f5eabd6bd09291816bd323a86b5

Provenance

The following attestation bundles were made for polyvoice-0.6.3-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on ekhodzitsky/polyvoice
