Very fast speaker diarization

Senko

閃光 (senkō) - a flash of light

A very fast and accurate speaker diarization pipeline.

1 hour of audio processed in 5 seconds (RTX 4090 + Ryzen 9 7950X).

On Apple M3, 1 hour in 7.7 seconds.
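To put the timings above in perspective, they can be expressed as a real-time factor (seconds of audio processed per second of wall-clock time). This is plain arithmetic on the numbers quoted above; the function name is illustrative and not part of Senko.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


ONE_HOUR = 3600.0
print(round(realtime_factor(ONE_HOUR, 5.0)))  # RTX 4090 + Ryzen 9 7950X -> 720
print(round(realtime_factor(ONE_HOUR, 7.7)))  # Apple M3 -> 468
```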

The pipeline achieves a best score of 13.5% DER on VoxConverse, 13.3% on AISHELL-4, and 26.5% on AMI-IHM. See the evaluation directory for more benchmarks and comparison with other diarization systems.
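For reference, DER (diarization error rate) is conventionally the sum of missed speech, false alarm speech, and speaker confusion time, divided by the total reference speech time. A minimal sketch of that arithmetic follows. The durations are made-up illustrative numbers, not Senko results, and scoring details such as collars and overlap handling are whatever the evaluation directory specifies.

```python
def diarization_error_rate(missed: float, false_alarm: float,
                           confusion: float, total_speech: float) -> float:
    """Standard DER: error time divided by total reference speech time."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Illustrative durations in seconds (hypothetical, not measured):
der = diarization_error_rate(missed=30.0, false_alarm=20.0,
                             confusion=85.0, total_speech=1000.0)
print(f"{der:.1%}")  # 13.5%
```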

Senko powers the Zanshin media player.

Usage

import senko

diarizer = senko.Diarizer(device='auto', warmup=True, quiet=False, model_dir=None)

wav_path = 'audio.wav' # 16kHz mono 16-bit wav
result = diarizer.diarize(wav_path, generate_colors=False)

senko.save_json(result["merged_segments"], 'audio_diarized.json')
senko.save_rttm(result["merged_segments"], wav_path, 'audio_diarized.rttm')

See examples/diarize.py for an interactive script, and also read DOCS.md.

Senko can also be used in notebook environments such as Google Colab and Modal Notebooks.
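The usage snippet above expects a 16 kHz, mono, 16-bit WAV file. As a small pre-flight check, the header can be inspected with Python's standard wave module before calling the diarizer. This is a sketch only: check_wav_format is a hypothetical helper, not part of Senko, and it only reads the header rather than converting anything.

```python
import wave


def check_wav_format(path: str) -> list[str]:
    """Return a list of format problems; an empty list means the file
    is 16 kHz, mono, 16-bit PCM as the usage example expects."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != 16000:
            problems.append(f"sample rate is {wf.getframerate()} Hz, expected 16000")
        if wf.getnchannels() != 1:
            problems.append(f"{wf.getnchannels()} channels, expected 1 (mono)")
        if wf.getsampwidth() != 2:
            problems.append(f"{wf.getsampwidth() * 8}-bit samples, expected 16-bit")
    return problems
```

If the returned list is non-empty, convert the file with your audio tool of choice before passing it to diarize().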

Model Directory

Senko resolves each required source model with the following precedence:

1. Explicit model_dir= argument or script --model-dir flag
2. SENKO_MODEL_DIR environment variable
3. Bundled default model directory

If a model is missing from the configured model directory, Senko falls back to the bundled copy for that specific asset.
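The precedence and per-asset fallback described above can be sketched as follows. This is an illustration of the described behavior, not Senko's actual implementation: resolve_model_path, its parameters, and the asset file names are all hypothetical.

```python
import os
from pathlib import Path


def resolve_model_path(asset_name, bundled_dir, model_dir=None, env=None):
    """Pick the file to load for one model asset.

    Precedence: explicit model_dir, then SENKO_MODEL_DIR, then the bundled
    directory. If the configured directory lacks this asset, fall back to
    the bundled copy for this asset only.
    """
    env = os.environ if env is None else env
    configured = model_dir or env.get("SENKO_MODEL_DIR")
    if configured:
        candidate = Path(configured) / asset_name
        if candidate.exists():
            return candidate
    return Path(bundled_dir) / asset_name
```

Because the fallback is decided per asset, a custom directory only needs to contain the models you actually want to override.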

On macOS, Senko also stores reusable compiled CAM++ CoreML artifacts under <model_dir>/cached. That cache directory is disposable and can be deleted safely if it becomes stale or you want to reclaim disk space.

export SENKO_MODEL_DIR=/path/to/models
python examples/diarize.py --model-dir /path/to/override

Installation

The following instructions are for Linux, macOS, and WSL. For Windows, see WINDOWS.md.

Prerequisites:

- gcc/clang: on Linux/WSL, install it separately; on macOS, install the Xcode Command Line Tools
- uv

Create a Python virtual environment and activate it

uv venv --python 3.13 .venv
source .venv/bin/activate

Then install Senko

# For NVIDIA GPUs with CUDA compute capability >= 7.5 (~GTX 16 series and newer)
uv pip install "senko[nvidia]"

# For NVIDIA GPUs with CUDA compute capability < 7.5 (~GTX 10 series and older)
uv pip install "senko[nvidia-old]"

# For NVIDIA GPUs on native Windows with CUDA compute capability >= 7.5
uv pip install "senko[nvidia-windows]"

# For NVIDIA GPUs on native Windows with CUDA compute capability < 7.5
uv pip install "senko[nvidia-old-windows]"

# For Mac (macOS 14+) and CPU execution on all other platforms
uv pip install senko

For NVIDIA, make sure the installed driver is CUDA 12 capable (should see CUDA Version: 12+ in nvidia-smi).
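The choice between the install targets listed above can be summarized as a small decision function. This is only a sketch of the rules as stated in this section; pick_install_target is a hypothetical helper, and the compute capability is passed in as a (major, minor) tuple that you look up for your GPU.

```python
def pick_install_target(has_nvidia_gpu, compute_capability=None, native_windows=False):
    """Return the argument to pass to `uv pip install`.

    compute_capability is a (major, minor) tuple such as (8, 9).
    """
    if not has_nvidia_gpu:
        # Mac (macOS 14+) and CPU execution on all other platforms
        return "senko"
    if compute_capability is None:
        raise ValueError("compute_capability is required for NVIDIA GPUs")
    extra = "nvidia" if tuple(compute_capability) >= (7, 5) else "nvidia-old"
    if native_windows:
        extra += "-windows"
    return f"senko[{extra}]"


print(pick_install_target(True, (8, 9)))                       # senko[nvidia]
print(pick_install_target(True, (6, 1), native_windows=True))  # senko[nvidia-old-windows]
```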

PyPI alpha wheels are smoke-tested on GitHub-hosted runners, which cover packaging and CPU/default initialization but not GPU end-to-end execution.

For setting up Senko for development, see DEV_SETUP.md.

Accuracy

See the evaluation directory.

Technical Details

Senko is a heavily optimized and slightly modified version of the speaker diarization pipeline found in the excellent 3D-Speaker project. It consists of four stages: VAD (voice activity detection), Fbank feature extraction, speaker embeddings generation, and clustering (spectral or UMAP+HDBSCAN).

The following modifications have been made:

- VAD model has been swapped from FSMN-VAD to either Senko's local pyannote backend (powered by the bundled segmentation-3.0 assets) or Silero VAD
- Fbank feature extraction is done fully upfront, on the GPU using kaldifeat if on NVIDIA, and on the CPU using all cores otherwise
- Batched inference of the CAM++ embedding model
- Clustering when on NVIDIA (with a GPU of CUDA compute capability 7.0+) can be done on the GPU through RAPIDS

On Linux/WSL, Senko's local segmentation-3.0 backend and CAM++ run using PyTorch, but on Mac, both models run through CoreML. The CAM++ CoreML conversion was done from scratch in this project (see tracing/coreml), but the converted segmentation-3.0 model and its interfacing code are taken from the excellent FluidAudio project by Fluid Inference. No pyannote.audio package install is required at runtime.

Showcase

Application	Description
reaper_speech_diarizer	Split a downmixed voice recording into separate tracks for each speaker in REAPER DAW
scribe	Produce speaker-attributed transcripts using parakeet-mlx and Senko
verbatim	High quality multilingual speech to text with diarization

Create a PR or message on Discord if you'd like your application that uses Senko added here too.

FAQ

Is there any way to visualize the output diarization data?

Absolutely. The Zanshin media player is made entirely for this purpose. Zanshin is powered by Senko, so the easiest way to visualize the diarization data is to use it. A packaged build is currently available for Mac (Apple Silicon). Zanshin also works on Windows and Linux, but packaged builds for those platforms are not available yet (coming soon); for now you'll need to clone the repo and launch it from the terminal. See here for instructions.

You can also manually load the diarization data that Senko generates into Zanshin. First, turn off diarization in Zanshin by going into Settings and turning off Identify Speakers. Then, after you add a media item, click on it and press the H key on the player page. In the textbox that appears, paste the contents of the output JSON file that examples/diarize.py generates.

What languages does Senko support?

Generally, the pipeline should work for any language, as it relies on acoustic patterns rather than on words or speech patterns. That said, the embeddings model used in this pipeline was trained on a mix of English and Mandarin Chinese, so the pipeline will likely work best on those two languages.

Are overlapping speaker segments detected correctly?

The current output does not contain overlapping speaker segments; that is, at most one speaker is reported as speaking at any given time. Even so, the pipeline performs well at determining who the dominant speaker is at any given time in chaotic audio where speakers talk over each other (example: casual podcasts). Detecting overlapping speaker segments is a planned feature, since the bundled segmentation-3.0 model (which is currently only used for VAD) supports it.
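If you post-process the output, the "one speaker at a time" property can be sanity-checked with a few lines of code. The sketch below assumes each segment has been reduced to a (start, end, speaker) tuple in seconds; that representation is an assumption made for illustration, so consult DOCS.md for the actual structure of merged_segments.

```python
def find_overlaps(segments):
    """Return the segments that start before an earlier segment has ended.

    segments: iterable of (start, end, speaker) tuples, times in seconds
    (an assumed representation, not Senko's documented output format).
    """
    overlapping = []
    latest_end = float("-inf")
    for seg in sorted(segments, key=lambda s: s[0]):
        start, end, _speaker = seg
        if start < latest_end:
            overlapping.append(seg)
        latest_end = max(latest_end, end)
    return overlapping


segments = [(0.0, 1.5, "A"), (1.5, 4.0, "B"), (4.0, 6.2, "A")]
print(find_overlaps(segments))  # []
```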

How fast is the pipeline on CPU (device=cpu)?

On a Ryzen 9 9950X, it takes 42 seconds to process 1 hour of audio.

Does the entire pipeline run fully on the GPU, if available?

On Linux/WSL with device=cuda, all parts of the pipeline run on the GPU, so long as the NVIDIA card has CUDA compute capability ≥ 7.0 (~GTX 16 series and newer); otherwise clustering falls back to the CPU.

On native Windows with device=cuda, everything except fbank extraction and clustering runs on the GPU.

On Mac, VAD and embeddings run on the ANE and CPU through CoreML, and fbank extraction and clustering run on the CPU.
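The three cases above can be condensed into a lookup. This is only a restatement of this answer in code form, not something Senko exposes: stage_placement and its labels are hypothetical, and the "cpu" entries for native Windows are inferred from "everything except".

```python
def stage_placement(platform, compute_capability=None):
    """Where each pipeline stage runs, per the description above.

    platform: "linux" (Linux/WSL with device=cuda), "windows" (native
    Windows with device=cuda), or "mac".
    compute_capability: (major, minor) tuple, used for Linux/WSL only.
    """
    if platform == "mac":
        # "coreml" here means ANE and CPU through CoreML
        return {"vad": "coreml", "fbank": "cpu",
                "embeddings": "coreml", "clustering": "cpu"}
    if platform == "windows":
        return {"vad": "gpu", "fbank": "cpu",
                "embeddings": "gpu", "clustering": "cpu"}
    if platform == "linux":
        gpu_clustering = (compute_capability is not None
                          and tuple(compute_capability) >= (7, 0))
        return {"vad": "gpu", "fbank": "gpu", "embeddings": "gpu",
                "clustering": "gpu" if gpu_clustering else "cpu"}
    raise ValueError(f"unknown platform: {platform!r}")
```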

Known limitations?

- The pipeline works best when the audio recording quality is good. Ideal setting: professional podcast studio. Heavy background noise, background music, or a generally low fidelity recording will degrade the diarization performance significantly. Note that it's also possible to have generally good recording quality but still low fidelity recorded voice quality; an example is this.

- It is rare but possible that voices that sound very similar get clustered as one voice. This can happen if the voices are genuinely extremely similar, or, more commonly, if the audio recording fidelity is low.

- The same voice recorded with >1 microphones or in >1 recording settings within the same audio file will often get detected as >1 speakers.

- If a single person makes >1 voices in the same recording (as in change the auditory texture/tone of their voice; like if they do an impression of someone else, for example), their speech will almost certainly get detected as >1 speakers.

Troubleshooting

If you run into Numba related errors after upgrading/downgrading the numba package or other packages that use it (umap-learn, pynndescent, etc.), they might be caused by failed Numba cache invalidation. In such a case, clear the cache manually like so:

rm -rf ~/.cache/senko

Such errors may also appear if you have Zanshin installed, with different package versions installed in its Python environment compared to the development venv that you're using for Senko.

If you are using a custom model directory and want to clear reusable CoreML artifacts, delete the <model_dir>/cached directory. Senko will recreate it automatically on the next run if needed.
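If you prefer to do this from Python rather than by hand, a minimal sketch using only the standard library looks like this. clear_coreml_cache is a hypothetical helper, not a Senko API; it removes only the cached subdirectory and leaves the models themselves in place.

```python
import shutil
from pathlib import Path


def clear_coreml_cache(model_dir) -> bool:
    """Delete <model_dir>/cached if it exists. Returns True if removed."""
    cache_dir = Path(model_dir) / "cached"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False
```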

Community & Support

Join the Discord server to ask questions, suggest features, talk about Senko and Zanshin development, etc.

Future Improvements & Directions

- Overlapping speaker segments support
- Improve speaker colors generation algorithm
- Support for Intel and AMD GPUs
- Experiment with torch.compile()
- Experiment with Modular MAX engine (faster CPU inference speed?)
- Background noise removal (DeepFilterNet), speech enhancement
- Live progress reporting
- VBx-based clustering (DiariZen)

This version

0.1.0

Apr 21, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

senko-0.1.0.tar.gz (81.0 MB)

Uploaded Apr 21, 2026 Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

senko-0.1.0-cp311-cp311-win_amd64.whl (81.0 MB)

Uploaded Apr 21, 2026; CPython 3.11, Windows x86-64

senko-0.1.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (81.0 MB)

Uploaded Apr 21, 2026; CPython 3.11, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64

senko-0.1.0-cp311-cp311-macosx_14_0_arm64.whl (81.0 MB)

Uploaded Apr 21, 2026; CPython 3.11, macOS 14.0+ ARM64

File details

Details for the file senko-0.1.0.tar.gz.

File metadata

Download URL: senko-0.1.0.tar.gz
Upload date: Apr 21, 2026
Size: 81.0 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for senko-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`0cdbc62bbd0d04aa8f60788f138257d8a49d4afb6f470c94df86478c96a7d319`
MD5	`45c3fd42ba8de677fde01709a1581ede`
BLAKE2b-256	`bcba995bdaf55c8669d60bd41dc3b5491b84eb308d1efd20897f3c839764251f`

See more details on using hashes here.
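One way to check a downloaded file against the SHA256 digest listed above is with Python's hashlib. This is a generic sketch; the function names are illustrative, and the expected digest should be copied from the table for the exact file you downloaded.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """True if the file's SHA256 equals the published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```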

Provenance

The following attestation bundles were made for senko-0.1.0.tar.gz:

Publisher: publish.yml on gaspardpetit/senko

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: senko-0.1.0.tar.gz
- Subject digest: 0cdbc62bbd0d04aa8f60788f138257d8a49d4afb6f470c94df86478c96a7d319
- Sigstore transparency entry: 1345663784
- Sigstore integration time: Apr 21, 2026
Source repository:
- Permalink: gaspardpetit/senko@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/gaspardpetit
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Trigger Event: push

File details

Details for the file senko-0.1.0-cp311-cp311-win_amd64.whl.

File metadata

Download URL: senko-0.1.0-cp311-cp311-win_amd64.whl
Upload date: Apr 21, 2026
Size: 81.0 MB
Tags: CPython 3.11, Windows x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for senko-0.1.0-cp311-cp311-win_amd64.whl
Algorithm	Hash digest
SHA256	`1e998d3bd2d879acecb38e9d957a738959c9e5c57dbd9e82f2f3149f5497d0b5`
MD5	`b9b800a4d55788809632de619bc0a3f4`
BLAKE2b-256	`4b01fdc3513d1847f306dc64cc1289e4932f3400dca93be573b6a036e40f0821`

See more details on using hashes here.

Provenance

The following attestation bundles were made for senko-0.1.0-cp311-cp311-win_amd64.whl:

Publisher: publish.yml on gaspardpetit/senko

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: senko-0.1.0-cp311-cp311-win_amd64.whl
- Subject digest: 1e998d3bd2d879acecb38e9d957a738959c9e5c57dbd9e82f2f3149f5497d0b5
- Sigstore transparency entry: 1345663941
- Sigstore integration time: Apr 21, 2026
Source repository:
- Permalink: gaspardpetit/senko@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/gaspardpetit
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Trigger Event: push

File details

Details for the file senko-0.1.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File metadata

Download URL: senko-0.1.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Upload date: Apr 21, 2026
Size: 81.0 MB
Tags: CPython 3.11, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for senko-0.1.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`b95a40836de872103d3d68db733d728bac78920a8c3b891586d6f82b9f924c70`
MD5	`269acbd83cc34acece76a522fadbcfbc`
BLAKE2b-256	`cf075347f8aef313340f8cecad157f5997fc4755fc3691436882ac806fcce83d`

See more details on using hashes here.

Provenance

The following attestation bundles were made for senko-0.1.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl:

Publisher: publish.yml on gaspardpetit/senko

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: senko-0.1.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
- Subject digest: b95a40836de872103d3d68db733d728bac78920a8c3b891586d6f82b9f924c70
- Sigstore transparency entry: 1345663866
- Sigstore integration time: Apr 21, 2026
Source repository:
- Permalink: gaspardpetit/senko@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/gaspardpetit
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Trigger Event: push

File details

Details for the file senko-0.1.0-cp311-cp311-macosx_14_0_arm64.whl.

File metadata

Download URL: senko-0.1.0-cp311-cp311-macosx_14_0_arm64.whl
Upload date: Apr 21, 2026
Size: 81.0 MB
Tags: CPython 3.11, macOS 14.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for senko-0.1.0-cp311-cp311-macosx_14_0_arm64.whl
Algorithm	Hash digest
SHA256	`5155fd9abc6603db7cff5e6107dc96cb2b40d9ee79a76f70f8f144c615e170bc`
MD5	`473fc598edf80dcf0c1c98fdb43f36a7`
BLAKE2b-256	`61253f70ad50c1447e592f0f6c2c9a4c51e4dd577622fba5f81643c491be0f45`

See more details on using hashes here.

Provenance

The following attestation bundles were made for senko-0.1.0-cp311-cp311-macosx_14_0_arm64.whl:

Publisher: publish.yml on gaspardpetit/senko

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: senko-0.1.0-cp311-cp311-macosx_14_0_arm64.whl
- Subject digest: 5155fd9abc6603db7cff5e6107dc96cb2b40d9ee79a76f70f8f144c615e170bc
- Sigstore transparency entry: 1345664023
- Sigstore integration time: Apr 21, 2026
Source repository:
- Permalink: gaspardpetit/senko@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/gaspardpetit
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6cbe29b1e85db7be69e98de89a905970ebbfb395
- Trigger Event: push
