A simple-to-use library for speaker diarization

PyDiar

This repo contains simple-to-use speaker diarization models that are either pretrained or require no training.

Supported Models

  • Binary Key Speaker Modeling

    Based on pyBK by Jose Patino, which implements the diarization system from "The EURECOM submission to the first DIHARD Challenge" by Jose Patino, Héctor Delgado, and Nicholas Evans.

If you have any other models you would like to see added, please open an issue.

Usage

This library seeks to provide a very basic interface. To use the Binary Key model on a file, do something like this:

import numpy as np
from pydiar.models import BinaryKeyDiarizationModel, Segment
from pydiar.util.misc import optimize_segments
from pydub import AudioSegment

INPUT_FILE = "test.wav"

# Load the audio and convert it to mono at the sample rate used for diarization
sample_rate = 32000
audio = AudioSegment.from_wav(INPUT_FILE)
audio = audio.set_frame_rate(sample_rate)
audio = audio.set_channels(1)

# Diarize the raw samples, then post-process the resulting segments
diarization_model = BinaryKeyDiarizationModel()
segments = diarization_model.diarize(
    sample_rate, np.array(audio.get_array_of_samples())
)
optimized_segments = optimize_segments(segments)

optimized_segments now contains a list of segments, each with its start, length, and speaker ID.
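
As a rough illustration of consuming that result, the sketch below builds one readable line per segment. It uses a stand-in Segment dataclass rather than importing pydiar. The attribute names start, length and speaker_id, and the assumption that times are in seconds, are guesses based on the description of the segments; check pydiar.models.Segment for the actual fields.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for pydiar's Segment; field names are assumed."""
    start: float       # assumed: start time in seconds
    length: float      # assumed: duration in seconds
    speaker_id: int    # assumed: numeric speaker label


def describe(segments):
    """Return one human-readable line per segment."""
    lines = []
    for seg in segments:
        end = seg.start + seg.length
        lines.append(f"{seg.start:.2f}s - {end:.2f}s: speaker {seg.speaker_id}")
    return lines


# Made-up segments standing in for the output of optimize_segments
example = [Segment(0.0, 2.5, 0), Segment(2.5, 1.25, 1)]
for line in describe(example):
    print(line)
```

With real output, the same loop would run over optimized_segments instead of the made-up example list.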

Example

A simple script that reads an audio file, diarizes it, and transcribes it into the WebVTT format can be found in examples/generate_webvtt.py. To use it, download a Vosk model from https://alphacephei.com/vosk/models and then run the script using

poetry install
poetry run python -m examples.generate_webvtt -i PATH/TO/INPUT.wav -m PATH/TO/VOSK_MODEL
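
The diarization half of such a script can be sketched without Vosk. The following is not the code in examples/generate_webvtt.py; it is a minimal illustration of turning speaker segments into WebVTT cues. The Segment stand-in, its field names, the assumption that times are in seconds, and the helper names vtt_timestamp and segments_to_webvtt are all hypothetical. The cue text is a placeholder where a transcript would go.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for pydiar's Segment; field names are assumed."""
    start: float
    length: float
    speaker_id: int


def vtt_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_webvtt(segments, text="..."):
    """Build a WebVTT document with one cue per segment.

    Each cue labels its speaker with a WebVTT voice span (<v ...>).
    """
    blocks = ["WEBVTT"]
    for seg in segments:
        start = vtt_timestamp(seg.start)
        end = vtt_timestamp(seg.start + seg.length)
        blocks.append(f"{start} --> {end}\n<v Speaker {seg.speaker_id}>{text}")
    return "\n\n".join(blocks) + "\n"


print(segments_to_webvtt([Segment(0.0, 2.5, 0), Segment(2.5, 61.0, 1)]))
```

Timestamps are computed in whole milliseconds to avoid floating-point drift when splitting into hours, minutes, and seconds.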

Download files

Source Distribution

PyDiar-0.0.7.tar.gz (28.1 kB)

Built Distribution

PyDiar-0.0.7-py3-none-any.whl (29.1 kB)

File details

Details for the file PyDiar-0.0.7.tar.gz.

File metadata

  • Download URL: PyDiar-0.0.7.tar.gz
  • Size: 28.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.10 CPython/3.9.6 Darwin/21.3.0

File hashes

Hashes for PyDiar-0.0.7.tar.gz
Algorithm Hash digest
SHA256 6f5ed827b655a774e7b67648f565d663523b710c4a51b5df69ce0dadadcc6ef2
MD5 ad764de244355603bff9f84654f70869
BLAKE2b-256 1696e82f6c7d79f7d1b272d28b209bedd9b659f726c95ee51b573640ddf5a146

File details

Details for the file PyDiar-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: PyDiar-0.0.7-py3-none-any.whl
  • Size: 29.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.10 CPython/3.9.6 Darwin/21.3.0

File hashes

Hashes for PyDiar-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 e592afd4f8c753b6763f8049030838c72b337f44282549846a126dadee951d82
MD5 b3651e4407ac81ced4c778745c08d55c
BLAKE2b-256 1ff6f7f38c693e6ed735aed2f194e0642c100b3ad63f7448ee31ff5dc68bd6ac
