A simple Python package for fast DER computation

SPYDER

A simple Python package for fast computation of the diarization error rate (DER).

Installation

pip install spy-der

To install the version with the latest features directly from GitHub:

pip install git+https://github.com/desh2608/spyder.git@main

For development, clone this repository and run:

pip install --editable .

Usage

Compute DER for a single pair of reference and hypothesis

import spyder

# reference (ground truth)
ref = [("A", 0.0, 2.0), # (speaker, start, end)
       ("B", 1.5, 3.5),
       ("A", 4.0, 5.1)]

# hypothesis (diarization result from your algorithm)
hyp = [("1", 0.0, 0.8),
       ("2", 0.6, 2.3),
       ("3", 2.1, 3.9),
       ("1", 3.8, 5.2)]

# compute DER on full recording
print(spyder.DER(ref, hyp))
# DERMetrics(duration=5.10,miss=9.80%,falarm=21.57%,conf=25.49%,der=56.86%)

# compute DER on single-speaker regions only
print(spyder.DER(ref, hyp, regions="single"))
# DERMetrics(duration=4.10,miss=0.00%,falarm=26.83%,conf=19.51%,der=46.34%)

# compute DER using UEM segments
uem = [(0.5, 5.0)]
print(spyder.DER(ref, hyp, uem=uem))
# DERMetrics(duration=4.50,miss=11.11%,falarm=22.22%,conf=26.67%,der=60.00%)

# compute DER using collar
print(spyder.DER(ref, hyp, collar=0.2))
# DERMetrics(duration=3.10,miss=3.23%,falarm=12.90%,conf=19.35%,der=35.48%)

# get speaker mapping between reference and hypothesis
metrics = spyder.DER(ref, hyp)
print(f"Reference speaker map: {metrics.ref_map}")
print(f"Hypothesis speaker map: {metrics.hyp_map}")
# Reference speaker map: {'A': '0', 'B': '1'}
# Hypothesis speaker map: {'1': '0', '2': '2', '3': '1'}
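
To make the numbers above easier to interpret, here is a minimal pure-Python sketch of the standard DER definition: missed speech, false alarm, and speaker confusion time, each divided by the total reference speaker time. It is not spyder's C++ implementation. The function name `der_sketch` is hypothetical, and the brute-force search over speaker mappings is an assumption that is only practical for a handful of speakers (spyder itself uses the Hungarian algorithm). It does not handle empty input.

```python
from itertools import permutations


def der_sketch(ref, hyp):
    """Brute-force DER for tiny inputs; turns are (speaker, start, end)."""
    # Cut the timeline at every turn boundary, so the set of active
    # speakers is constant inside each region.
    bounds = sorted({t for _, s, e in ref + hyp for t in (s, e)})
    regions = []
    for lo, hi in zip(bounds, bounds[1:]):
        mid = (lo + hi) / 2
        r = {spk for spk, s, e in ref if s <= mid < e}
        h = {spk for spk, s, e in hyp if s <= mid < e}
        regions.append((hi - lo, r, h))

    # Denominator: total reference speaker time (overlap counts per speaker).
    total = sum(d * len(r) for d, r, _ in regions)
    miss = sum(d * max(0, len(r) - len(h)) for d, r, h in regions)
    falarm = sum(d * max(0, len(h) - len(r)) for d, r, h in regions)

    # Try every one-to-one speaker mapping and keep the one that yields
    # the most correctly attributed time.
    ref_spk = sorted({spk for spk, _, _ in ref})
    hyp_spk = sorted({spk for spk, _, _ in hyp})
    n = max(len(ref_spk), len(hyp_spk))
    ref_pad = ref_spk + [None] * (n - len(ref_spk))
    hyp_pad = hyp_spk + [None] * (n - len(hyp_spk))
    best = 0.0
    for perm in permutations(hyp_pad):
        mapping = {h: r for r, h in zip(ref_pad, perm)
                   if r is not None and h is not None}
        correct = sum(d * sum(mapping.get(x) in r for x in h)
                      for d, r, h in regions)
        best = max(best, correct)

    # Speaker time that was detected but attributed to the wrong speaker.
    conf = sum(d * min(len(r), len(h)) for d, r, h in regions) - best
    return {
        "duration": total,
        "miss": miss / total,
        "falarm": falarm / total,
        "conf": conf / total,
        "der": (miss + falarm + conf) / total,
    }


ref = [("A", 0.0, 2.0), ("B", 1.5, 3.5), ("A", 4.0, 5.1)]
hyp = [("1", 0.0, 0.8), ("2", 0.6, 2.3), ("3", 2.1, 3.9), ("1", 3.8, 5.2)]
result = der_sketch(ref, hyp)
print(f"miss={result['miss']:.2%} falarm={result['falarm']:.2%} "
      f"conf={result['conf']:.2%} der={result['der']:.2%}")
# miss=9.80% falarm=21.57% conf=25.49% der=56.86%
```

For this example the sketch agrees with the full-recording output shown above: 0.5 s missed, 1.1 s false alarm, and 1.3 s confused out of 5.1 s of reference speech.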

Compute DER for multiple pairs of reference and hypothesis

import spyder

# for multiple pairs, reference and hypothesis should be lists or dicts
# if lists, ref and hyp must have same length

# reference (ground truth)
ref = {"uttr0":[("A", 0.0, 2.0), # (speaker, start, end)
                ("B", 1.5, 3.5),
                ("A", 4.0, 5.1)],
       "uttr2":[("A", 0.0, 4.3), # (speaker, start, end)
                ("C", 6.0, 8.1),
                ("B", 2.0, 8.5)]}

# hypothesis (diarization result from your algorithm)
hyp = {"uttr0":[("1", 0.0, 0.8),
                ("2", 0.6, 2.3),
                ("3", 2.1, 3.9),
                ("1", 3.8, 5.2)],
       "uttr2":[("1", 0.0, 4.5),
                ("2", 2.5, 8.7)]}

metrics = spyder.DER(ref, hyp)
print(metrics)
# {'Overall': DERMetrics(duration=18.00,miss=17.22%,falarm=8.33%,conf=7.22%,der=32.78%)}

metrics2 = spyder.DER(ref, hyp, per_file=True, verbose=True)  # verbose=True prints per-file results

Output:

Evaluated 2 recordings on `all` regions. Results:
╒═════════════╤════════════════╤═════════╤════════════╤═════════╤════════╕
│ Recording   │   Duration (s) │   Miss. │   F.Alarm. │   Conf. │    DER │
╞═════════════╪════════════════╪═════════╪════════════╪═════════╪════════╡
│ uttr0       │           5.10 │   9.80% │     21.57% │  25.49% │ 56.86% │
├─────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ uttr2       │          12.90 │  20.16% │      3.10% │   0.00% │ 23.26% │
├─────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ Overall     │          18.00 │  17.22% │      8.33% │   7.22% │ 32.78% │
╘═════════════╧════════════════╧═════════╧════════════╧═════════╧════════╛
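
The Overall row in this table is consistent with pooling the errors over the total scored duration, rather than averaging the per-file percentages. The short check below recomputes it from the per-file rows. This is inferred from the printed numbers only, not from spyder's source, and the variable names are illustrative.

```python
# (duration in seconds, miss %, false alarm %, confusion %), copied from the table
per_file = {
    "uttr0": (5.10, 9.80, 21.57, 25.49),
    "uttr2": (12.90, 20.16, 3.10, 0.00),
}

total = sum(row[0] for row in per_file.values())

# Weight each per-file percentage by that file's duration.
overall = [
    sum(row[0] * row[i] for row in per_file.values()) / total
    for i in (1, 2, 3)
]
overall_der = sum(overall)

print(f"duration={total:.2f} miss={overall[0]:.2f}% falarm={overall[1]:.2f}% "
      f"conf={overall[2]:.2f}% der={overall_der:.2f}%")
# duration=18.00 miss=17.22% falarm=8.33% conf=7.22% der=32.78%
```

An unweighted mean of the two per-file DERs would instead give 40.06%, so longer recordings contribute more to the overall figure.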

You can also provide UEM and collar parameters, as in the single-pair case.
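
As a rough illustration of what a UEM does, the sketch below restricts a list of turns to the scored intervals. The helper `clip_to_uem` is hypothetical and is not part of spyder; it only assumes that a UEM is a list of (start, end) pairs, as in the single-pair example.

```python
def clip_to_uem(turns, uem):
    """Keep only the parts of each (speaker, start, end) turn inside the UEM."""
    clipped = []
    for spk, start, end in turns:
        for u_start, u_end in uem:
            s, e = max(start, u_start), min(end, u_end)
            if s < e:  # drop turns that fall entirely outside this interval
                clipped.append((spk, s, e))
    return clipped


ref = [("A", 0.0, 2.0), ("B", 1.5, 3.5), ("A", 4.0, 5.1)]
uem = [(0.5, 5.0)]

ref_clipped = clip_to_uem(ref, uem)
scored = sum(e - s for _, s, e in ref_clipped)
print(ref_clipped)       # [('A', 0.5, 2.0), ('B', 1.5, 3.5), ('A', 4.0, 5.0)]
print(f"{scored:.2f}")   # 4.50
```

The 4.50 s of remaining reference speech matches the duration that spyder reports for the single-pair example with `uem = [(0.5, 5.0)]`.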

Compute per-file and overall DERs between reference and hypothesis RTTMs using the command-line tool

spyder can also be invoked from the command line to compute the per-file and overall DERs between reference and hypothesis RTTMs.

Usage: spyder [OPTIONS] REF_RTTM HYP_RTTM

Options:
  -u, --uem PATH                  UEM file (format: <recording_id> <channel>
                                  <start> <end>)

  -p, --per-file                  If this flag is set, print per file results.
                                  [default: False]

  -s, --skip-missing              Skip recordings which are missing in
                                  hypothesis (i.e., not counted in missed
                                  speech).  [default: False]

  -r, --regions [all|single|overlap|nonoverlap]
                                  Only evaluate on the selected region type.
                                  Default is all.  - all: all regions.  -
                                  single: only single-speaker regions (ignore
                                  silence and multiple speaker).  - overlap:
                                  only regions with multiple speakers in the
                                  reference.  - nonoverlap: only regions
                                  without multiple speakers in the reference.
                                  [default: all]

  -c, --collar FLOAT RANGE        Collar size.  [default: 0.0]
  -m, --print-speaker-map         Print speaker mapping for reference and
                                  hypothesis speakers.  [default: False]

  --help                          Show this message and exit.

Examples:

> spyder ref_rttm hyp_rttm
Evaluated 16 recordings on `all` regions. Results:
╒═════════════╤════════════════╤═════════╤════════════╤═════════╤════════╕
│ Recording   │   Duration (s) │   Miss. │   F.Alarm. │   Conf. │    DER │
╞═════════════╪════════════════╪═════════╪════════════╪═════════╪════════╡
│ Overall     │       33952.95 │  11.48% │      2.27% │   9.81% │ 23.56% │
╘═════════════╧════════════════╧═════════╧════════════╧═════════╧════════╛

> spyder ref_rttm hyp_rttm -r single -p -c 0.25
Evaluated 16 recordings on `single` regions. Results:
╒═════════════════════╤════════════════╤═════════╤════════════╤═════════╤════════╕
│ Recording           │   Duration (s) │   Miss. │   F.Alarm. │   Conf. │    DER │
╞═════════════════════╪════════════════╪═════════╪════════════╪═════════╪════════╡
│ EN2002a.Mix-Headset │        1032.05 │   0.00% │      2.98% │   4.97% │  7.94% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ EN2002b.Mix-Headset │         853.56 │   0.00% │      3.40% │   5.39% │  8.80% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ EN2002c.Mix-Headset │        1641.68 │   0.00% │      1.42% │   1.05% │  2.47% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ EN2002d.Mix-Headset │        1006.27 │   0.00% │      3.12% │   7.14% │ 10.26% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ ES2004a.Mix-Headset │         539.48 │   0.00% │      1.62% │   5.12% │  6.74% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ ES2004b.Mix-Headset │        1582.05 │   0.00% │      0.82% │   1.39% │  2.21% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ ES2004c.Mix-Headset │        1526.84 │   0.00% │      0.45% │   1.27% │  1.72% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ ES2004d.Mix-Headset │        1172.72 │   0.00% │      1.77% │   9.60% │ 11.37% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ IS1009a.Mix-Headset │         425.51 │   0.00% │      3.94% │   4.60% │  8.54% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ IS1009b.Mix-Headset │        1412.03 │   0.00% │      1.23% │   0.85% │  2.08% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ IS1009c.Mix-Headset │        1283.21 │   0.00% │      2.74% │   1.00% │  3.75% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ IS1009d.Mix-Headset │        1164.49 │   0.00% │      2.27% │   3.37% │  5.64% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ TS3003a.Mix-Headset │         804.27 │   0.00% │      0.00% │  11.28% │ 11.28% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ TS3003b.Mix-Headset │        1509.49 │   0.00% │      0.36% │   0.75% │  1.11% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ TS3003c.Mix-Headset │        1566.84 │   0.00% │      1.76% │   1.74% │  3.50% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ TS3003d.Mix-Headset │        1357.45 │   0.00% │      2.42% │   2.93% │  5.35% │
├─────────────────────┼────────────────┼─────────┼────────────┼─────────┼────────┤
│ Overall             │       18877.94 │   0.00% │      1.72% │   3.29% │  5.01% │
╘═════════════════════╧════════════════╧═════════╧════════════╧═════════╧════════╛
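
The command-line tool reads RTTM files directly. If you already have RTTM content and want to use the Python API instead, a minimal sketch for turning RTTM `SPEAKER` lines into the dictionary of (speaker, start, end) turns used in the multi-pair example might look like the following. The helper `parse_rttm_lines` and the sample lines are hypothetical. The sketch relies only on the standard RTTM field order (type, recording ID, channel, onset, duration, two placeholder fields, speaker name, ...) and skips any other line type.

```python
from collections import defaultdict


def parse_rttm_lines(lines):
    """Group RTTM SPEAKER lines into {recording_id: [(speaker, start, end), ...]}."""
    turns = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "SPEAKER":
            continue  # ignore blank lines and non-SPEAKER records
        rec_id = fields[1]
        start = float(fields[3])
        dur = float(fields[4])
        speaker = fields[7]
        turns[rec_id].append((speaker, start, start + dur))
    return dict(turns)


sample = [
    "SPEAKER uttr0 1 0.0 2.0 <NA> <NA> A <NA> <NA>",
    "SPEAKER uttr0 1 1.5 2.0 <NA> <NA> B <NA> <NA>",
    "SPEAKER uttr0 1 4.0 1.1 <NA> <NA> A <NA> <NA>",
]
ref = parse_rttm_lines(sample)
print(sorted(ref))          # ['uttr0']
print(len(ref["uttr0"]))    # 3
```

To read from a file, pass the open file object (an iterable of lines) to the helper; the resulting dictionaries for reference and hypothesis can then be given to `spyder.DER` as in the multi-pair example.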

Why spyder?

Fast: Implemented in pure C++, and faster than the alternatives (md-eval.pl, dscore, pyannote.metrics). See this benchmark for comparisons with other tools.
Stand-alone: It has no dependency on any other library. We have our own implementation of the Hungarian algorithm, for example, instead of using scipy.
Easy-to-use: No need to write the reference and hypothesis turns to files and read md-eval output with complex regex patterns.
Overlap: Spyder supports overlapping speech in reference and hypothesis. In addition, you can compute metrics on just the single-speaker or overlap regions by passing the keyword argument regions="single" or regions="overlap", respectively.

Contributing

Contributions for core improvements or new recipes are welcome. Please run the following before creating a pull request.

pre-commit install
pre-commit run # Running linter checks

Bugs/issues

Please raise an issue in the issue tracker.

Release history

0.4.1 (Jun 29, 2023)
0.4.0 (Mar 17, 2023)
0.3.0 (Mar 8, 2022)
0.2.1 (Feb 1, 2022)
0.2.0 (Jan 5, 2022)
0.1.0 (Mar 5, 2021)
