Package implementing a revamped version of the libri-light ABX evaluation.

libri-light-abx2

The ABX phonetic evaluation metric for unsupervised representation learning as used by the ZeroSpeech challenge, now with context-type options (on-triphone, within-context, any-context). This module is a reworking of https://github.com/zerospeech/libri-light-abx, which in turn is a wrapper around https://github.com/facebookresearch/libri-light/tree/main/eval

Installation

You can install this module directly with pip using the following command:

pip install zerospeech-libriabx2

Or you can install from source by cloning this repository and running:

pip install .

As a final alternative, you can install it into a conda environment by running:

conda install -c conda-forge -c pytorch -c coml zerospeech-libriabx2 pytorch::pytorch

Usage

From command line

usage: zrc-abx2 [-h] [--path_checkpoint PATH_CHECKPOINT]
                [--file_extension {.pt,.npy,.wav,.flac,.mp3,.npz,.txt}]
                [--feature_size FEATURE_SIZE] [--cuda]
                [--speaker_mode {all,within,across}]
                [--context_mode {all,within,any}]
                [--distance_mode {euclidian,euclidean,cosine,kl,kl_symmetric}]
                [--max_size_group MAX_SIZE_GROUP]
                [--max_x_across MAX_X_ACROSS] [--out OUT] [--seed SEED]
                [--pooling {none,mean,hamming}] [--seq_norm]
                [--max_size_seq MAX_SIZE_SEQ] [--strict]
                path_data path_item_file

ABX metric

positional arguments:
  path_data             Path to directory containing the submission data
  path_item_file        Path to the .item file containing the timestamps and
                        transcriptions

optional arguments:
  -h, --help            show this help message and exit
  --path_checkpoint PATH_CHECKPOINT
                        Path to a CPC checkpoint. If set, apply the model to
                        the input data to compute the features
  --file_extension {.pt,.npy,.wav,.flac,.mp3,.npz,.txt}
  --feature_size FEATURE_SIZE
                        Size (in s) of one feature
  --cuda                Use the GPU to compute distances
  --speaker_mode {all,within,across}
                        Choose the speaker mode of the ABX score to compute
  --context_mode {all,within,any}
                        Choose the context mode of the ABX score to compute
  --distance_mode {euclidian,euclidean,cosine,kl,kl_symmetric}
                        Choose the kind of distance to use to compute the ABX
                        score.
  --max_size_group MAX_SIZE_GROUP
                        Max size of a group while computing the ABX score. A
                        small value will make the code faster but less
                        precise.
  --max_x_across MAX_X_ACROSS
                        When computing the ABX across score, maximum number of
                        speaker X to sample per couple A,B. A small value will
                        make the code faster but less precise.
  --out OUT             Path where the results should be saved
  --seed SEED           Seed to use in random sampling.
  --pooling {none,mean,hamming}
                        Type of pooling over frame representations of items.
  --seq_norm            Used for CPC features only. If activated, normalize
                        each batch of feature across the time channel before
                        computing ABX.
  --max_size_seq MAX_SIZE_SEQ
                        Used for CPC features only. Maximal number of frames
                        to consider when computing a batch of features.
  --strict              Used for CPC features only. If activated, each batch
                        of feature will contain exactly max_size_seq frames.

Python API

You can also call the ABX evaluation from Python code, as in the following example:

import zrc_abx2

# Placeholder: fill in any further EvalArgs options you need.
other_options = {}

args = zrc_abx2.EvalArgs(
    path_data="/location/to/representations/",
    path_item_file="/location/to/file.item",
    **other_options,
)

result = zrc_abx2.EvalABX().eval_abx(args)

Information on evaluation conditions

A new variable in this ABX version is context. In the within-context condition, a, b, and x have the same surrounding context (i.e. the same preceding and following phoneme). In the any-context condition, the surrounding context is ignored, so it typically differs between a, b, and x.
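The two context conditions can be pictured as a filter on candidate triplets. The following is a minimal sketch, not the package's implementation: the `Item` type and the `context_ok` helper are hypothetical names introduced only for illustration, and the sketch assumes each item carries its preceding and following phoneme.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """Hypothetical stand-in for one entry of an item file."""
    phoneme: str  # the phoneme being compared
    prev: str     # preceding phoneme
    next: str     # following phoneme


def context_ok(a: Item, b: Item, x: Item, context_mode: str) -> bool:
    """Return True if the triplet satisfies the given context condition."""
    if context_mode == "any":
        # any-context: the surrounding phonemes are ignored.
        return True
    if context_mode == "within":
        # within-context: a, b, and x share the same preceding and
        # following phoneme.
        return (a.prev, a.next) == (b.prev, b.next) == (x.prev, x.next)
    raise ValueError(f"unknown context mode: {context_mode!r}")


a = Item("b", "a", "i")
b = Item("p", "a", "i")
x_same_context = Item("b", "a", "i")
x_other_context = Item("b", "o", "u")
```

Here the triplet (a, b, x_same_context) is accepted under both conditions, whereas (a, b, x_other_context) is accepted only under any-context.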

For the within-context and any-context comparisons, use an item file that extracts phonemes (rather than XYZ triphones). The on-triphone condition is still available: use an item file that extracts triphones (just like in the previous ABX evaluation) and run it within-context, which was the default behavior of the previous ABX evaluation. any-context is not used for the on-triphone version because it would include excessive noise in the representation.

As in the previous version, it is also possible to run within-speaker (a, b, x are all from the same speaker) and across-speaker (a and b are from the same speaker, x is from another) evaluations. So there are four phoneme-based evaluation combinations in total: within_s-within_c, within_s-any_c, across_s-within_c, across_s-any_c; and two triphone-based evaluation combinations: within_s-within_c, across_s-within_c.
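To run every combination listed above, one option is to generate one command line per speaker/context pair. The sketch below only builds argument lists from the `--speaker_mode` and `--context_mode` flags shown in the usage text; the `build_commands` helper, the `item_kind` labels, and the paths are hypothetical and used only for illustration.

```python
from itertools import product

SPEAKER_MODES = ("within", "across")

# Context modes that apply to each kind of item file, as described above:
# phoneme item files support both, triphone item files only within-context.
CONTEXT_MODES = {
    "phoneme": ("within", "any"),
    "triphone": ("within",),
}


def build_commands(path_data: str, path_item_file: str, item_kind: str) -> list:
    """Build one zrc-abx2 argument list per speaker/context combination."""
    commands = []
    for speaker, context in product(SPEAKER_MODES, CONTEXT_MODES[item_kind]):
        commands.append([
            "zrc-abx2", path_data, path_item_file,
            "--speaker_mode", speaker,
            "--context_mode", context,
        ])
    return commands


# Placeholder paths; replace them with your own data and item files.
phoneme_runs = build_commands("/location/to/representations/",
                              "/location/to/phoneme.item", "phoneme")
triphone_runs = build_commands("/location/to/representations/",
                               "/location/to/triphone.item", "triphone")
```

Each resulting list can be passed to `subprocess.run` once `zrc-abx2` is installed; `phoneme_runs` holds four commands and `triphone_runs` two.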

