Toolkit for speaker recognition

HYPERION

Hyperion is a Speaker Recognition Toolkit based on PyTorch and numpy. It provides:

  • x-Vector architectures: ResNet, Res2Net, Spine2Net, ECAPA-TDNN, EfficientNet, Transformers and others.
  • Embedding preprocessing tools: PCA, LDA, NAP, Centering/Whitening, Length Normalization, CORAL
  • Several flavours of PLDA back-ends: Full-rank PLDA, Simplified PLDA, PLDA
  • Calibration and Fusion tools
  • Recipes for popular datasets: VoxCeleb, NIST-SRE, VOiCES

The full API is described on the documentation page: https://hyperion-ml.readthedocs.io

Installation Instructions

Prerequisites

We use anaconda or miniconda, though you should be able to make it work with other Python distributions.
To start, create a new environment and install PyTorch>=1.6. We recommend PyTorch 1.8.0 (newer versions cause problems with some training scripts), e.g.:
conda create --name ${your_env} python=3.8
conda activate ${your_env}
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
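
To confirm that an environment satisfies the PyTorch>=1.6 requirement, a check along these lines may help. This is only a minimal sketch: `version_tuple` and `meets_minimum` are hypothetical helpers, not part of Hyperion. The example compares plain version strings so that it runs without PyTorch installed; in a real environment you would pass `torch.__version__` as the installed version.

```python
import re


def version_tuple(version):
    """Turn a version string such as '1.8.0+cu102' into a tuple of ints (1, 8, 0)."""
    # Keep only the leading dotted-number part, dropping suffixes like '+cu102'.
    match = re.match(r"\d+(\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = version_tuple(installed), version_tuple(minimum)
    # Pad with zeros so that '1.6' and '1.6.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Hypothetical example values; replace the first argument with torch.__version__.
print(meets_minimum("1.8.0+cu102", "1.6"))  # True
print(meets_minimum("1.5.1", "1.6"))        # False
```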

In future Hyperion versions, we will upgrade to PyTorch>=1.9 and drop compatibility with older PyTorch versions.

Installing Hyperion

  • First, clone the repo:
git clone https://github.com/hyperion-ml/hyperion.git
  • You can choose to install hyperion in the environment:
cd hyperion
pip install -e .
  • Or add the hyperion toolkit to the PYTHONPATH environment variable. This option lets you share the same environment when you are working with several hyperion branches at the same time, whereas installing it requires one environment per branch. For this, you need to install the requirements:
cd hyperion
pip install -r requirements.txt

Then add these lines to your ~/.bashrc or to each script that uses hyperion:

HYP_ROOT= # replace this with your hyperion location
export PYTHONPATH=${HYP_ROOT}:$PYTHONPATH
export PATH=${HYP_ROOT}/bin:$PATH
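
After editing PYTHONPATH, it can be useful to check which copy of the package Python would actually import, especially when several branches are checked out. The following is a minimal sketch using only the standard library; `locate_package` is a hypothetical helper, not a Hyperion function.

```python
import importlib.util


def locate_package(name):
    """Return the location that `import name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin is not None:
        return spec.origin
    # Namespace packages have no single origin file; report their search paths.
    return str(list(spec.submodule_search_locations))


# Prints the path of the hyperion package found first on the Python path,
# or None if hyperion is neither installed nor on PYTHONPATH.
print(locate_package("hyperion"))
```

If the printed path does not point inside the branch you expect, the HYP_ROOT value or the order of entries in PYTHONPATH is the first thing to check.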

Recipes

There are recipes for several tasks in the ./egs directory.

Prerequisites to run the recipes

These recipes require some extra tools (e.g. sph2pipe), which need to be installed first:

./install_egs_requirements.sh 

Most recipes do not require Kaldi; only the older ones that use Kaldi x-vectors do, so we do not install it by default. If you need it, install it yourself, then make a link in ./tools to your Kaldi installation:

cd tools
ln -s ${your_kaldi_path} kaldi
cd -

Finally, configure the Python installation and environment name that you intend to use to run the recipes. To do so, run:

./prepare_egs_paths.sh

This script will ask for the path to your anaconda installation and the environment name. It will also detect whether hyperion is already installed in the environment; if not, it will add hyperion to your Python path. This will create the file

tools/path.sh

which sets all the environment variables required to run the recipes. This has been tested only on JHU computer grids, so you may need to modify this file manually to adapt it to your grid.

Recipes structure

The structure of the recipes is very similar to Kaldi, so it should be familiar to most people. Data preparation is also similar to Kaldi. Each dataset has a directory with files like

wav.scp
utt2spk
spk2utt
...
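
In Kaldi-style data directories, utt2spk holds one "utterance-id speaker-id" pair per line, and spk2utt is its inverse, with one speaker per line followed by that speaker's utterances. The sketch below illustrates this relationship; `invert_utt2spk` is a hypothetical helper and the IDs are made-up examples, not taken from any Hyperion recipe.

```python
from collections import defaultdict


def invert_utt2spk(lines):
    """Build a speaker -> sorted utterance list mapping from utt2spk lines."""
    spk2utt = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        utt, spk = line.split()
        spk2utt[spk].append(utt)
    # Sort speakers and utterances so the output order is deterministic.
    return {spk: sorted(utts) for spk, utts in sorted(spk2utt.items())}


# Made-up utt2spk content; in practice, pass the lines of an open utt2spk file.
utt2spk_lines = [
    "spk1-utt2 spk1",
    "spk1-utt1 spk1",
    "spk2-utt1 spk2",
]

for spk, utts in invert_utt2spk(utt2spk_lines).items():
    print(spk, " ".join(utts))
# spk1 spk1-utt1 spk1-utt2
# spk2 spk2-utt1
```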

Running the recipes

Unlike in other toolkits, the recipes do not contain a single run.sh script that runs all the steps of the recipe. Since some recipes have many steps and most of the time you don't want to run all of them from the beginning, we have split each recipe into several run scripts. Each script has a number indicating its position in the sequence. For example, running

run_001_prepare_data.sh
run_002_compute_vad.sh
run_010_prepare_audios_to_train_xvector.sh
run_011_train_xvector.sh
run_030_extract_xvectors.sh
run_040_evaluate_plda_backend.sh

will evaluate the recipe with the default configuration. The default configuration is in the file default_config.sh.
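
Because the step number is embedded in each file name, the execution order can be recovered from the names alone. The following is a minimal sketch of that idea, assuming the run_NNN_name.sh pattern shown above; `ordered_steps` is a hypothetical helper, not something shipped with the recipes.

```python
import re

# Matches names like 'run_011_train_xvector.sh' and captures the step number.
STEP_PATTERN = re.compile(r"^run_(\d+)_.*\.sh$")


def ordered_steps(filenames):
    """Return the run scripts among `filenames`, sorted by their step number."""
    steps = []
    for name in filenames:
        match = STEP_PATTERN.match(name)
        if match:
            steps.append((int(match.group(1)), name))
    return [name for _, name in sorted(steps)]


# Directory listing in arbitrary order; non-step files are ignored.
listing = [
    "run_030_extract_xvectors.sh",
    "default_config.sh",
    "run_002_compute_vad.sh",
    "run_011_train_xvector.sh",
    "run_001_prepare_data.sh",
]

for script in ordered_steps(listing):
    print(script)
# run_001_prepare_data.sh
# run_002_compute_vad.sh
# run_011_train_xvector.sh
# run_030_extract_xvectors.sh
```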

We also include extra configurations, which may change the hyperparameters of the recipe. For example:

  • Acoustic features
  • Type of the x-vector neural network
  • Hyper-parameters of the models
  • etc.

Extra configs are in the global_conf directory of the recipe. You can then run the recipe with an alternate config as:

run_001_prepare_data.sh --config-file global_conf/alternative_conf.sh
run_002_compute_vad.sh --config-file global_conf/alternative_conf.sh
run_010_prepare_audios_to_train_xvector.sh --config-file global_conf/alternative_conf.sh
run_011_train_xvector.sh --config-file global_conf/alternative_conf.sh
run_030_extract_xvectors.sh --config-file global_conf/alternative_conf.sh
run_040_evaluate_plda_backend.sh --config-file global_conf/alternative_conf.sh

Note that many alternative configs share hyperparameters with the default configs. That means you may not need to rerun all the steps to evaluate a new configuration. In most cases, you just need to re-run the steps from the neural network training to the end.
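
Re-running only the later steps with an alternate config can be sketched as follows. This is an illustration under assumptions: `commands_from` is a hypothetical helper, and treating step 11 as the start of neural network training follows the example script names above rather than any fixed rule of the recipes.

```python
def commands_from(scripts, first_step, config_file=None):
    """Build the command lines for all scripts whose step number is >= first_step."""
    commands = []
    for script in scripts:
        # 'run_011_train_xvector.sh' -> 11
        step = int(script.split("_")[1])
        if step < first_step:
            continue
        command = [script]
        if config_file is not None:
            command += ["--config-file", config_file]
        commands.append(command)
    return commands


scripts = [
    "run_001_prepare_data.sh",
    "run_002_compute_vad.sh",
    "run_010_prepare_audios_to_train_xvector.sh",
    "run_011_train_xvector.sh",
    "run_030_extract_xvectors.sh",
    "run_040_evaluate_plda_backend.sh",
]

# Print the commands from x-vector training onward, using the alternate config.
for command in commands_from(scripts, 11, "global_conf/alternative_conf.sh"):
    print(" ".join(command))
# run_011_train_xvector.sh --config-file global_conf/alternative_conf.sh
# run_030_extract_xvectors.sh --config-file global_conf/alternative_conf.sh
# run_040_evaluate_plda_backend.sh --config-file global_conf/alternative_conf.sh
```

The sketch only prints the commands; whether earlier steps can really be skipped depends on which hyperparameters the alternate config changes.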

Citing

Each recipe's README.md file contains the BibTeX entries for the works that should be cited if you use that recipe in your research.

Directory structure

  • The directory structure of the repo looks like this:
hyperion
hyperion/egs
hyperion/hyperion
hyperion/resources
hyperion/tests
hyperion/tools
  • Directories:
    • hyperion: python classes with utilities for speaker and language recognition
    • egs: recipes for several tasks: VoxCeleb, SRE18/19/20, VOiCES, ...
    • tools: contains external repos and tools like kaldi, python, cudnn, etc.
    • tests: unit tests for the classes in hyperion
    • resources: data files required by the unit tests or recipes
