
Machine learning audio prediction experiments based on templates


Nkululeko

Overview

A framework for detecting speaker characteristics through machine learning experiments, with a high-level interface.

The idea is to provide a framework (based on e.g. sklearn and torch) that can be used by people who are not experienced programmers, since for each experiment they mainly have to adapt an initialization parameter file.

Here are some examples of typical output:

Confusion matrix

By default, Nkululeko displays results as a confusion matrix; for regression targets, the continuous values are binned into categories first.
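
The binning idea itself is simple; here is a minimal sketch with pandas and scikit-learn, where the bin edges and labels are invented for illustration and are not Nkululeko's defaults:

import pandas as pd
from sklearn.metrics import confusion_matrix

# Invented continuous truth/prediction values (e.g. arousal in [0, 1]).
truth = pd.Series([0.10, 0.55, 0.30, 0.80, 0.70])
preds = pd.Series([0.12, 0.48, 0.33, 0.91, 0.77])

# Bin both into three categories so a confusion matrix can be drawn.
bins = [0.0, 0.33, 0.66, 1.0]
labels = ["low", "mid", "high"]
truth_cat = pd.cut(truth, bins=bins, labels=labels, include_lowest=True)
pred_cat = pd.cut(preds, bins=bins, labels=labels, include_lowest=True)

print(confusion_matrix(truth_cat, pred_cat, labels=labels))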

Epoch progression

The point at which overfitting starts can sometimes be seen by looking at the results per epoch:
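
Such a plot is essentially a train/development metric per epoch; a sketch with matplotlib, using invented numbers where the dev curve peaks and then degrades:

import matplotlib.pyplot as plt

# Invented accuracies: dev peaks around epoch 6, then overfitting sets in.
epochs = list(range(1, 11))
train_acc = [0.55, 0.62, 0.70, 0.76, 0.81, 0.86, 0.90, 0.93, 0.95, 0.97]
dev_acc = [0.52, 0.58, 0.64, 0.68, 0.70, 0.71, 0.70, 0.68, 0.66, 0.65]

plt.plot(epochs, train_acc, label="train")
plt.plot(epochs, dev_acc, label="dev")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()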

Feature importance

Using the explore interface, Nkululeko analyses the importance of acoustic features:

Feature distribution

It can also show the distribution of specific features per category:

t-SNE plots

A t-SNE plot can give you an estimate of whether your acoustic features are useful at all:
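
Such a plot is easy to sketch with scikit-learn; the feature matrix and labels below are random stand-ins for whatever acoustic features you extracted:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE

# Random stand-ins for an acoustic feature matrix and category labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = rng.choice(["anger", "boredom", "fear"], size=200)

# Project to two dimensions and scatter one colour per category.
X_2d = TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(X)
for label in np.unique(y):
    mask = y == label
    plt.scatter(X_2d[mask, 0], X_2d[mask, 1], label=label, s=10)
plt.legend()
plt.show()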

Data distribution

Sometimes you only want to take a look at your data:

Installation

Create and activate a virtual Python environment and simply run

pip install nkululeko

Some example ini files (which you use to control Nkululeko) can be found in the tests folder.

Usage

Basically, you specify your experiment in an "ini" file (e.g. experiment.ini) and then call one of the Nkululeko interfaces to run the experiment like this:

  • python -m nkululeko.nkululeko --config experiment.ini

A basic configuration looks like this:

[EXP]
root = ./
name = exp_emodb
[DATA]
databases = ['emodb']
emodb = ./emodb/
emodb.split_strategy = speaker_split
target = emotion
labels = ['anger', 'boredom', 'disgust', 'fear']
[FEATS]
type = ['praat']
[MODEL]
type = svm
[EXPL]
model = tree
plot_tree = True
[PLOT]
combine_per_speaker = mode
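
Since these files use standard INI syntax, they can also be read and written with Python's built-in configparser, e.g. to script a series of experiment variants:

import configparser

config = configparser.ConfigParser()
config.read("experiment.ini")

print(config["EXP"]["name"])      # exp_emodb
config["MODEL"]["type"] = "xgb"   # try another classifier
with open("experiment_xgb.ini", "w") as f:
    config.write(f)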

Read the Hello World example below for initial usage with the Emo-DB dataset.

Here is an overview of the interfaces (each is invoked with the same --config pattern, sketched after the list):

  • nkululeko.nkululeko: doing experiments
  • nkululeko.demo: demo the current best model on command line
  • nkululeko.test: predict a series of files with the current best model
  • nkululeko.explore: perform data exploration
  • nkululeko.augment: augment the current training data
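
Each interface is invoked with the experiment's ini file, following the pattern shown above (some, like the demo, also accept extra arguments such as a file or list, see the changelog):

python -m nkululeko.explore --config experiment.ini
python -m nkululeko.augment --config experiment.ini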

Alternatively, there is a central "experiment" class that can be used by your own experiments.

There's also my blog with tutorials.

The framework is targeted at the speech domain and supports experiments where different classifiers are combined with different feature extractors.

Here's a rough UML-like sketch of the framework:

Currently, the following classifiers are implemented (mostly integrated from sklearn):

  • SVM, SVR, XGB, XGR, Tree, Tree_regressor, KNN, KNN_regressor, NaiveBayes and GMM,
  • and the following ANNs: MLP and CNN (tbd).
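
As a rough illustration of what the non-neural types boil down to (Nkululeko's own wrappers may use different hyperparameters), selecting an SVM or KNN corresponds to standard sklearn estimators:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Random stand-ins for extracted features and emotion labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = rng.choice(["anger", "boredom"], size=100)

# Roughly what `type = svm` or `type = knn` select under the hood.
for model in (SVC(kernel="rbf"), KNeighborsClassifier(n_neighbors=5)):
    model.fit(X, y)
    print(type(model).__name__, model.score(X, y))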

Here's an animation that shows the progress of a classification done with Nkululeko:

Initialization file

You could

  • use a generic main Python file (like my_experiment.py), as sketched after this list,
  • adapt the path to your nkululeko src,
  • and then adapt an .ini file (again fitting at least the paths to src and data).
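
Here is a sketch of such a main file; everything past the config parsing is an assumption about the central experiment class mentioned above, so check the sources for the actual import path and method names:

import sys
sys.path.append("./src")  # adapt to where your nkululeko sources live

import configparser

config = configparser.ConfigParser()
config.read("exp_emodb.ini")

# Assumed API: a central Experiment class driven by the parsed config;
# the real import path and method names may differ.
from experiment import Experiment
expr = Experiment(config)
expr.run()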

Here's an overview of the ini-file options

Hello World example

  • NEW: I made a video to show you how to do this on Windows
  • Set up Python on your computer, version >= 3.6
  • Open a terminal/commandline/console window
  • Test Python by typing python; it should start with a version > 3 (NOT 2!). You can leave the Python interpreter by typing exit()
  • Create a folder on your computer for this example, let's call it nkulu_work
  • Get a copy of the Berlin emodb in audformat and unpack it into the same folder (nkulu_work)
  • Make sure the folder is called "emodb" and contains the database files directly (not nested in a further subfolder)
  • Also, in the nkulu_work folder:
    • Create a Python environment
      • python -m venv venv
    • Then, activate it:
      • under Linux / mac
        • source venv/bin/activate
      • under Windows
        • venv\Scripts\activate.bat
      • if that worked, you should see a (venv) in front of your prompt
    • Install the required packages in your environment
      • pip install nkululeko
      • Repeat until all error messages vanish (or fix them, or try to ignore them)...
  • Now you should have two folders in your nkulu_work folder:
    • emodb and venv
  • Download a copy of the file exp_emodb.ini to the current working directory (nkulu_work)
  • Run the demo
    • python -m nkululeko.nkululeko --config exp_emodb.ini
  • Find the results in the newly created folder exp_emodb
    • Inspect exp_emodb/images/run_0/emodb_xgb_os_0_000_cnf.png
    • This is the main result of your experiment: a confusion matrix for the emodb emotional categories
  • Inspect and play around with the demo configuration file that defined your experiment, then re-run.
  • There are many ways to experiment with different classifiers and acoustic feature sets, all described here
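
For reference, here is the whole walkthrough condensed into commands (Linux/mac; on Windows activate with venv\Scripts\activate.bat):

mkdir nkulu_work && cd nkulu_work
# place the unpacked emodb folder and exp_emodb.ini in here
python -m venv venv
source venv/bin/activate
pip install nkululeko
python -m nkululeko.nkululeko --config exp_emodb.ini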

Features

  • Classifiers: Naive Bayes, KNN, Tree, XGBoost, SVM, MLP
  • Feature extractors: Praat, openSMILE, openXBOW BoAW, TRILL embeddings, Wav2vec2 embeddings, audModel embeddings, ...
  • Feature scaling
  • Label encoding
  • Binning (continuous to categorical)
  • Online demo interface for trained models

Outlook

  • Classifiers: CNN
  • Feature extractors: mid-level descriptors, Mel-spectra

License

Nkululeko can be used under the MIT license.

Changelog

Version 0.44.1

Version 0.44.0

  • added scatter functions: tsne, pca, umap

Version 0.43.7

  • added CLAP features

Version 0.43.6

  • small bugs

Version 0.43.5

  • because of difficulties with numba, audiomentations is now imported only when augmenting

Version 0.43.4

  • added error when experiment type and predictor don't match

Version 0.43.3

  • fixed further bugs and added augmentation to the test runs

Version 0.43.2

  • fixed a bug when running a continuous variable as a classification problem

Version 0.43.1

  • fixed test_runs

Version 0.43.0

  • added augmentation module based on audiomentations

Version 0.42.0

  • age labels should now be detected in databases

Version 0.41.0

  • added feature tree plot

Version 0.40.1

  • fixed a bug: additional test database was not label encoded

Version 0.40.0

  • added EXPL section and first functionality
  • added test module (for test databases)

Version 0.39.0

  • added feature distribution plots
  • added plot format

Version 0.38.3

  • added demo mode with list argument

Version 0.38.2

  • fixed a bug concerned with "no_reuse" evaluation

Version 0.38.1

  • demo mode with file argument

Version 0.38.0

  • fixed demo mode

Version 0.37.2

  • mainly replaced pd.append with pd.concat

Version 0.37.1

  • fixed bug that prevented praat feature extraction from working

Version 0.37.0

  • fixed bug: csv import not detecting multiindex

Version 0.36.3

  • published as a PyPI module

Version 0.36.0

  • added entry nkululeko.py script

Version 0.35.0

  • fixed bug that prevented scaling (normalization)

Version 0.34.2

  • smaller bug fixed concerning the loss_string

Version 0.34.1

  • smaller bug fixes and tried Soft_f1 loss

Version 0.34.0

  • smaller bug fixes and debug outputs

Version 0.33.0

  • added GMM as a model type

Version 0.32.0

  • added audmodel embeddings as features

Version 0.31.0

  • added models: tree and tree_reg

Version 0.30.0

  • added models: bayes, knn and knn_reg

Version 0.29.2

  • fixed hello world example

Version 0.29.1

  • bug fix for 0.29

Version 0.29.0

  • added a new FeatureExtractor class to import external data

Version 0.28.2

  • removed some Pandas warnings
  • added no_reuse function to database.load()

Version 0.28.1

  • with database.value_counts show only the data that is actually used

Version 0.28.0

  • made "label_data" configuration automatic and added "label_result"

Version 0.27.0

  • added "label_data" configuration to label data with trained model (so now there can be train, dev and test set)

Version 0.26.1

  • Fixed some bugs caused by the multitude of feature sets
  • Added possibility to distinguish between absolute and relative paths in csv datasets

Version 0.26.0

  • added the rename_speakers functionality to prevent identical speaker names in datasets

Version 0.25.1

  • fixed bug where no features were chosen if none were selected

Version 0.25.0

  • made selectable features universal for feature sets

Version 0.24.0

  • added multiple feature sets (will simply be concatenated)

Version 0.23.0

  • added selectable features for Praat interface

Version 0.22.0

  • added David R. Feinberg's Praat features, praise also to parselmouth

Version 0.21.0

  • Revoked 0.20.0
  • Added support for only_test = True, to enable later testing of trained models with new test data

Version 0.20.0

  • implemented reuse of trained and saved models

Version 0.19.0

  • added "max_duration_of_sample" for datasets

Version 0.18.6

  • added support for learning and dropout rate as argument

Version 0.18.5

  • added support for epoch number as argument

Version 0.18.4

  • added support for ANN layers as arguments

Version 0.18.3

  • added reuse of test and train file sets
  • added parameter to scale continuous target values: target_divide_by

Version 0.18.2

  • added preference of local dataset specs to global ones

Version 0.18.1

  • added regression value display for confusion matrices

Version 0.18.0

  • added leave one speaker group out

Version 0.17.2

  • fixed scaler, added robust

Version 0.17.0

  • Added minimum duration for test samples

Version 0.16.4

  • Added possibility to combine predictions per speaker (with mean or mode function)

Version 0.16.3

  • Added minimal sample length for databases

Version 0.16.2

  • Added k-fold-cross-validation for linear classifiers

Version 0.16.1

  • Added leave-one-speaker-out for linear classifiers

Version 0.16.0

  • Added random sample splits
