End-to-end spoken language identification (LID) on TensorFlow

lidbox

Spoken language identification (LID) out of the box using TensorFlow.
Parallel feature extraction using tf.data.Dataset, with STFT computations on the GPU using the tf.signal package.
Only metadata (e.g. utt2path, utt2label) is fully loaded into memory; the rest is done in linear passes over the dataset with the tf.data.Dataset iterator.
Spectrograms, source audio, and utterance ids can be written into TensorBoard summaries.
Model training with tf.keras; some model examples are available here.
Average detection cost (C_avg) implemented as a tf.keras.metrics.Metric.
You can also try lidbox for speaker recognition, since it makes no assumptions about what the signal labels mean. For example, use utt2speaker as utt2label and see what happens.
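
To make the average detection cost in the list above more concrete, here is a minimal pure-Python sketch. It assumes the closed-set C_avg definition used in the NIST language recognition evaluations (equal miss and false-alarm costs, target prior 0.5) and is not lidbox's tf.keras.metrics.Metric implementation. The function name and the input layout (one row of accept/reject decisions per utterance) are hypothetical.

```python
def average_detection_cost(labels, decisions, num_languages,
                           p_target=0.5, c_miss=1.0, c_fa=1.0):
    """Closed-set C_avg over hard detection decisions (illustrative sketch).

    labels[i] is the true language index of utterance i.
    decisions[i][l] is True if the system accepts language l for utterance i.
    Assumes every language has at least one utterance.
    """
    p_non_target = (1.0 - p_target) / (num_languages - 1)

    # Group utterance indices by their true language.
    by_language = {lang: [] for lang in range(num_languages)}
    for i, lang in enumerate(labels):
        by_language[lang].append(i)

    total = 0.0
    for target in range(num_languages):
        target_utts = by_language[target]
        # Miss: a target-language utterance that was rejected.
        p_miss = sum(not decisions[i][target] for i in target_utts) / len(target_utts)
        cost = c_miss * p_target * p_miss
        for other in range(num_languages):
            if other == target:
                continue
            other_utts = by_language[other]
            # False alarm: an utterance of another language accepted as the target.
            p_fa = sum(bool(decisions[i][target]) for i in other_utts) / len(other_utts)
            cost += c_fa * p_non_target * p_fa
        total += cost
    return total / num_languages


# Two languages, four utterances; one miss and one false alarm for language 0.
labels = [0, 0, 1, 1]
decisions = [
    [True, False],
    [False, False],
    [False, True],
    [True, True],
]
c_avg = average_detection_cost(labels, decisions, num_languages=2)
print(c_avg)
```

A system that accepts exactly the true language for every utterance gets a cost of zero under this definition.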

Here is a full example notebook showing what lidbox can do.

Why would I want to use this?

You need a simple, deep-learning-based speech classification pipeline. For example: waveform -> VAD filter -> augment audio data -> serialize all data to a single binary file -> extract log-scale Mel-spectra or MFCC -> use DNN/CNN/LSTM/GRU/attention (etc.) to classify by signal labels
You have thousands of hours of speech data
You have a TensorFlow/Keras model that you train on the GPU and want the tf.data.Dataset extraction pipeline to also be on the GPU
You want an end-to-end pipeline that uses TensorFlow 2 as much as possible
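
As a rough illustration of the feature extraction step in such a pipeline, the sketch below computes log-scale Mel spectrograms inside a tf.data.Dataset using tf.signal. It does not use the lidbox API. The sample rate, frame sizes, number of Mel bins, and the random signals standing in for audio are all assumptions chosen for the example.

```python
import tensorflow as tf

SAMPLE_RATE = 16000  # assumed sample rate, not taken from lidbox
FFT_LENGTH = 512


def log_mel_spectrogram(signal, frame_length=400, frame_step=160, num_mel_bins=40):
    """Map a 1D float32 waveform to a [frames, num_mel_bins] log-Mel spectrogram."""
    stft = tf.signal.stft(
        signal,
        frame_length=frame_length,
        frame_step=frame_step,
        fft_length=FFT_LENGTH,
    )
    power = tf.math.square(tf.math.abs(stft))  # [frames, FFT_LENGTH // 2 + 1]
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=num_mel_bins,
        num_spectrogram_bins=FFT_LENGTH // 2 + 1,
        sample_rate=SAMPLE_RATE,
        lower_edge_hertz=20.0,
        upper_edge_hertz=SAMPLE_RATE / 2,
    )
    mel = tf.linalg.matmul(power, mel_matrix)
    # Small constant avoids log(0) on silent frames.
    return tf.math.log(mel + 1e-6)


# Four one-second random signals stand in for real utterances.
signals = tf.random.normal([4, SAMPLE_RATE], seed=0)
labels = tf.constant([0, 1, 2, 3])

dataset = (
    tf.data.Dataset.from_tensor_slices((signals, labels))
    .map(lambda s, y: (log_mel_spectrogram(s), y),
         num_parallel_calls=tf.data.AUTOTUNE)
    .batch(2)
)

features, batch_labels = next(iter(dataset))
print(features.shape, batch_labels.shape)
```

Because the extraction is expressed with TensorFlow ops only, the same function can be mapped over a dataset of any size without loading the signals into memory at once.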

Why would I not want to use this?

You are happy doing everything with Kaldi or some other toolkit
You don't want to debug by reading the source code when something goes wrong
You don't want to install TensorFlow 2 and configure its dependencies (CUDA etc.)
You want to train phoneme recognizers or use CTC

Installing

You need to have Python 3 installed.

With the example

git clone --depth 1 https://github.com/matiaslindgren/lidbox.git
python3 -m pip install ./lidbox

Check that the command line entry point is working:

lidbox -h

If not, make sure the directory containing the setuptools entry point scripts (e.g. $HOME/.local/bin) is on your PATH.

Without the example

python3 -m pip install lidbox

TensorFlow

TensorFlow 2 is not included in the package requirements because you might want to configure it yourself, for example to get the GPU working.

If you don't want to customize anything and instead prefer something that just works for now, the following should be enough:

python3 -m pip install tensorflow
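
One way to check the installation is to ask TensorFlow which version is loaded and which GPUs it can see. This is a small sketch that uses only the public TensorFlow API; the helper name describe_tensorflow is made up for this example. An empty GPU list means TensorFlow will run on the CPU.

```python
import tensorflow as tf


def describe_tensorflow():
    """Return the TensorFlow version and the names of visible GPU devices."""
    gpus = tf.config.list_physical_devices("GPU")
    return {"version": tf.__version__, "gpus": [gpu.name for gpu in gpus]}


info = describe_tensorflow()
print("TensorFlow", info["version"])
print("Visible GPUs:", info["gpus"] or "none")
```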

If everything is working, see this for a simple example to get started.

Editable install

If you plan on making changes to the code, it is easier to install lidbox as a Python package in setuptools develop mode:

git clone --depth 1 https://github.com/matiaslindgren/lidbox.git
python3 -m pip install --editable ./lidbox

Then, if you make changes to the code, there's no need to reinstall the package since the changes are reflected immediately. Just be careful not to make changes while lidbox is running, because TensorFlow uses its AutoGraph package to convert some of the Python functions to TF graphs, and this might fail if the code changes underneath it.

X-vector embeddings

One benefit of deep learning classifiers is that you can first train them on large amounts of data and then use them as feature extractors to produce low-dimensional, fixed-length language vectors from speech. See e.g. the x-vector approach by Snyder et al.
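
The idea of reusing a trained classifier as a feature extractor can be sketched with tf.keras as follows. The model here is a deliberately small, hypothetical stand-in and not one of the lidbox models. It uses plain mean pooling over time, whereas the x-vector architecture pools both mean and standard deviation. All layer sizes and names are assumptions.

```python
import tensorflow as tf


def build_classifier(num_mel_bins=40, embedding_dim=64, num_languages=4):
    """Toy language classifier over variable-length log-Mel inputs."""
    inputs = tf.keras.Input(shape=(None, num_mel_bins), name="log_mel")
    x = tf.keras.layers.Conv1D(128, 5, padding="same", activation="relu")(inputs)
    x = tf.keras.layers.Conv1D(128, 3, padding="same", activation="relu")(x)
    # Pooling over the time axis turns any number of frames into one vector.
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    embedding = tf.keras.layers.Dense(embedding_dim, name="embedding")(x)
    outputs = tf.keras.layers.Dense(num_languages, activation="softmax")(embedding)
    return tf.keras.Model(inputs=inputs, outputs=outputs)


classifier = build_classifier()
# In practice the classifier would be trained on labeled speech before this step.

# Reuse the layers up to "embedding" as a fixed-length feature extractor.
extractor = tf.keras.Model(
    inputs=classifier.input,
    outputs=classifier.get_layer("embedding").output,
)

# Three fake utterances of 200 frames each.
spectrograms = tf.random.normal([3, 200, 40], seed=0)
embeddings = extractor(spectrograms)
print(embeddings.shape)
```

The extractor shares its weights with the classifier, so training the classifier also updates the embeddings it produces.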

Below is a visualization of test set language embeddings for 4 languages in 2-dimensional space. Each data point represents 2 seconds of speech in one of the 4 languages.

2-dimensional PCA plot of 400 random x-vectors for 4 Common Voice languages
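
A 2-dimensional PCA projection like the one in the plot can be computed with NumPy alone. This sketch uses random vectors in place of real x-vectors; the embedding dimension of 64 is an assumption, and the function name pca_2d is made up for the example.

```python
import numpy as np


def pca_2d(vectors):
    """Project row vectors onto their first two principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


rng = np.random.default_rng(0)
vectors = rng.normal(size=(400, 64))  # stand-in for 400 language embeddings
points = pca_2d(vectors)
print(points.shape)
```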

PLDA + Naive Bayes classifier

There is a simple language embedding classifier backend available. To use it, you need to first install PLDA:

python3 -m pip install plda@https://github.com/RaviSoji/plda/archive/184d6e39b01363b72080f2752819496cd029f1bd.zip

Release history

1.0.0rc0 (pre-release): Nov 22, 2020
0.7.1 (this version): Nov 1, 2020
0.7.0: Oct 21, 2020
0.6.1: Jul 4, 2020
0.6.0: Jul 4, 2020
0.5.0: Jul 4, 2020

Download files

Source Distribution

lidbox-0.7.1.tar.gz (61.5 kB view details)

Uploaded Nov 1, 2020 Source

Built Distribution

lidbox-0.7.1-py3-none-any.whl (76.3 kB view details)

Uploaded Nov 1, 2020 Python 3

File details

Details for the file lidbox-0.7.1.tar.gz.

File metadata

Download URL: lidbox-0.7.1.tar.gz
Upload date: Nov 1, 2020
Size: 61.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.6

File hashes

Hashes for lidbox-0.7.1.tar.gz
Algorithm	Hash digest
SHA256	`aabd924e6a915acb023b58156192831dc62d2254c168f6476c11eeb2c436267a`
MD5	`dac38b7d61c1c42c2e707334bab4cf2e`
BLAKE2b-256	`9cb78999c9f8cc2d371360d8f2a2e32093c69dae59ed5a0de6f94247bb229453`
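
A downloaded file can be checked against the SHA256 digest listed above with the Python standard library. This is a minimal sketch; the helper names are made up, and the local path in the usage comment is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the archive was saved to the current directory:
# matches_digest(
#     "lidbox-0.7.1.tar.gz",
#     "aabd924e6a915acb023b58156192831dc62d2254c168f6476c11eeb2c436267a",
# )
```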

File details

Details for the file lidbox-0.7.1-py3-none-any.whl.

File metadata

Download URL: lidbox-0.7.1-py3-none-any.whl
Upload date: Nov 1, 2020
Size: 76.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.6

File hashes

Hashes for lidbox-0.7.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`63f3b2065cfdcea95b9c1546d04b1a384aea1bbe5ac58cb10c1240026252ff40`
MD5	`6aa57b3b7aef33cbcd6917deb41e6968`
BLAKE2b-256	`1360f3b54bb7a323d481d1f1b1334957813b41bf3007e9de22c362d4b427d46b`
