okwugbe

Automatic Speech Recognition Library for African Languages

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Okwugbe

Automatic Speech Recognition Library for (low-resource) African Languages

Motivation

Our aim is to foster ASR for African languages by making the whole process, from dataset gathering and preprocessing to training, as easy as possible. This library follows our work Okwugbé on ASR for Fon and Igbo. It builds on the network architecture described in our paper and aims to ease ASR training for other languages. The primary targets are African languages, but other languages are supported as well.

Usage

pip install okwugbe

# Import the trainer class
from train_eval import Train_Okwugbe

train_path = '/path/to/training_file.csv'
test_path = '/path/to/testing_file.csv'
characters_set = '/path/to/character_set.txt'

# train_path and test_path must point to CSV files with two columns:
#   1. the full path to an audio WAV file
#   2. the textual transcription of that audio

# Initialize the trainer instance
train = Train_Okwugbe(train_path, test_path, characters_set)

# Start the training
train.run()
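
The two input files can be prepared with the standard library alone. The sketch below writes a two-column CSV (audio path, transcription) and collects the unique characters of the transcriptions into a text file. The helper names `write_manifest` and `write_character_set` are hypothetical, and both the absence of a header row and the one-character-per-line layout of the character file are assumptions; check them against the format the library actually expects.

```python
import csv
from pathlib import Path


def write_manifest(pairs, csv_path):
    """Write (wav_path, transcription) pairs as a two-column CSV.

    Assumption: no header row is written.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for wav_path, text in pairs:
            writer.writerow([str(wav_path), text])


def write_character_set(pairs, txt_path):
    """Collect the unique characters found in all transcriptions.

    Assumption: the file holds one character per line, in sorted order.
    """
    chars = sorted({ch for _, text in pairs for ch in text})
    Path(txt_path).write_text("\n".join(chars), encoding="utf-8")
    return chars
```

Once both files exist, their paths can be passed to `Train_Okwugbe` as in the example above.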

Parameters

Here are the parameters for the package, as well as their default values.

The default values have been chosen so that you only have to make minimal changes to get a good ASR model going.

Parameter	Description	Default
`use_common_voice`	Whether to train on Common Voice data	False
`lang`	Language to use from Common Voice. Must be specified if `use_common_voice` is set to True.	None
`rnn_dim`	RNN Dimension & Hidden Size	512
`num_layers`	Number of Layers	1
`n_cnn`	Number of CNN components	5
`n_rnn`	Number of RNN components	3
`n_feats`	Number of features for the ResCNN	128
`in_channels`	Number of input channels of the ResCNN	1
`out_channels`	Number of output channels of the ResCNN	32
`kernel`	Kernel Size for the ResCNN	3
`stride`	Stride Size for the ResCNN	2
`padding`	Padding Size for the ResCNN	1
`dropout`	Dropout rate (the same value is used for all components)	0.1
`with_attention`	True to use the attention mechanism, False otherwise	False
`batch_multiplier`	Batch multiplier for gradient accumulation	1 (no gradient accumulation)
`grad_acc`	Whether to use gradient accumulation	False
`model_path`	Path for the saved model	'./okwugbe_model'
`characters_set`	Path to the .txt file containing unique characters	required
`validation_set`	Validation set size	0.2
`train_path`	Path to training set	required
`test_path`	Path to testing set	required
`learning_rate`	Learning rate	3e-5
`batch_size`	Batch Size	20
`patience`	Early Stopping Patience	20
`epochs`	Training epochs	500
`optimizer`	Optimizer	'adamw'
`freq_mask`	Frequency masking parameter (for speech augmentation)	30
`time_mask`	Time masking parameter (for speech augmentation)	100
`display_plot`	Whether to plot metrics during training	True
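
Because every parameter has a default, a run usually overrides only a handful of them. The sketch below gathers overrides into keyword arguments and rejects misspelled names before any training starts. `DEFAULTS` copies a subset of the table above; `build_trainer_kwargs` and `effective_batch_size` are hypothetical helpers, not part of the library. It assumes the constructor accepts the table's names as keyword arguments, and that enabling `grad_acc` accumulates gradients over `batch_multiplier` batches, as is usual for gradient accumulation.

```python
# Subset of the defaults from the parameter table above.
DEFAULTS = {
    "n_cnn": 5,
    "n_rnn": 3,
    "rnn_dim": 512,
    "dropout": 0.1,
    "with_attention": False,
    "batch_multiplier": 1,
    "grad_acc": False,
    "learning_rate": 3e-5,
    "batch_size": 20,
    "patience": 20,
    "epochs": 500,
}


def build_trainer_kwargs(train_path, test_path, characters_set, **overrides):
    """Return the required paths plus validated overrides (hypothetical helper)."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError("unknown parameter(s): " + ", ".join(unknown))
    kwargs = {
        "train_path": train_path,
        "test_path": test_path,
        "characters_set": characters_set,
    }
    kwargs.update(overrides)
    return kwargs


def effective_batch_size(kwargs):
    """Number of samples contributing to one optimizer step.

    Assumption: with grad_acc enabled, gradients from `batch_multiplier`
    batches are accumulated before each update.
    """
    def get(name):
        return kwargs.get(name, DEFAULTS[name])

    if get("grad_acc"):
        return get("batch_size") * get("batch_multiplier")
    return get("batch_size")


kwargs = build_trainer_kwargs(
    "/path/to/training_file.csv",
    "/path/to/testing_file.csv",
    "/path/to/character_set.txt",
    grad_acc=True,
    batch_multiplier=4,
)
print(effective_batch_size(kwargs))  # 20 * 4 = 80

# Assumed call, not executed here:
# Train_Okwugbe(**kwargs).run()
```

Passing only the overrides, rather than every default, keeps the call short and leaves untouched parameters under the library's control.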

Integration with Common Voice

You can easily train on a Common Voice dataset with Okwugbe by specifying `use_common_voice=True` and setting `lang` to the language code of your choice. The language must be available on Common Voice.

# Initialize the trainer instance
train = Train_Okwugbe(use_common_voice=True, lang='mn')  # 'mn' selects Mongolian

# Start the training
train.run()

Here is the list of Common Voice languages we currently support.

Code	Language
tt	Tatar
en	English
de	German
fr	French
cy	Welsh
br	Breton
cv	Chuvash
tr	Turkish
ky	Kyrgyz
ga-IE	Irish
kab	Kabyle
ca	Catalan
zh-TW	Taiwanese
sl	Slovenian
it	Italian
nl	Dutch
cnh	Hakha Chin
eo	Esperanto
et	Estonian
fa	Persian
pt	Portuguese
eu	Basque
es	Spanish
zh-CN	Chinese
mn	Mongolian
sah	Sakha
dv	Dhivehi
rw	Kinyarwanda
sv-SE	Swedish
ru	Russian
id	Indonesian
ar	Arabic
ta	Tamil
ia	Interlingua
lv	Latvian
ja	Japanese
vot	Votic
ab	Abkhaz
zh-HK	Cantonese
rm-sursilv	Romansh Sursilvan
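
Since `lang` must be one of the codes listed above, a small pre-flight check can catch a typo before any data is downloaded. The sketch below is an illustration only: `SUPPORTED_LANGS` is copied by hand from the list above, and `common_voice_kwargs` is a hypothetical helper, not part of the library.

```python
import difflib

# Language codes copied from the list above.
SUPPORTED_LANGS = (
    "tt", "en", "de", "fr", "cy", "br", "cv", "tr", "ky", "ga-IE",
    "kab", "ca", "zh-TW", "sl", "it", "nl", "cnh", "eo", "et", "fa",
    "pt", "eu", "es", "zh-CN", "mn", "sah", "dv", "rw", "sv-SE", "ru",
    "id", "ar", "ta", "ia", "lv", "ja", "vot", "ab", "zh-HK", "rm-sursilv",
)


def common_voice_kwargs(lang):
    """Return keyword arguments for a Common Voice run, or raise on an unknown code."""
    if lang not in SUPPORTED_LANGS:
        close = difflib.get_close_matches(lang, list(SUPPORTED_LANGS), n=3)
        hint = " Did you mean: " + ", ".join(close) + "?" if close else ""
        raise ValueError(f"'{lang}' is not in the supported list.{hint}")
    return {"use_common_voice": True, "lang": lang}


# Assumed call, not executed here:
# Train_Okwugbe(**common_voice_kwargs("mn")).run()
```

The check is case-sensitive on purpose, because codes such as `ga-IE` and `zh-TW` mix lower and upper case.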

Tutorials

on using OkwuGbe
on using OkwuGbe with Common Voice

ASR Data for African languages

Wondering where to find a dataset for your African language? Here are some resources to check:

Debugging

is strictly for debugging!

Citation

Please cite our paper using the citation below if you use our work in any way:

@inproceedings{dossou-emezue-2021-okwugbe,
    title = "{O}kwu{G}b{\'e}: End-to-End Speech Recognition for {F}on and {I}gbo",
    author = "Dossou, Bonaventure F. P.  and
      Emezue, Chris Chinenye",
    booktitle = "Proceedings of the Fifth Workshop on Widening Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.winlp-1.1",
    pages = "1--4",
    abstract = "Language is a fundamental component of human communication. African low-resourced languages have recently been a major subject of research in machine translation, and other text-based areas of NLP. However, there is still very little comparable research in speech recognition for African languages. OkwuGb{\'e} is a step towards building speech recognition systems for African low-resourced languages. Using Fon and Igbo as our case study, we build two end-to-end deep neural network-based speech recognition models. We present a state-of-the-art automatic speech recognition (ASR) model for Fon, and a benchmark ASR model result for Igbo. Our findings serve both as a guide for future NLP research for Fon and Igbo in particular, and the creation of speech recognition models for other African low-resourced languages in general. The Fon and Igbo models source code have been made publicly available. Moreover, Okwugbe, a python library has been created to make easier the process of ASR model building and training.",
}

Release history

This version

0.1.8

Dec 30, 2021

0.1.7

Dec 30, 2021

0.1.3

Dec 23, 2021

0.0.1

Jul 27, 2021

Download files

Source Distribution

okwugbe-0.1.8.tar.gz (20.0 kB)

Uploaded Dec 30, 2021 Source

Built Distribution

okwugbe-0.1.8-py3-none-any.whl (18.9 kB)

Uploaded Dec 30, 2021 Python 3

File details

Details for the file okwugbe-0.1.8.tar.gz.

File metadata

Download URL: okwugbe-0.1.8.tar.gz
Upload date: Dec 30, 2021
Size: 20.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.7.1 importlib_metadata/4.6.1 pkginfo/1.8.2 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.5

File hashes

Hashes for okwugbe-0.1.8.tar.gz
Algorithm	Hash digest
SHA256	`669d875abf6f97939f75b0dbe47ce811bba43e4d4e5a731c843e2243b3637752`
MD5	`1a8cd759ad772e2c6cd57a70fb93bdd6`
BLAKE2b-256	`138b692a1f20f16cd4c9eb421e97566e849f53c45882f7a9f0905fe3a679a5af`

File details

Details for the file okwugbe-0.1.8-py3-none-any.whl.

File metadata

Download URL: okwugbe-0.1.8-py3-none-any.whl
Upload date: Dec 30, 2021
Size: 18.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.7.1 importlib_metadata/4.6.1 pkginfo/1.8.2 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.9.5

File hashes

Hashes for okwugbe-0.1.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8d3cce6f5bbbb4b5a7383376480c0bc4c04c94bc35304e2929ab83d80e291abf`
MD5	`0e5ffc44392b005b674493c66cce0dac`
BLAKE2b-256	`caf65b8a888dcdf5405b4b00a911bd8fecc5dee1b527a294b2e54ed1697115c7`
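
A downloaded file can be checked against the SHA256 digests listed above with the standard library. The sketch below is a minimal illustration: `EXPECTED_SHA256` copies the two digests from this page, while `sha256_of` and `matches_published_hash` are hypothetical helper names.

```python
import hashlib
import os

# SHA256 digests copied from the "File hashes" tables above.
EXPECTED_SHA256 = {
    "okwugbe-0.1.8.tar.gz":
        "669d875abf6f97939f75b0dbe47ce811bba43e4d4e5a731c843e2243b3637752",
    "okwugbe-0.1.8-py3-none-any.whl":
        "8d3cce6f5bbbb4b5a7383376480c0bc4c04c94bc35304e2929ab83d80e291abf",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """Compare a downloaded file with the digest listed for its file name."""
    expected = EXPECTED_SHA256[os.path.basename(path)]
    return sha256_of(path) == expected
```

`matches_published_hash` raises `KeyError` for a file name that is not listed, which is the intended signal that no published digest is available for it.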
