Speech Emotion Recognition models and training using PyTorch
Vistec-AIS Speech Emotion Recognition
Speech Emotion Recognition Model and Inference using PyTorch
Installation
From PyPI
pip install vistec-ser
From source
git clone https://github.com/tann9949/vistec-ser.git
cd vistec-ser
python setup.py install
Usage
Training with THAI SER Dataset
We provide a Google Colaboratory example for training on the THAI SER dataset using this repository.
Training using provided scripts
Note that this workflow currently supports only pre-loaded features, so it may consume roughly 2 GB of additional RAM.
The dataset contains 80 studio recordings and 20 Zoom recordings. We split it into 10 folds of 10 studios each and evaluate using k-fold cross-validation. We provide two k-fold experiments: one including and one excluding the Zoom recordings. This can be configured in the config file (see examples/aisser.yaml).
To run the experiment, run the following command:
python examples/train_fold_aisser.py --config-path <path-to-config> --n-iter <number-of-iterations>
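As a rough illustration of the k-fold evaluation described above, the following sketch partitions a list of recording IDs into folds and holds out one fold at a time. It is not the repository's own split logic: the helper names `make_folds` and `cross_validation_splits`, the studio IDs, and the fold count are hypothetical and chosen only to keep the example small.

```python
from typing import Iterator, List, Sequence, Tuple


def make_folds(items: Sequence[str], n_folds: int) -> List[List[str]]:
    """Partition items into n_folds contiguous folds of near-equal size."""
    if not 1 <= n_folds <= len(items):
        raise ValueError("n_folds must be between 1 and len(items)")
    base, extra = divmod(len(items), n_folds)
    folds: List[List[str]] = []
    start = 0
    for i in range(n_folds):
        # The first `extra` folds take one additional item each.
        size = base + (1 if i < extra else 0)
        folds.append(list(items[start:start + size]))
        start += size
    return folds


def cross_validation_splits(
    folds: List[List[str]],
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train, test) pairs, holding out one fold per iteration."""
    for held_out in range(len(folds)):
        train = [
            item
            for i, fold in enumerate(folds)
            if i != held_out
            for item in fold
        ]
        yield train, list(folds[held_out])


# Hypothetical recording IDs; the real dataset uses its own naming scheme.
studio_ids = [f"studio{i:03d}" for i in range(1, 21)]
folds = make_folds(studio_ids, n_folds=4)
splits = list(cross_validation_splits(folds))

for k, (train, test) in enumerate(splits):
    print(f"fold {k}: {len(train)} train recordings, {len(test)} test recordings")
```

Each recording appears in exactly one test fold, so every recording is evaluated once by a model that never saw it during training.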
Inferencing
We also implement a FastAPI backend server as an example of deploying an SER model. To start the server, run:
cd examples
uvicorn server:app --reload
You can customize the server by modifying the inference field in example/thaiser.yaml.
Once the server is running, you can send an HTTP POST request in form-data format. The server returns JSON in the following format:
[
{
"name": <request-file-name>,
"prob": {
"neutral": <p(neu)>,
"anger": <p(ang)>,
"happiness": <p(hap)>,
"sadness": <p(sad)>
}
}, ...
]
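To show how a client might consume a response in this format, the sketch below parses a JSON body and picks the most probable emotion for each file. The sample payload, its file names, and its probabilities are invented for illustration, and `top_emotions` is a hypothetical helper rather than part of vistec-ser.

```python
import json
from typing import Dict

# Illustrative response body following the documented format.
# File names and probabilities are made up for this example.
SAMPLE_RESPONSE = """
[
  {"name": "clip_a.wav",
   "prob": {"neutral": 0.10, "anger": 0.70, "happiness": 0.15, "sadness": 0.05}},
  {"name": "clip_b.wav",
   "prob": {"neutral": 0.55, "anger": 0.05, "happiness": 0.10, "sadness": 0.30}}
]
"""


def top_emotions(response_body: str) -> Dict[str, str]:
    """Map each file name to the emotion label with the highest probability."""
    results = json.loads(response_body)
    return {
        item["name"]: max(item["prob"], key=item["prob"].get)
        for item in results
    }


predictions = top_emotions(SAMPLE_RESPONSE)
print(predictions)  # {'clip_a.wav': 'anger', 'clip_b.wav': 'neutral'}
```

In practice the response body would come from the HTTP client used to send the POST request; the parsing step stays the same.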
Author & Sponsor
Chompakorn Chaksangchaichot
Email: chompakornc_pro@vistec.ac.th