Test exercise AST model on the ESC-50 dataset

Project description

Test implementation of the Audio Spectrogram Transformer, by Olga Slizovskaia

This repository provides a test implementation of the Audio Spectrogram Transformer described in the original paper. Please note that this implementation lacks several important details of the original paper, such as dataset normalization, data augmentation routines, and optimal hyperparameter selection. The results you obtain with the code in this repository will therefore differ substantially from those reported in the original paper.

Requirements

This repository requires a working Python 3.9 installation and uses Poetry for dependency management and packaging.
Please install Poetry following the official guidelines.

You also need to download the ESC-50 dataset and specify the path to it as the dataset_dir parameter in the hparams.py configuration file.
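Before pointing dataset_dir at a download, it can help to check that the directory looks like an ESC-50 checkout. The following is a minimal sketch, not part of this repository: it assumes the layout of the public ESC-50 repository (clips in audio/, labels and folds in meta/esc50.csv), and the function name and example path are hypothetical.

```python
from pathlib import Path

# Layout of the public ESC-50 repository: audio clips in audio/,
# labels and fold assignments in meta/esc50.csv.
EXPECTED = ("audio", "meta/esc50.csv")


def missing_esc50_parts(dataset_dir):
    """Return the expected ESC-50 paths that are absent under dataset_dir."""
    root = Path(dataset_dir).expanduser()
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    dataset_dir = "~/data/ESC-50-master"  # hypothetical path; use your own
    missing = missing_esc50_parts(dataset_dir)
    print("looks fine" if not missing else f"missing: {missing}")
```

An empty list means both expected entries were found; how hparams.py itself stores dataset_dir is not shown here, so adapt the value to that file's format.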

Installation

To install all necessary dependencies, run:

poetry env use 3.9

poetry install

Usage

We use the standard 5-fold cross-validation scheme to evaluate the classification model. The folds are defined in the dataset's meta file and are hardcoded for training. To train and evaluate the model, run:

python train.py

or

poetry run python train.py
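The cross-validation scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the code in train.py: it relies on the fold column of the ESC-50 meta file (meta/esc50.csv), uses a tiny made-up sample in place of the real 2000-row file, and the function name is hypothetical. It also ignores any separate validation split the training script may use.

```python
import csv
import io

# Tiny stand-in for meta/esc50.csv; the real file has 2000 rows and
# more columns, but filename, fold and target are the ones used here.
SAMPLE_META = """filename,fold,target
a.wav,1,0
b.wav,2,14
c.wav,3,36
d.wav,4,0
e.wav,5,19
f.wav,1,30
"""


def cross_validation_splits(rows, n_folds=5):
    """Yield (test_fold, train_rows, test_rows), holding out one fold at a time."""
    for test_fold in range(1, n_folds + 1):
        train = [r for r in rows if int(r["fold"]) != test_fold]
        test = [r for r in rows if int(r["fold"]) == test_fold]
        yield test_fold, train, test


rows = list(csv.DictReader(io.StringIO(SAMPLE_META)))
for fold, train, test in cross_validation_splits(rows):
    print(f"fold {fold}: {len(train)} train clips, {len(test)} test clips")
```

Because the folds come from the meta file rather than a random shuffle, every run evaluates on the same held-out clips, which keeps scores comparable across runs.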

Results

The best test accuracy score achieved with this model without any pretraining is 0.39 as you can see in the following plot:

[Plot: test_accuracy]

The model overfits significantly: the training loss drops as low as 1.8, while the validation and test losses only reach about 2.3.
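To put these numbers in context, they can be compared against chance-level baselines for a 50-class problem. The sketch below assumes the reported loss is a cross-entropy measured with the natural logarithm, which this README does not state; if the training script uses a different loss, the second baseline does not apply.

```python
import math

NUM_CLASSES = 50  # ESC-50 has 50 balanced classes

# Accuracy of guessing uniformly at random among the classes.
chance_accuracy = 1 / NUM_CLASSES

# Cross-entropy (in nats) of predicting a uniform distribution over classes.
uniform_loss = math.log(NUM_CLASSES)

print(f"chance accuracy: {chance_accuracy:.2f}")
print(f"loss of a uniform prediction: {uniform_loss:.2f}")
```

Under that assumption, a test accuracy of 0.39 is well above the 0.02 chance level, and a test loss of about 2.3 is below the roughly 3.91 of a uniform guess, so the model does learn something despite the gap between training and test loss.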
