📦 emota_loader — Python Dataloader for EmoTa Dataset

EmoTa: A Tamil Emotional Speech Dataset (Thevakumar et al., CHiPSAL 2025) is the first open-access emotional speech corpus in Tamil, designed to capture the dialectal diversity of Sri Lankan Tamil speakers[^1].

| Statistic | Value |
|---|---|
| Utterances | 936 (from 22 speakers, 19 sentences, and 5 emotions) |
| Speakers | 22 native Sri Lankan Tamil speakers (11 male, 11 female) |
| Sentences | 19 semantically neutral sentences |
| Emotions | angry, happy, sad, fear, neutral |
| Inter-annotator agreement | Fleiss’ Kappa = 0.74 |
| Baseline F1 scores | XGBoost: 0.91, Random Forest: 0.90 |

🔧 Installation

You can install the package from PyPI using:

pip install emota_loader

The package does not include the audio files. Clone or download the EmoTa dataset separately and pass its root directory to the loader as root_dir.
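Before constructing the loader, it can help to confirm that the dataset directory is where you expect it to be. The following is a minimal sketch and not part of `emota_loader`: the helper name `count_wav_files` is hypothetical, and it assumes only that the dataset stores its audio as `.wav` files somewhere under the root directory.

```python
from pathlib import Path


def count_wav_files(root_dir: str) -> int:
    """Count .wav files under root_dir, raising if the directory is missing.

    Hypothetical helper for illustration; it is not part of emota_loader.
    """
    root = Path(root_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset root not found: {root}")
    # rglob searches recursively, so the exact folder layout does not matter.
    return sum(1 for p in root.rglob("*.wav") if p.is_file())
```

Calling `count_wav_files("path/to/EmoTa")` on a complete copy should give a number close to the utterance count reported for the dataset; a much smaller number suggests an incomplete download or a wrong path.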


🚀 Sample Usage

from emota_loader import EmoTaDataset

dataset = EmoTaDataset(root_dir="path/to/EmoTa").samples

print(f"Loaded {len(dataset)} samples")

sample = dataset[0]
print(f"  Audio Path      : {sample.audio_path}")
print(f"  Speaker ID      : {sample.speaker_id}")
print(f"  Speaker Gender  : {sample.speaker_gender}")
print(f"  Speaker Age     : {sample.speaker_age}")
print(f"  Speaker Region  : {sample.speaker_region}")
print(f"  Sentence ID     : {sample.sentence_id}")
print(f"  Transcript      : {sample.transcript}")
print(f"  Emotion         : {sample.emotion}")

Example Output

Loaded 936 samples

  Audio Path      : EmoTa/19_18_ang.wav
  Speaker ID      : 19
  Speaker Gender  : male
  Speaker Age     : 25
  Speaker Region  : northern
  Sentence ID     : 18
  Transcript      : நான் உன்னை சந்திக்க வேண்டும்.
  Emotion         : angry
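
With only 22 speakers, a random split over utterances can place the same voice in both training and test data. One common way around this is to hold out whole speakers. The sketch below is an assumption about how the loaded samples might be used, not a feature of `emota_loader`: `split_by_speaker` and the stand-in `Sample` class are hypothetical, and the code relies only on each sample exposing `speaker_id` and `emotion` as shown above.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Sample:
    """Stand-in with two of the fields shown above; real samples come from EmoTaDataset."""
    speaker_id: int
    emotion: str


def split_by_speaker(samples, test_fraction=0.2, seed=0):
    """Hold out whole speakers so no voice appears in both train and test."""
    speakers = sorted({s.speaker_id for s in samples})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train = [s for s in samples if s.speaker_id not in test_speakers]
    test = [s for s in samples if s.speaker_id in test_speakers]
    return train, test


# Made-up records purely for demonstration (not real EmoTa metadata).
demo = [Sample(spk, emo) for spk in range(1, 6) for emo in ("angry", "happy", "sad")]
train, test = split_by_speaker(demo, test_fraction=0.2)
print(Counter(s.emotion for s in train))  # per-emotion counts in the training part
```

With the real data, pass `EmoTaDataset(root_dir="path/to/EmoTa").samples` in place of `demo`.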

📄 Citation

Please cite the dataset as:

@inproceedings{thevakumar-etal-2025-emota,
  title = "{E}mo{T}a: A {T}amil Emotional Speech Dataset",
  author = "Thevakumar, Jubeerathan and Thavarasa, Luxshan and Sivatheepan, Thanikan and Kugarajah, Sajeev and Thayasivam, Uthayasanker",
  booktitle = "Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)",
  year = "2025",
  pages = "193--201",
  address = "Abu Dhabi, UAE",
  publisher = "International Committee on Computational Linguistics"
}

📘 License

Academic use only — see the EmoTa dataset license for details.


[^1]: Thevakumar, J., Thavarasa, L., et al. (2025). EmoTa: A Tamil Emotional Speech Dataset. Proceedings of CHiPSAL 2025.
