
S-SONDO: Lightweight audio embeddings from self-supervised knowledge distillation

Up to 61x smaller than teacher models, retaining up to 96% performance.

ICASSP 2026


Install

pip install ssondo

Quick Start

from ssondo import get_ssondo

model = get_ssondo()
embeddings = model(audio)  # (batch, n_segments, 960)

No preprocessing, no config files, no manual downloads. Pass raw mono audio at 32 kHz and get embeddings.

Pretrained Classifiers

Seven ready-to-use classifiers, trained on standard audio benchmarks:

model = get_ssondo(head="esc50")
logits = model(audio)  # (batch, 50)

Head              Task                    Classes
esc50             Environmental sound     50
us8k              Urban sound             10
fsd50k            Sound events            200
gtzan             Music genre             10
openmic           Instrument recognition  20
nsynth            Instrument family       11
magna-tag-a-tune  Music auto-tagging      50
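
The classifier heads return raw logits, so a small post-processing step is needed to read off a prediction. The following is a minimal sketch in plain Python, not part of the ssondo API: the helper names `softmax`, `sigmoid`, `top_k`, and `active_tags` are hypothetical. It assumes single-label heads such as esc50 are read with a softmax and multi-label tasks such as auto-tagging with a per-class sigmoid; the README does not state how each head was trained, so check before relying on either.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def top_k(logits, k=3):
    """Single-label reading: the k most probable (class_index, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def active_tags(logits, threshold=0.5):
    """Multi-label reading (assumption): indices whose sigmoid score passes the threshold."""
    return [i for i, x in enumerate(logits) if sigmoid(x) >= threshold]
```

If the model output is a PyTorch tensor (an assumption here), `logits[0].tolist()` gives the flat list of floats these helpers expect for the first clip in the batch.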

Custom Heads

# Linear
model = get_ssondo(head="linear", n_classes=10)

# MLP
model = get_ssondo(head="mlp", n_classes=10, hidden_sizes=[512, 256])

Finetuning

# Linear probing (frozen backbone)
model = get_ssondo(head="linear", n_classes=10)
model.freeze_backbone()
model.train()

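logits = model(audio)
loss = criterion(logits, labels)  # criterion: your loss, e.g. torch.nn.CrossEntropyLoss()
loss.backward()  # gradients reach only the head parameters; apply them with your optimizer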

# Full finetuning
model.unfreeze_backbone()

API at a Glance

from ssondo import get_ssondo, list_models, list_heads

model = get_ssondo()                          # default backbone
model = get_ssondo("matpac-dymn")             # specific backbone
model = get_ssondo(head="esc50")              # pretrained classifier
model = get_ssondo(head="linear", n_classes=10)  # custom head
model = get_ssondo(device="cuda")             # GPU
model = get_ssondo("path/to/checkpoint.ckpt") # local checkpoint

embeddings = model(audio)                     # (batch, n_segments, 960) for the default backbone
emb = model.get_embeddings(audio)             # (batch, 960), mean-pooled over segments
model.embedding_dim                           # 960 for the default backbone
model.backbone                                # raw nn.Module

list_models()                                 # available backbones
list_heads()                                  # available classifiers
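
To make the relation between the two embedding shapes concrete, here is a minimal sketch of mean pooling over the segment axis, written with nested lists so it runs without any dependencies. The function names are hypothetical, and whether `get_embeddings` computes exactly this average is an assumption based on the "mean-pooled" comment above.

```python
def mean_pool_segments(segment_embeddings):
    """Average an (n_segments, dim) nested list into a single (dim,) vector."""
    n_segments = len(segment_embeddings)
    if n_segments == 0:
        raise ValueError("need at least one segment")
    dim = len(segment_embeddings[0])
    return [
        sum(segment[d] for segment in segment_embeddings) / n_segments
        for d in range(dim)
    ]


def mean_pool_batch(batch):
    """Apply segment pooling to every clip: (batch, n_segments, dim) -> (batch, dim)."""
    return [mean_pool_segments(clip) for clip in batch]
```

With PyTorch tensors the same reduction would be `embeddings.mean(dim=1)`.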

Available Models

Model               Teacher   Student      Params  Embedding dim
matpac-mobilenetv3  MATPAC++  MobileNetV3  2.9M    960
matpac-dymn         MATPAC++  DyMN         8.7M    960
matpac-eres2net     MATPAC++  ERes2Net     1.4M    10240
m2d-mobilenetv3     M2D       MobileNetV3  2.9M    960
m2d-dymn            M2D       DyMN         8.7M    960
m2d-eres2net        M2D       ERes2Net     1.4M    10240
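
When choosing a backbone under a size or embedding-width constraint, the table can be filtered programmatically. The sketch below hard-codes the rows listed above; `MODELS` and `pick_backbones` are hypothetical names for illustration and are not part of the ssondo package.

```python
# Rows copied from the table above:
# (name, teacher, student, parameters in millions, embedding dimension)
MODELS = [
    ("matpac-mobilenetv3", "MATPAC++", "MobileNetV3", 2.9, 960),
    ("matpac-dymn", "MATPAC++", "DyMN", 8.7, 960),
    ("matpac-eres2net", "MATPAC++", "ERes2Net", 1.4, 10240),
    ("m2d-mobilenetv3", "M2D", "MobileNetV3", 2.9, 960),
    ("m2d-dymn", "M2D", "DyMN", 8.7, 960),
    ("m2d-eres2net", "M2D", "ERes2Net", 1.4, 10240),
]


def pick_backbones(max_params_m, teacher=None, emb_dim=None):
    """Return model names within the parameter budget, optionally filtered further."""
    return [
        name
        for name, model_teacher, _student, params_m, dim in MODELS
        if params_m <= max_params_m
        and (teacher is None or model_teacher == teacher)
        and (emb_dim is None or dim == emb_dim)
    ]
```

A returned name can then be passed to `get_ssondo`, for example `get_ssondo(pick_backbones(3.0, emb_dim=960)[0])`.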

Input

  • Mono (single-channel) audio
  • Sample rate: 32,000 Hz
  • Internally sliced into 10 s segments and converted to 128-band log-mel spectrograms
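
The input requirements above imply some simple arithmetic: at 32,000 Hz, a 10 s segment is 320,000 samples. The sketch below downmixes multi-channel audio to mono and estimates `n_segments` for a clip. The helper names are hypothetical, and how the library treats a final partial segment (padding it or dropping it) is not stated here, so both options are shown as assumptions.

```python
import math

SAMPLE_RATE = 32_000      # Hz, from the input requirements
SEGMENT_SECONDS = 10      # segment length used internally
SEGMENT_SAMPLES = SAMPLE_RATE * SEGMENT_SECONDS  # 320,000 samples


def to_mono(channels):
    """Average equal-length channels sample by sample into one channel."""
    n_channels = len(channels)
    return [sum(frame) / n_channels for frame in zip(*channels)]


def count_segments(n_samples, pad_last=True):
    """Estimate the number of 10 s segments in a clip of n_samples.

    pad_last=True assumes a trailing partial segment is padded and kept;
    pad_last=False assumes it is dropped. Which one applies is an assumption.
    """
    if n_samples <= 0:
        return 0
    if pad_last:
        return math.ceil(n_samples / SEGMENT_SAMPLES)
    return n_samples // SEGMENT_SAMPLES
```

For a 25 s clip (800,000 samples) this gives 3 segments if the last partial segment is padded and 2 if it is dropped.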

Citation

@inproceedings{eladlouni2026ssondo,
  title={S-SONDO: Self-Supervised Knowledge Distillation for General Audio Foundation Models},
  author={El Adlouni, Mohammed Ali and Quelennec, Aurian and Chouteau, Pierre and Peeters, Geoffroy and Essid, Slim},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2026}
}

License

MIT
