S-SONDO: Lightweight audio embeddings from self-supervised knowledge distillation

Up to 61x smaller than the teacher models while retaining up to 96% of their performance.

ICASSP 2026


Install

pip install ssondo

Quick Start

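import torch
from ssondo import get_ssondo

# Placeholder input: 10 s of mono audio at 32 kHz (the (batch, samples) shape is an assumption)
audio = torch.randn(1, 32000 * 10)

model = get_ssondo()
embeddings = model(audio)  # (batch, n_segments, 960)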

No preprocessing, no config files, no manual downloads. Pass raw mono audio at 32 kHz and get embeddings.
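If the source recording is stereo, it has to be reduced to a single channel before it matches the expected mono input. The sketch below averages the channels in plain Python; `downmix_to_mono` is a hypothetical helper, not part of ssondo, and resampling to 32 kHz is a separate step that is not shown here.

```python
def downmix_to_mono(channels):
    """Average per-channel sample lists into a single mono list.

    `channels` holds one equal-length list of floats per channel.
    Hypothetical helper for illustration; not part of ssondo.
    """
    if not channels:
        raise ValueError("expected at least one channel")
    n_samples = len(channels[0])
    if any(len(ch) != n_samples for ch in channels):
        raise ValueError("all channels must have the same length")
    n_channels = len(channels)
    # Sample-wise mean across channels
    return [sum(ch[i] for ch in channels) / n_channels for i in range(n_samples)]


left = [0.5, -0.5, 1.0]
right = [0.5, 0.5, 0.0]
mono = downmix_to_mono([left, right])
print(mono)  # [0.5, 0.0, 0.5]
```

With tensors, the same operation is a mean over the channel axis.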

Pretrained Classifiers

7 ready-to-use classifiers trained on standard audio benchmarks:

model = get_ssondo(head="esc50")
logits = model(audio)  # (batch, 50)

Head              Task                     Classes
esc50             Environmental sound      50
us8k              Urban sound              10
fsd50k            Sound events             200
gtzan             Music genre              10
openmic           Instrument recognition   20
nsynth            Instrument family        11
magna-tag-a-tune  Music auto-tagging       50
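The classifiers return raw logits, so turning them into predictions takes one more step. The sketch below shows two common conventions on a toy logit vector: softmax plus top-k for single-label tasks, and a per-class sigmoid threshold for multi-label tagging. Which convention each pretrained head expects is not stated here, so treat the choice as an assumption. The helper names are illustrative.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Independent per-class probability, as used in multi-label tagging."""
    return 1.0 / (1.0 + math.exp(-x))


def top_k(scores, k):
    """Return the k highest-scoring (class_index, score) pairs."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:k]]


logits = [2.0, 0.5, -1.0, 3.0]  # toy values standing in for one row of model output

# Single-label reading: one class per clip
probs = softmax(logits)
best = top_k(probs, 2)

# Multi-label reading: every class whose sigmoid probability reaches 0.5
tags = [i for i, x in enumerate(logits) if sigmoid(x) >= 0.5]

print([i for i, _ in best])  # [3, 0]
print(tags)                  # [0, 1, 3]
```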

Custom Heads

# Linear
model = get_ssondo(head="linear", n_classes=10)

# MLP
model = get_ssondo(head="mlp", n_classes=10, hidden_sizes=[512, 256])
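To get a feel for how much a custom head adds on top of the backbone, the sketch below counts trainable parameters for the two heads above. It assumes each head is a plain stack of fully connected layers with biases on the 960-dimensional embedding; the actual head architecture in ssondo may differ (for example, normalization layers would add parameters).

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


EMBEDDING_DIM = 960  # embedding size stated in this README

linear_params = dense_param_count([EMBEDDING_DIM, 10])
mlp_params = dense_param_count([EMBEDDING_DIM, 512, 256, 10])

print(linear_params)  # 9610
print(mlp_params)     # 625930
```

Under these assumptions, the MLP head adds roughly 0.6M parameters, a noticeable fraction of the 2.9M-parameter backbone.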

Finetuning

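import torch
from ssondo import get_ssondo

# Linear probing (frozen backbone)
model = get_ssondo(head="linear", n_classes=10)
model.freeze_backbone()
model.train()

criterion = torch.nn.CrossEntropyLoss()
# The learning rate is a placeholder; tune it for your task
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# `audio` and `labels` come from your own data loader
logits = model(audio)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()   # gradients reach only the head parameters
optimizer.step()  # so only the head is updated

# Full finetuning
model.unfreeze_backbone()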

API at a Glance

from ssondo import get_ssondo, list_models, list_heads

model = get_ssondo()                          # load model
model = get_ssondo(head="esc50")              # pretrained classifier
model = get_ssondo(head="linear", n_classes=10)  # custom head
model = get_ssondo(device="cuda")             # GPU
model = get_ssondo("path/to/checkpoint.ckpt") # local checkpoint

embeddings = model(audio)                     # (batch, n_segments, 960)
emb = model.get_embeddings(audio)             # (batch, 960) mean-pooled
model.embedding_dim                           # 960
model.backbone                                # raw nn.Module

list_heads()                                  # available classifiers
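The difference between the two embedding shapes above is a pooling step. The sketch below shows mean pooling for a single clip in plain Python, assuming the average is taken over the segment axis; `mean_pool_segments` is an illustrative helper, not a function of the library.

```python
def mean_pool_segments(segment_embeddings):
    """Collapse (n_segments, dim) embeddings into one (dim,) vector by averaging."""
    n_segments = len(segment_embeddings)
    if n_segments == 0:
        raise ValueError("need at least one segment")
    dim = len(segment_embeddings[0])
    return [
        sum(segment[d] for segment in segment_embeddings) / n_segments
        for d in range(dim)
    ]


# Toy clip: 2 segments with 3-dimensional embeddings (the real model uses 960)
clip = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
]
pooled = mean_pool_segments(clip)
print(pooled)  # [2.0, 3.0, 4.0]
```

Keeping the per-segment output is useful when the timing of events matters; the pooled vector is the simpler choice for clip-level classification or retrieval.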

Model

S-SONDO ships with matpac-mobilenetv3, a MobileNetV3 student (2.9M parameters) distilled from MATPAC++. It achieves the best downstream performance across all 7 benchmarks, reaching 96.4% of teacher performance with 61x fewer parameters. Embeddings are 960-dimensional.

Input

  • Mono (single-channel) audio
  • Sample rate: 32,000 Hz
  • Internally sliced into 10 s segments and converted to 128-band log-mel spectrograms
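The segment length above determines the `n_segments` dimension of the output. The sketch below works through the arithmetic: at 32,000 Hz a 10 s segment is 320,000 samples. How the library treats a trailing partial segment (padding it or dropping it) is not stated here, so the function exposes both options as an assumption.

```python
SAMPLE_RATE = 32_000      # Hz, from the input requirements
SEGMENT_SECONDS = 10      # segment length from the input requirements
SEGMENT_SAMPLES = SAMPLE_RATE * SEGMENT_SECONDS  # 320,000 samples


def segment_count(n_samples, pad_last=True):
    """Number of 10 s segments in a clip of n_samples samples.

    pad_last=True counts a trailing partial segment (as if it were padded);
    pad_last=False drops it. Which one ssondo does is an assumption.
    """
    if n_samples <= 0:
        return 0
    full, remainder = divmod(n_samples, SEGMENT_SAMPLES)
    if pad_last and remainder:
        return full + 1
    return full


clip_samples = 25 * SAMPLE_RATE  # a 25 s clip
print(segment_count(clip_samples))                  # 3
print(segment_count(clip_samples, pad_last=False))  # 2
```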

Citation

@inproceedings{eladlouni2026ssondo,
  title={S-SONDO: Self-Supervised Knowledge Distillation for General Audio Foundation Models},
  author={El Adlouni, Mohammed Ali and Quelennec, Aurian and Chouteau, Pierre and Peeters, Geoffroy and Essid, Slim},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2026}
}

License

MIT
