S-SONDO: Lightweight audio embeddings from self-supervised knowledge distillation

S-SONDO provides compact audio models (MobileNetV3, DyMN, ERes2Net) trained via knowledge distillation from large audio foundation models (MATPAC, M2D). Extract general-purpose audio embeddings with a single function call.

Paper: S-SONDO: Self-Supervised Knowledge Distillation for General Audio Foundation Models (ICASSP 2026)

Installation

pip install ssondo

Quick Start

import torchaudio
from ssondo import get_ssondo

# Load a pretrained model (auto-downloads from Hugging Face Hub)
model = get_ssondo("matpac-mobilenetv3")

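# Load audio and convert it to mono at 32 kHz
x, sr = torchaudio.load("audio.wav")
x = x.mean(dim=0, keepdim=True)  # downmix to mono: (1, n_samples)
if sr != 32000:
    x = torchaudio.functional.resample(x, orig_freq=sr, new_freq=32000)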

# Extract embeddings
embeddings = model(x)  # (1, n_segments, 960)

Available Models

from ssondo import list_models

for name, description in list_models().items():
    print(f"{name}: {description}")

Model               Teacher   Student      Embedding Size
matpac-mobilenetv3  MATPAC++  MobileNetV3  960
matpac-dymn         MATPAC++  DyMN         960
matpac-eres2net     MATPAC++  ERes2Net     varies
m2d-mobilenetv3     M2D       MobileNetV3  960
m2d-dymn            M2D       DyMN         960
m2d-eres2net        M2D       ERes2Net     varies

Usage

Extract Embeddings

model = get_ssondo("matpac-mobilenetv3")
embeddings = model(audio)  # (batch, n_segments, emb_size)
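
The model returns one embedding per segment, so a downstream task that needs a single vector per clip has to reduce the segment axis somehow. The sketch below shows one common choice, averaging over segments. It uses a random tensor as a stand-in for the output of model(audio), so it runs without a checkpoint; whether the library offers or applies this pooling itself is not stated here.

```python
import torch

# Stand-in for the output of model(audio): (batch, n_segments, emb_size).
# Random values are used only so the snippet runs without a checkpoint.
segment_embeddings = torch.randn(2, 3, 960)

# Average over the segment axis (dim=1) to get one vector per clip.
clip_embeddings = segment_embeddings.mean(dim=1)

print(clip_embeddings.shape)  # torch.Size([2, 960])
```

Mean pooling treats every segment equally; max pooling or attention pooling are alternatives if a few segments carry most of the information.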

Get Logits Too

model = get_ssondo("matpac-mobilenetv3", return_logits=True)
embeddings, logits = model(audio)

GPU Inference

model = get_ssondo("matpac-mobilenetv3", device="cuda")
embeddings = model(audio.cuda())

Load from Local Checkpoint

model = get_ssondo("path/to/checkpoint.ckpt")

Finetuning with Frozen Backbone (Linear Probe)

import torch
from ssondo import get_ssondo

model = get_ssondo("matpac-mobilenetv3")
model.freeze_backbone()  # freeze all backbone params
model.train()

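# Add a linear classifier for your task
num_classes = 10  # placeholder: set to the number of classes in your task
head = torch.nn.Linear(model.embedding_dim, num_classes)
criterion = torch.nn.CrossEntropyLoss()  # e.g., for single-label classification

# Extract embeddings (backbone frozen, so no gradients reach it)
# `audio` and `labels` are a batch from your own data loader
emb = model.get_embeddings(audio)  # (batch, 960)
logits = head(emb)
loss = criterion(logits, labels)
loss.backward()  # gradients are computed for the head parameters only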
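
To check that a linear probe really leaves the backbone untouched, it can help to run one optimizer step and compare weights before and after. The following is a minimal sketch with hypothetical stand-ins: a tiny torch.nn.Linear plays the role of the backbone, and freezing is done by hand with requires_grad_(False), which is what freeze_backbone() is described as doing. It is not the library's implementation.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny "backbone" and a linear head.
backbone = torch.nn.Linear(16, 8)
head = torch.nn.Linear(8, 4)

# Freeze the backbone by hand so autograd skips its parameters.
for p in backbone.parameters():
    p.requires_grad_(False)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
criterion = torch.nn.CrossEntropyLoss()

x = torch.randn(5, 16)
labels = torch.randint(0, 4, (5,))

# Snapshot the weights before the training step.
backbone_before = backbone.weight.detach().clone()
head_before = head.weight.detach().clone()

loss = criterion(head(backbone(x)), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_changed = not torch.equal(backbone.weight, backbone_before)
head_changed = not torch.equal(head.weight, head_before)
print(f"backbone changed: {backbone_changed}, head changed: {head_changed}")
```

The same before/after comparison can be applied to model.backbone and your own head when working with a real checkpoint.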

Full Finetuning

model = get_ssondo("matpac-mobilenetv3")
model.train()  # all parameters trainable by default

Useful Properties

model.embedding_dim   # size of backbone embeddings (960 for matpac-mobilenetv3)
model.backbone        # the raw backbone nn.Module (e.g., MobileNetV3)

Input Requirements

  • Mono audio (single channel)
  • Sample rate: 32,000 Hz
  • Audio is internally sliced into 10-second segments and converted to 128-band log-mel spectrograms
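
Given these requirements, the number of segments to expect for a clip can be estimated from its length. The helper below is a hypothetical sketch, not part of the library. It assumes a trailing partial segment is padded and counted rather than dropped, which this document does not confirm.

```python
import math

SAMPLE_RATE = 32_000   # Hz, from the input requirements
SEGMENT_SECONDS = 10   # segment length, from the input requirements


def expected_segments(n_samples: int) -> int:
    """Estimate n_segments, ASSUMING a trailing partial segment is padded."""
    if n_samples <= 0:
        return 0
    segment_samples = SAMPLE_RATE * SEGMENT_SECONDS  # 320,000 samples
    return math.ceil(n_samples / segment_samples)


# 25 seconds of audio spans two full segments plus one partial segment.
print(expected_segments(25 * SAMPLE_RATE))  # 3
```

If the library drops the partial segment instead, the estimate would use floor division and come out one lower for clips that are not a multiple of 10 seconds.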

How It Works

get_ssondo() reads the student backbone, the preprocessing parameters, and the classification head from the checkpoint itself, so no manual configuration is needed.

When you pass a model name (e.g., "matpac-mobilenetv3"), the checkpoint is automatically downloaded from Hugging Face Hub and cached locally.

Citation

@inproceedings{eladlouni2026ssondo,
  title={S-SONDO: Self-Supervised Knowledge Distillation for General Audio Foundation Models},
  author={El Adlouni, Mohammed Ali and Quelennec, Aurian and Chouteau, Pierre and Peeters, Geoffroy and Essid, Slim},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2026}
}

License

MIT
