🔊 torch_amt - PyTorch Auditory Modeling Toolbox
Differentiable, GPU-accelerated PyTorch implementations of computational auditory models from the Auditory Modeling Toolbox (AMT).
Built for researchers in psychoacoustics, computational neuroscience, and audio deep learning who need:
- 🔥 Hardware acceleration for fast batch processing
- 📊 Differentiable models for gradient-based optimization
- 🧩 Modular components for custom auditory pipelines
- 🎓 Scientific adherence matching MATLAB AMT implementations
📦 Installation
From PyPI
```bash
pip install torch-amt
```
From Source
```bash
git clone https://github.com/StefanoGiacomelli/torch_amt.git
cd torch_amt
pip install -e .
```
Requirements
- Tested with Python ≥ 3.14
🚀 Quick Start
Complete Auditory Model
```python
import torch
import torch_amt

# Load Dau et al. (1997) model
model = torch_amt.Dau1997(fs=48000)

# Process 1 second of audio
audio = torch.randn(1, 48000)  # (batch, time)
output = model(audio)

print(f"Input: {audio.shape}")
# Input: torch.Size([1, 48000])
print(f"Output: List of {len(output)} frequency channels")
# Output: List of 31 frequency channels
print(f"Each channel shape: {output[0].shape}")
# Each channel shape: torch.Size([1, 8, 48000]) - (batch, modulation_channels, time)
```
Custom Processing Pipeline
```python
import torch
import torch_amt

# Build a custom auditory processing chain
filterbank = torch_amt.GammatoneFilterbank(fs=48000, fc=(80, 8000))
ihc = torch_amt.IHCEnvelope(fs=48000)
adaptation = torch_amt.AdaptLoop(fs=48000)

# Process signal
audio = torch.randn(2, 48000)     # Batch of 2 signals
filtered = filterbank(audio)      # (2, 31, 48000) - 31 frequency channels
envelope = ihc(filtered)          # (2, 31, 48000) - Envelope extraction
adapted = adaptation(envelope)    # (2, 31, 48000) - Temporal adaptation

print(f"Input: {audio.shape}")
# Input: torch.Size([2, 48000])
print(f"After Gammatone filterbank: {filtered.shape}")
# After Gammatone filterbank: torch.Size([2, 31, 48000])
print(f"After IHC envelope: {envelope.shape}")
# After IHC envelope: torch.Size([2, 31, 48000])
print(f"After adaptation: {adapted.shape}")
# After adaptation: torch.Size([2, 31, 48000])
```
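The 31 channels in this example are consistent with gammatone filters spaced 1 ERB apart over 80 Hz–8 kHz (the spacing is an assumption here, but it matches the shapes shown above). A quick sanity check with the Glasberg & Moore (1990) ERB-number formula, stdlib only:

```python
import math

def erb_number(f_hz: float) -> float:
    """ERB-rate (in Cams) at frequency f_hz, per Glasberg & Moore (1990)."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)

# 80 Hz–8 kHz spans about 30.5 ERBs, so 1-ERB spacing gives 31 filters
# (assumed spacing; it matches the 31 channels in the example above).
n_channels = math.floor(erb_number(8000) - erb_number(80)) + 1
print(n_channels)  # → 31
```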
Hardware Acceleration
```python
import torch
import torch_amt

# Check available hardware
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"MPS available: {torch.backends.mps.is_available()}")

# Move model to GPU (CUDA or MPS)
model = torch_amt.Dau1997(fs=48000)
if torch.backends.mps.is_available():
    model = model.to('mps')  # Apple Silicon
    print("Using device: mps")
elif torch.cuda.is_available():
    model = model.cuda()     # NVIDIA GPU
    print("Using device: cuda")
else:
    print("Using device: cpu")

# Process on the same device as the model
audio = torch.randn(8, 48000).to(model.gammatone_fb.fc.device)
output = model(audio)
```
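The selection logic above can be factored into a small helper. `pick_device` is a hypothetical function, not part of torch_amt; it just mirrors the priority order used in the snippet (MPS, then CUDA, then CPU):

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Mirror the selection order above: prefer MPS, then CUDA, else CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"

# Example: a machine with a CUDA GPU but no Apple Silicon backend
print(pick_device(mps_available=False, cuda_available=True))  # → cuda
```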
Learnable Models for Neural Networks
```python
import torch
import torch.nn as nn
import torch_amt

class AudioClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Learnable auditory front-end
        self.auditory = torch_amt.King2019(fs=48000, learnable=True)
        self.classifier = nn.Linear(155, 10)  # 31 freqs × 5 mods = 155 → 10 classes

    def forward(self, audio):
        features = self.auditory(audio)    # (B, T, F, M) e.g., (4, 24000, 31, 5)
        pooled = features.mean(dim=1)      # (B, F, M) e.g., (4, 31, 5) - pool over time
        flattened = pooled.flatten(1)      # (B, F×M) e.g., (4, 155)
        return self.classifier(flattened)  # (B, 10)

# Train end-to-end with backpropagation
model = AudioClassifier()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)

# Example forward pass
audio = torch.randn(4, 24000)  # Batch of 4 signals, 0.5 seconds @ 48 kHz
logits = model(audio)          # (4, 10)
print(f"Input: {audio.shape} → Output: {logits.shape}")
# Input: torch.Size([4, 24000]) → Output: torch.Size([4, 10])
```
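The `nn.Linear(155, 10)` head follows directly from the pooled feature shape: mean-pooling removes the time axis, leaving `(B, F, M)`, so the flattened input size is F × M. The bookkeeping, with the channel counts from the example above:

```python
# Shape arithmetic behind the classifier head (no torch needed):
# mean-pool over time leaves (B, F, M), flatten gives B × (F*M).
n_freq, n_mod = 31, 5            # frequency and modulation channels (from the example)
in_features = n_freq * n_mod     # input size for the linear layer
print(in_features)  # → 155
```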
📚 Available Models
| Model | Year | Key Features | Use Cases |
|---|---|---|---|
| Dau1997 | 1997 | Adaptation loops, modulation filterbank | AM detection, temporal processing |
| Glasberg2002 | 2002 | Specific loudness, temporal integration | Loudness perception, hearing aids |
| Moore2016 | 2016 | Binaural processing, spatial smoothing | Binaural loudness, spatial hearing |
| King2019 | 2019 | Broken-stick compression, FM/AM analysis | FM masking, modulation interactions |
| Osses2021 | 2021 | Extended temporal integration | Speech perception, temporal resolution |
| Paulick2024 | 2024 | Physiological IHC, CASP framework | Physiological modeling, cochlear implants |
📖 Documentation
- API Reference: see the module docstrings (each includes equations and usage examples)
- Documentation: coming soon on Read the Docs
- 🤝 Contributing: see the DEV templates in the repository
📊 Performance
TODO: benchmark table with per-component forward runtime (averaged over 10 runs) and full forward+backward timings on CPU, CUDA, and MPS.
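Until that table lands, a minimal sketch of the kind of harness it implies (stdlib only; the workload below is a stand-in for an actual model call such as `lambda: model(audio)`, and on CUDA you would additionally call `torch.cuda.synchronize()` before reading the clock):

```python
import statistics
import time

def time_forward(fn, runs: int = 10, warmup: int = 2):
    """Mean and stdev of wall-clock time (seconds) of fn() over `runs` calls."""
    for _ in range(warmup):  # warm-up calls are discarded
        fn()
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return statistics.mean(samples), statistics.stdev(samples)

# Stand-in workload; replace with a real forward pass, e.g. lambda: model(audio)
mean_s, std_s = time_forward(lambda: sum(i * i for i in range(10_000)))
print(f"{mean_s * 1e3:.3f} ± {std_s * 1e3:.3f} ms")
```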
📄 License
This project follows the original AMT license and is therefore distributed under the GNU General Public License v3.0 or later (GPLv3+).
See LICENSE for full details.
🙏 Acknowledgments
This work is based on the Auditory Modeling Toolbox (AMT) developed by:
- Piotr Majdak
- Clara Hollomey
- Robert Baumgartner
- ...and many contributors from the auditory research community
Primary Reference:
Majdak, P., Hollomey, C., & Baumgartner, R. (2022). "AMT 1.x: A toolbox for reproducible research in auditory modeling." Acta Acustica, 6, 19. https://doi.org/10.1051/aacus/2022011
Individual model implementations are based on their respective publications (see model docstrings for specific citations).
Official Site: https://amtoolbox.org/
Contacts
Author: Stefano Giacomelli
Affiliation: Ph.D. Candidate @ DISIM Department, University of L'Aquila
Email: stefano.giacomelli@graduate.univaq.it
ORCID: https://orcid.org/0009-0009-0438-1748
📝 Citations
If you use torch_amt in your research, please cite:
```bibtex
@software{giacomelli2026torch_amt,
  author  = {Giacomelli, Stefano},
  title   = {torch\_amt: PyTorch Auditory Modeling Toolbox},
  year    = {2026},
  url     = {https://github.com/StefanoGiacomelli/torch_amt},
  version = {0.1.0}
}
```
Also consider citing the original AMT papers.
File details
Details for the file torch_amt-0.1.0.tar.gz.
File metadata
- Filename: torch_amt-0.1.0.tar.gz
- Size: 218.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5a2ccb813b2732bd9d0977173df80c7e4b1bfe92c188489944caca83fa0a3130` |
| MD5 | `d2441e425498dcb7a5066c4dac125b2e` |
| BLAKE2b-256 | `3ba357cc84c2702d3996c10ded25ba864fcfcc070c6d3e23e22809560ad4c638` |
File details
Details for the file torch_amt-0.1.0-py3-none-any.whl.
File metadata
- Filename: torch_amt-0.1.0-py3-none-any.whl
- Size: 233.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `664c314dc1fc643eec9c438e4315a41ab144c4b7e9188e9d74c87361932582be` |
| MD5 | `e0a3671789ea942b2bc541a8da9e3b5a` |
| BLAKE2b-256 | `30ad34e6fbad93a4e250a3bbede29d6d2aa5303a0d693bd3a62420eb5767149a` |