
PyTorch implementation of auditory models from the MATLAB Auditory Modeling Toolbox


torch_amt - PyTorch Auditory Modeling Toolbox

License: GPL v3 · Python 3.14+ · PyTorch 2.0+


Differentiable, GPU-accelerated PyTorch implementations of computational auditory models from the MATLAB Auditory Modeling Toolbox (AMT).

Built for researchers in psychoacoustics, computational neuroscience, and audio deep learning who need:

  • 🔥 Hardware acceleration for fast batch processing
  • 📊 Differentiable models for gradient-based optimization
  • 🧩 Modular components for custom auditory pipelines
  • 🎓 Scientific fidelity to the reference MATLAB AMT implementations
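The differentiability bullet means gradients flow from any loss on a model's output back to the input waveform (or to learnable parameters). A minimal sketch of the pattern, using a plain Conv1d filterbank as a stand-in for a torch_amt front-end so the snippet is self-contained:

```python
import torch
import torch.nn as nn

# Stand-in differentiable front-end (any torch_amt model is used the same way)
frontend = nn.Conv1d(1, 31, kernel_size=512, padding=256, bias=False)

# Input waveform with gradient tracking enabled
audio = torch.randn(1, 1, 48000, requires_grad=True)
features = frontend(audio)

# Backpropagate a scalar loss through the front-end to the input signal
loss = features.pow(2).mean()
loss.backward()

print(audio.grad.shape)  # gradient has the same shape as the input
```

The same pattern enables gradient-based stimulus optimization or end-to-end training with an auditory front-end.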

📦 Installation

From PyPI

pip install torch-amt

From Source

git clone https://github.com/StefanoGiacomelli/torch_amt.git
cd torch_amt
pip install -e .

Requirements

  • Tested with Python ≥ 3.14

🚀 Quick Start

Complete Auditory Model

import torch
import torch_amt

# Load Dau et al. (1997) model
model = torch_amt.Dau1997(fs=48000)

# Process 1 second of audio
audio = torch.randn(1, 48000)  # (batch, time)
output = model(audio)

print(f"Input: {audio.shape}")
# Input: torch.Size([1, 48000])
print(f"Output: List of {len(output)} frequency channels")
# Output: List of 31 frequency channels
print(f"Each channel shape: {output[0].shape}")
# Each channel shape: torch.Size([1, 8, 48000]) - (batch, modulation_channels, time)

Custom Processing Pipeline

import torch
import torch_amt

# Build custom auditory processing chain
filterbank = torch_amt.GammatoneFilterbank(fs=48000, fc=(80, 8000))
ihc = torch_amt.IHCEnvelope(fs=48000)
adaptation = torch_amt.AdaptLoop(fs=48000)

# Process signal
audio = torch.randn(2, 48000)     # Batch of 2 signals
filtered = filterbank(audio)      # (2, 31, 48000) - 31 frequency channels
envelope = ihc(filtered)          # (2, 31, 48000) - Envelope extraction
adapted = adaptation(envelope)    # (2, 31, 48000) - Temporal adaptation

print(f"Input: {audio.shape}")
# Input: torch.Size([2, 48000])
print(f"After Gammatone filterbank: {filtered.shape}")
# After Gammatone filterbank: torch.Size([2, 31, 48000])
print(f"After IHC envelope: {envelope.shape}")
# After IHC envelope: torch.Size([2, 31, 48000])
print(f"After adaptation: {adapted.shape}")
# After adaptation: torch.Size([2, 31, 48000])

Hardware Acceleration

import torch
import torch_amt

# Check available hardware
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"MPS available: {torch.backends.mps.is_available()}")

# Move model to GPU (CUDA or MPS)
model = torch_amt.Dau1997(fs=48000)

if torch.backends.mps.is_available():
    device = 'mps'   # Apple Silicon
elif torch.cuda.is_available():
    device = 'cuda'  # NVIDIA GPU
else:
    device = 'cpu'

model = model.to(device)
print(f"Using device: {device}")

# Process on accelerated hardware
audio = torch.randn(8, 48000).to(device)
output = model(audio)

Learnable Models for Neural Networks

import torch
import torch.nn as nn
import torch_amt

class AudioClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Learnable auditory front-end
        self.auditory = torch_amt.King2019(fs=48000, learnable=True)
        self.classifier = nn.Linear(155, 10)  # 31 freqs × 5 mods = 155 → 10 classes
    
    def forward(self, audio):
        features = self.auditory(audio)     # (B, T, F, M) e.g., (4, 24000, 31, 5)
        pooled = features.mean(dim=1)       # (B, F, M) e.g., (4, 31, 5) - Pool over time
        flattened = pooled.flatten(1)       # (B, F×M) e.g., (4, 155)
        return self.classifier(flattened)   # (B, 10)

# Train end-to-end with backpropagation
model = AudioClassifier()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)

# Example forward pass
audio = torch.randn(4, 24000)  # Batch of 4 signals, 0.5 seconds @ 48kHz
logits = model(audio)  # (4, 10)
print(f"Input: {audio.shape} → Output: {logits.shape}")
# Input: torch.Size([4, 24000]) → Output: torch.Size([4, 10])
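The classifier above trains with any standard PyTorch loop. A minimal single training step, with the torch_amt.King2019 front-end replaced by a plain Conv1d stand-in so the sketch is self-contained (the pooling/flatten/linear stages mirror the class above):

```python
import torch
import torch.nn as nn

# Stand-in for AudioClassifier: Conv1d replaces the learnable auditory
# front-end; the rest follows the pool → flatten → linear head above
model = nn.Sequential(
    nn.Conv1d(1, 31, kernel_size=256, stride=128),  # stand-in front-end
    nn.AdaptiveAvgPool1d(5),                        # pool time axis to 5 bins
    nn.Flatten(1),                                  # (B, 31 × 5) = (B, 155)
    nn.Linear(155, 10),                             # 10 classes
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)

audio = torch.randn(4, 1, 24000)          # batch of 4 signals
labels = torch.randint(0, 10, (4,))       # dummy class labels

# One end-to-end step: forward, loss, backward, update
optimizer.zero_grad()
loss = nn.functional.cross_entropy(model(audio), labels)
loss.backward()
optimizer.step()
```

With `learnable=True`, the torch_amt front-end's filter parameters receive gradients and are updated by the optimizer exactly like the stand-in's convolution weights here.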

📚 Available Models

Model        | Year | Key Features                             | Use Cases
Dau1997      | 1997 | Adaptation loops, modulation filterbank  | AM detection, temporal processing
Glasberg2002 | 2002 | Specific loudness, temporal integration  | Loudness perception, hearing aids
Moore2016    | 2016 | Binaural processing, spatial smoothing   | Binaural loudness, spatial hearing
King2019     | 2019 | Broken-stick compression, FM/AM analysis | FM masking, modulation interactions
Osses2021    | 2021 | Extended temporal integration            | Speech perception, temporal resolution
Paulick2024  | 2024 | Physiological IHC, CASP framework        | Physiological modeling, cochlear implants

📖 Documentation


📊 Performance

TODO: benchmark table with per-component runtime (forward pass averaged over 10 runs, plus a full forward+backward pass) on CPU, CUDA, and MPS.


📄 License

This project follows the licensing of the original AMT and is therefore licensed under the GNU General Public License v3.0 or later (GPLv3+). See LICENSE for full details.


🙏 Acknowledgments

This work is based on the Auditory Modeling Toolbox (AMT) developed by:

  • Piotr Majdak
  • Clara Hollomey
  • Robert Baumgartner
  • ...and many contributors from the auditory research community

Reference:

Majdak, P., Hollomey, C., & Baumgartner, R. (2022). "AMT 1.x: A toolbox for reproducible research in auditory modeling." Acta Acustica, 6, 19. https://doi.org/10.1051/aacus/2022011

Official Site: https://amtoolbox.org/

Individual model implementations are based on their respective publications (see docstrings for specific citations).


Contacts

Stefano Giacomelli
Ph.D. Candidate in ICT
Department of Engineering, Information Science & Mathematics (DISIM), University of L'Aquila, Italy


📧 Email: stefano.giacomelli@graduate.univaq.it
🔗 GitHub: https://github.com/StefanoGiacomelli
🆔 ORCID: https://orcid.org/0009-0009-0438-1748
🎓 Scholar: https://scholar.google.com/citations?user=l-n0hl4AAAAJ&hl=it
💼 LinkedIn: https://www.linkedin.com/in/stefano-giacomelli-811654135

This project is funded by the Italian Ministry of University and Research under the Italian National Recovery and Resilience Plan (NRRP), project "Methods of Computational Auditory Scene Analysis and Synthesis supporting eXtended and Immersive Reality Services".


📝 Citations

If you use torch_amt in your research, please cite:

@software{giacomelli2026torch_amt,
  author = {Giacomelli, Stefano},
  title = {torch\_amt: PyTorch Auditory Modeling Toolbox},
  year = {2026},
  url = {https://github.com/StefanoGiacomelli/torch_amt},
  version = {0.1.0}
}

Also consider citing the original AMT papers.
