A pip package for an improved perceptual audio metric
Contrastive learning-based Deep Perceptual Audio Metric (CDPAM) [Webpage]
Contrastive Learning For Perceptual Audio Similarity
Pranay Manocha, Zeyu Jin, Richard Zhang, Adam Finkelstein
This is a PyTorch implementation of our new and improved perceptual audio metric. It contains minimal code to run the metric (CDPAM).
Usage as a loss function
Minimal usage as a distance metric
The code below takes two audio files as input and returns the perceptual distance between them. For the sample files it should return a distance of approximately 0.1696. Some GPUs are non-deterministic, so the distance may vary in the least significant bits.
Installing the metric (CDPAM - perceptual audio similarity metric)
pip install cdpam
Using the metric is as simple as:
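import cdpam
# Load the pretrained CDPAM model
loss_fn = cdpam.CDPAM()
# Load the reference recording and the recording to compare against it
wav_ref = cdpam.load_audio('sample_audio/ref.wav')
wav_out = cdpam.load_audio('sample_audio/2.wav')
# Perceptual distance between the two recordings
dist = loss_fn.forward(wav_ref, wav_out)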
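Because the distance can differ slightly between GPUs, a sanity check on the sample pair is safer with a tolerance than with an exact comparison. The following is a minimal sketch, assuming `dist` holds a single value that `float()` can convert. The helper name `matches_reference` and the tolerance of 1e-3 are illustrative choices and are not part of the cdpam package.

```python
import math

# Approximate distance quoted above for sample_audio/ref.wav vs sample_audio/2.wav.
EXPECTED_DISTANCE = 0.1696


def matches_reference(dist, expected=EXPECTED_DISTANCE, abs_tol=1e-3):
    """Return True if dist is within abs_tol of the expected distance.

    Hypothetical helper: abs_tol is an assumed tolerance, chosen to absorb
    small run-to-run differences on non-deterministic hardware.
    """
    return math.isclose(float(dist), expected, rel_tol=0.0, abs_tol=abs_tol)
```

With the snippet above, `matches_reference(dist)` would be expected to hold for the sample files, while an exact check such as `dist == 0.1696` may fail on some hardware.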