This package is written for learning adaptive resolution of time series data.
Learning differentiable temporal resolution
This work introduces DiffRes, which enables differentiable temporal resolution learning on audio spectrograms (as opposed to the common fixed hop size approach) to improve the performance of audio classification models.
DiffRes-based methods can achieve the same or better classification accuracy with 25%-75% fewer temporal dimensions at the feature level.
This algorithm could also be useful for compressing other time-series data, by merging non-essential time frames and preserving important ones.
DiffRes:
- Enables differentiable temporal resolution learning to improve the performance of audio classification models.
- Merges non-essential time frames while preserving important frames.
- Acts as a "drop-in" module between an audio spectrogram and a classifier, and can be end-to-end optimized.
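To make the idea of merging non-essential frames concrete, here is a toy sketch in plain Python. It is not the DiffRes algorithm, which learns frame importance and stays differentiable. The function name `merge_frames`, the fixed threshold, and the hand-written scores are all assumptions made for illustration. Each run of consecutive low-score frames is averaged into a single frame, while frames at or above the threshold are kept unchanged.

```python
from typing import List, Sequence


def merge_frames(
    frames: Sequence[Sequence[float]],
    scores: Sequence[float],
    threshold: float = 0.5,
) -> List[List[float]]:
    """Toy illustration only (not the DiffRes algorithm).

    Frames with score >= threshold are kept unchanged. Each run of
    consecutive frames with score < threshold is averaged into one frame.
    """
    if len(frames) != len(scores):
        raise ValueError("frames and scores must have the same length")

    merged: List[List[float]] = []
    run: List[Sequence[float]] = []  # current run of low-score frames

    def flush() -> None:
        # Average the pending low-score frames, bin by bin, into one frame.
        if run:
            n_bins = len(run[0])
            merged.append([sum(f[i] for f in run) / len(run) for i in range(n_bins)])
            run.clear()

    for frame, score in zip(frames, scores):
        if score >= threshold:
            flush()
            merged.append(list(frame))
        else:
            run.append(frame)
    flush()
    return merged


# Six frames with two frequency bins each; only frames 2 and 5 are "important".
frames = [[1.0, 1.0], [3.0, 3.0], [9.0, 0.0], [2.0, 4.0], [4.0, 2.0], [0.0, 9.0]]
scores = [0.1, 0.2, 0.9, 0.1, 0.3, 0.8]
compressed = merge_frames(frames, scores)
print(len(frames), "->", len(compressed))
```

In this toy input the six frames collapse to four: the two important frames survive untouched, and each pair of low-score neighbours becomes one averaged frame.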
Fun facts:
- Working on spectrograms with a tiny hop size / very high temporal resolution (e.g., 1 ms) becomes computationally tractable when DiffRes selectively compresses the time dimension.
- The dynamic time compression of DiffRes acts as data augmentation.
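As a rough feel for the numbers involved, the sketch below counts spectrogram frames for a hypothetical 10-second clip at two hop sizes and then applies several reduction rates. The helper names and the simple rounding are assumptions for illustration; the rounding DiffRes uses internally may differ, and padding and window edge effects are ignored.

```python
def num_frames(duration_s: float, hop_ms: float) -> int:
    """Approximate number of spectrogram frames (ignores padding and window edges)."""
    return round(duration_s * 1000 / hop_ms)


def compressed_frames(in_t_dim: int, dimension_reduction_rate: float) -> int:
    """Frames left after removing the given fraction of the time axis.

    Assumption: simple rounding; the rounding used by DiffRes itself may differ.
    """
    return round(in_t_dim * (1 - dimension_reduction_rate))


clip_s = 10.0  # hypothetical 10-second clip
coarse = num_frames(clip_s, hop_ms=10.0)  # 10 ms hop
fine = num_frames(clip_s, hop_ms=1.0)     # 1 ms hop

for rate in (0.25, 0.5, 0.75):
    print(rate, compressed_frames(coarse, rate), compressed_frames(fine, rate))
```

Under these assumptions the clip has 1000 frames at a 10 ms hop and 10000 frames at a 1 ms hop; removing 75% of the time dimension leaves 250 and 2500 frames respectively.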
Usage
Very simple: just insert DiffRes between the spectrogram and your downstream task!
First, install the package:
git clone git@github.com:haoheliu/diffres-python.git
cd diffres-python
pip install -e .
# Install torch-scatter
conda install pytorch-scatter -c pyg
Then try out this example:
import torch
from pydiffres import DiffRes

# =========> Assume you already have a spectrogram (preferably log-scale)
# The spectrogram data: [Batchsize, T-steps, F-bins]
data = torch.randn(1, 3000, 128)
model = DiffRes(
    in_t_dim=3000,  # The temporal dimension of your spectrogram
    in_f_dim=128,  # The frequency dimension of your spectrogram
    dimension_reduction_rate=0.75,  # The fraction of the temporal dimension to remove
    learn_pos_emb=False,  # Set to True to make the resolution encoding learnable
)
# Use DiffRes to compress the temporal dimension
# fixed-resolution/hop-size spectrogram ===========> compressed spectrogram
ret = model(data)
# 1. Add this to your loss function: ret["guide_loss"].
# 2. Use this for classification: ret["feature"].
# 3. Alternatively, use ret["avgpool"] or ret["maxpool"] for classification, with or without ret["resolution_enc"].
The ret variable is a Python dict with the following keys:
- "score":
- The importance score of each spectrogram frame.
- "guide_loss":
- A loss value that you need to optimize so that DiffRes can work.
- "avgpool":
- The compressed feature using avgpool aggregation.
- "maxpool":
- The compressed feature using maxpool aggregation.
- "resolution_enc":
- The resolution encoding.
- "feature":
- The concatenation of ret["avgpool"], ret["maxpool"], and ret["resolution_enc"].
- "activeness":
- A value that indicates how active DiffRes is. A higher value means DiffRes is more actively distinguishing important frames.
- "x":
- The original input spectrogram.
You can use ret["feature"] directly for classification, or build your own variant from ret["avgpool"], ret["maxpool"], and ret["resolution_enc"].
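The sketch below shows one way a training step might combine the classification loss with ret["guide_loss"] so that the front end and the classifier are optimized together. It is a minimal sketch under stated assumptions, not the project's classification pipeline: `training_step` and `StandInFrontEnd` are hypothetical names, and the stand-in module only mimics the "feature" and "guide_loss" keys so the example runs without pydiffres. In real use you would pass a DiffRes instance and a classifier whose input matches the shape of ret["feature"], which this README does not specify.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def training_step(diffres, classifier, optimizer, spec, labels):
    """One optimization step that trains the classifier and front end together.

    Assumptions: `diffres(spec)` returns a dict with "feature" and
    "guide_loss", and `classifier` accepts ret["feature"] as input.
    """
    optimizer.zero_grad()
    ret = diffres(spec)
    logits = classifier(ret["feature"])
    cls_loss = F.cross_entropy(logits, labels)
    # .mean() leaves a scalar unchanged and reduces a per-sample loss to a scalar.
    loss = cls_loss + ret["guide_loss"].mean()
    loss.backward()
    optimizer.step()
    return loss.item()


class StandInFrontEnd(nn.Module):
    """Placeholder so this sketch runs without pydiffres. It is NOT DiffRes.

    It average-pools the time axis by a fixed factor and returns a dict with
    the two keys the training step needs.
    """

    def __init__(self, factor: int = 4):
        super().__init__()
        self.factor = factor
        self.scale = nn.Parameter(torch.ones(1))  # dummy learnable parameter

    def forward(self, x):  # x: [batch, time, freq]
        pooled = F.avg_pool1d(x.transpose(1, 2), self.factor).transpose(1, 2)
        guide_loss = (self.scale - 1).pow(2).sum()  # dummy auxiliary loss
        return {"feature": pooled * self.scale, "guide_loss": guide_loss}


torch.manual_seed(0)
front_end = StandInFrontEnd(factor=4)
# 40 time steps pooled by 4 gives 10 steps of 8 bins each, flattened to 80.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(10 * 8, 3))
optimizer = torch.optim.Adam(
    list(front_end.parameters()) + list(classifier.parameters()), lr=1e-3
)
spec = torch.randn(2, 40, 8)  # [batch, time, freq]
labels = torch.tensor([0, 2])
loss_value = training_step(front_end, classifier, optimizer, spec, labels)
print(f"loss: {loss_value:.4f}")
```

Passing the parameters of both modules to a single optimizer is what makes the step end-to-end; whether the guide loss should be weighted relative to the classification loss is left open here.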
You can visualize the DiffRes output dict with:
# Visualization of DiffRes.
# savepath is assumed to be the output location; "." is the current directory.
model.visualize(ret, savepath=".")
The classification pipeline
Coming soon
Cite as
Coming soon
Examples