This package is written for learning adaptive resolution of time series data.

Project description

Learning differentiable temporal resolution

This work introduces DiffRes, which enables differentiable temporal resolution learning on audio spectrograms (as opposed to the common fixed hop size approach) to improve the performance of audio classification models.

DiffRes-based methods can achieve the same or better classification accuracy with 25%-75% fewer temporal dimensions at the feature level.
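
To make the 25%-75% figure concrete, here is a small arithmetic sketch of how many time steps remain after compression. The helper name `compressed_length` and the rounding rule are assumptions for illustration; the package's own rounding is not documented here and may differ.

```python
def compressed_length(in_t_dim: int, dimension_reduction_rate: float) -> int:
    """Time steps left after removing the given fraction (rounding is an assumption)."""
    if not 0.0 <= dimension_reduction_rate < 1.0:
        raise ValueError("dimension_reduction_rate must be in [0, 1)")
    return int(round(in_t_dim * (1.0 - dimension_reduction_rate)))


# A 3000-step spectrogram at the two ends of the quoted range.
for rate in (0.25, 0.75):
    print(rate, compressed_length(3000, rate))
```

With 3000 input steps, a rate of 0.25 leaves 2250 steps and a rate of 0.75 leaves 750.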

This algorithm could also be useful for compressing other time-series data, by merging non-essential time frames and preserving important ones.

DiffRes:

  1. Enables differentiable temporal resolution learning to improve the performance of audio classification models.
  2. Merges non-essential time frames while preserving important frames.
  3. Acts as a "drop-in" module between an audio spectrogram and a classifier, and can be end-to-end optimized.
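
The merging idea in point 2 can be illustrated with a deliberately simplified, non-differentiable sketch. This is not the DiffRes algorithm itself: it assumes the importance scores are already given, and the function name `merge_frames` is hypothetical. Consecutive frames are grouped so that each output frame covers roughly the same amount of score mass, so low-score frames get averaged together while high-score frames stay separate.

```python
from typing import List


def merge_frames(
    frames: List[List[float]], scores: List[float], out_len: int
) -> List[List[float]]:
    """Average consecutive frames into at most out_len bins of roughly equal score mass."""
    if len(frames) != len(scores):
        raise ValueError("frames and scores must have the same length")
    total = sum(scores)
    if out_len <= 0 or total <= 0:
        raise ValueError("out_len and the total score must be positive")

    scale = out_len / total  # after scaling, the scores sum to out_len
    bins: List[List[List[float]]] = [[] for _ in range(out_len)]
    cumulative = 0.0
    for frame, score in zip(frames, scores):
        # The bin index is the score mass accumulated before this frame.
        index = min(int(cumulative), out_len - 1)
        bins[index].append(frame)
        cumulative += score * scale

    # Average each non-empty bin feature by feature; empty bins are skipped.
    return [
        [sum(column) / len(group) for column in zip(*group)]
        for group in bins
        if group
    ]


# Two low-score frames are merged; two high-score frames are kept as they are.
frames = [[1.0], [3.0], [10.0], [20.0]]
scores = [0.5, 0.5, 1.0, 1.0]
merged = merge_frames(frames, scores, out_len=3)
print(merged)
```

In this toy input the first two frames share one output slot and are averaged, while the last two each keep their own slot. DiffRes learns the scores and performs the merge in a differentiable way, which this sketch does not attempt.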

Fun facts:

  1. Working on spectrograms with a tiny hop size, i.e. a very high temporal resolution (e.g., 1 ms), becomes computationally tractable because DiffRes selectively compresses the time dimension.
  2. The dynamic time compression of DiffRes acts as data augmentation.
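
To see why a 1 ms hop size is expensive without compression, the sketch below counts spectrogram frames for one clip. The sample rate, clip length, and window length are assumptions chosen for illustration, and the formula assumes no padding.

```python
def num_frames(num_samples: int, win_length: int, hop_length: int) -> int:
    """Number of full analysis windows that fit in the signal without padding."""
    if num_samples < win_length:
        return 0
    return 1 + (num_samples - win_length) // hop_length


SAMPLE_RATE = 16_000             # assumed sample rate in Hz
NUM_SAMPLES = 10 * SAMPLE_RATE   # assumed 10-second clip
WIN_LENGTH = 400                 # assumed 25 ms window

frames_10ms = num_frames(NUM_SAMPLES, WIN_LENGTH, hop_length=160)  # 10 ms hop
frames_1ms = num_frames(NUM_SAMPLES, WIN_LENGTH, hop_length=16)    # 1 ms hop
print(frames_10ms, frames_1ms)
```

Under these assumptions the 1 ms hop produces about ten times as many frames as the 10 ms hop, which is the cost that compressing the time dimension is meant to offset.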

Usage

Very simple: just insert DiffRes between the spectrogram and your downstream task!

First, install this algorithm:

git clone git@github.com:haoheliu/diffres-python.git
cd diffres-python
pip install -e .
# Install torch-scatter
conda install pytorch-scatter -c pyg

Then try out this example:

import torch
from pydiffres import DiffRes

# =========> Assume you have already got a spectrogram (better in log-scale)
# The spectrogram data: [Batchsize, T-steps, F-bins]
data = torch.randn(1, 3000, 128)  

model = DiffRes(
    in_t_dim=3000, # The temporal dimension of your spectrogram
    in_f_dim=128, # The frequency dimension of your spectrogram
    dimension_reduction_rate=0.75, # Fraction of the temporal dimension to remove
    learn_pos_emb=False # Set to True to make the resolution encoding learnable
)


# Use DiffRes to compress the temporal dimension
# fixed-resolution (fixed hop size) spectrogram ===========> compressed spectrogram
ret = model(data)

# 1. Add this to your loss function: ret["guide_loss"].
# 2. Use this for classification: ret["feature"].
# 3. Alternatively, use ret["avgpool"] or ret["maxpool"] for classification, with or without ret["resolution_enc"].

The ret variable is a Python dict with the following keys:

  • "score":
    • The importance score of each spectrogram frame.
  • "guide_loss":
    • A loss value that you need to optimize so that DiffRes can work.
  • "avgpool":
    • The compressed feature using avgpool aggregation.
  • "maxpool":
    • The compressed feature using maxpool aggregation.
  • "resolution_enc":
    • The resolution encoding.
  • "feature":
    • The concatenation of ret["avgpool"], ret["maxpool"], and ret["resolution_enc"].
  • "activeness":
    • A value that indicates how active DiffRes is. A higher value means DiffRes is actively distinguishing important frames.
  • "x":
    • The original input spectrogram.

You can directly use ret["feature"] for classification. Or you can build your own version from ret["avgpool"], ret["maxpool"], and ret["resolution_enc"].
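
The sketch below shows one way the two instructions above (classify on ret["feature"], add ret["guide_loss"] to the loss) might fit into a single training step. So that it runs without pydiffres installed, it uses a hypothetical `StandInDiffRes` module that only mimics the two documented keys; its internals, the output shape, and the `guide_weight` parameter are assumptions, not the package's behaviour.

```python
import torch
import torch.nn as nn


class StandInDiffRes(nn.Module):
    """Hypothetical stand-in that returns only the "feature" and "guide_loss" keys."""

    def __init__(self, in_f_dim: int, out_t_dim: int):
        super().__init__()
        self.score_net = nn.Linear(in_f_dim, 1)
        self.pool = nn.AdaptiveAvgPool1d(out_t_dim)

    def forward(self, x: torch.Tensor) -> dict:
        score = torch.sigmoid(self.score_net(x))               # [B, T, 1]
        weighted = (x * score).transpose(1, 2)                 # [B, F, T]
        feature = self.pool(weighted).transpose(1, 2)          # [B, T', F]
        guide_loss = score.mean()  # placeholder, not the real guide loss
        return {"feature": feature, "guide_loss": guide_loss}


def training_step(frontend, classifier, optimizer, spectrogram, labels, guide_weight=1.0):
    """One optimisation step on the task loss plus the weighted guide loss."""
    ret = frontend(spectrogram)
    logits = classifier(ret["feature"].flatten(start_dim=1))
    task_loss = nn.functional.cross_entropy(logits, labels)
    loss = task_loss + guide_weight * ret["guide_loss"]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
frontend = StandInDiffRes(in_f_dim=128, out_t_dim=75)
classifier = nn.Linear(75 * 128, 10)  # flattened [T' * F] features, 10 classes
params = list(frontend.parameters()) + list(classifier.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

batch = torch.randn(2, 300, 128)      # [Batchsize, T-steps, F-bins]
labels = torch.tensor([1, 7])
loss_value = training_step(frontend, classifier, optimizer, batch, labels)
print(loss_value)
```

With the real module, the stand-in would be replaced by a DiffRes instance, and the classifier input size would need to match the actual shape of ret["feature"], which should be checked rather than taken from this sketch.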

You can visualize the DiffRes output dict with:

# Visualization of DiffRes.
# "." (the current directory) is a placeholder; savepath is assumed to take a path string.
model.visualize(ret, savepath=".")

The classification pipeline

Coming soon

Cite as

Coming soon

