tfr·PyPI

Time-frequency reassigned spectrograms

Development Status
- 3 - Alpha
Intended Audience
- Developers
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- MacOS :: MacOS X
- POSIX :: Linux
Programming Language
- Python :: 2
- Python :: 3
Topic
- Multimedia :: Sound/Audio :: Analysis

Project description

Spectral audio feature extraction using time-frequency reassignment.

Besides normal spectrograms, it can compute reassigned spectrograms, transform them (e.g. to a log-frequency scale) and requantize them (e.g. to musical pitch bins). This yields good features for audio analysis or machine learning on audio data.

A reassigned spectrogram often localizes energy in the time-frequency plane more precisely than a plain spectrogram. Roughly speaking, the reassignment method uses the phase (which is normally discarded) to move each sample on the time-frequency plane to a more suitable place computed from derivatives of the phase.
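To make the idea concrete, here is a minimal NumPy-only sketch of frequency reassignment. It is not this library's implementation: it assumes the phase derivative is approximated by the phase advance between two frames offset by one sample, and the function name `instantaneous_frequencies` is hypothetical.

```python
import numpy as np

def instantaneous_frequencies(x, start, frame_size):
    """Estimate per-bin frequency (in cycles per sample) from the phase
    advance between two windowed frames that are one sample apart."""
    window = np.hanning(frame_size)
    frame_a = x[start:start + frame_size] * window
    frame_b = x[start + 1:start + 1 + frame_size] * window
    spec_a = np.fft.rfft(frame_a)
    spec_b = np.fft.rfft(frame_b)
    # Phase advance over one sample, wrapped to (-pi, pi].
    phase_step = np.angle(spec_b * np.conj(spec_a))
    return np.abs(spec_a), phase_step / (2 * np.pi)

sample_rate = 8000
frame_size = 1024
true_freq = 1003.7  # deliberately between FFT bin centres
t = np.arange(4096) / sample_rate
x = np.sin(2 * np.pi * true_freq * t)

magnitudes, inst_freqs = instantaneous_frequencies(x, 0, frame_size)
peak_bin = int(np.argmax(magnitudes))
bin_center_hz = peak_bin * sample_rate / frame_size   # what a plain spectrogram reports
reassigned_hz = inst_freqs[peak_bin] * sample_rate    # phase-based estimate
```

For this test tone the bin centre is several Hz away from the true frequency, while the phase-based estimate lands very close to it. That is the sharpening effect described above.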

This library supports reassignment in both frequency and time (each is optional). It also requantizes from the overlapping input grid to a non-overlapping output grid.
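Requantization can be pictured as a weighted 2D histogram: each reassigned sample carries a time, a frequency and an energy, and the energies are summed into non-overlapping output cells. The sketch below uses `np.histogram2d` for that. The helper name `requantize` and the example values are illustrative assumptions, not taken from the library.

```python
import numpy as np

def requantize(times, freqs, weights, time_edges, freq_edges):
    """Sum weighted samples into a non-overlapping time-frequency grid."""
    grid, _, _ = np.histogram2d(
        times, freqs, bins=[time_edges, freq_edges], weights=weights)
    return grid

# Four reassigned samples: (time in s, frequency in Hz, energy).
times = np.array([0.10, 0.12, 0.55, 0.90])
freqs = np.array([100.0, 110.0, 480.0, 120.0])
weights = np.array([1.0, 2.0, 0.5, 4.0])

# Two output frames by two frequency bands.
time_edges = np.array([0.0, 0.5, 1.0])
freq_edges = np.array([0.0, 250.0, 500.0])

grid = requantize(times, freqs, weights, time_edges, freq_edges)
```

The first two samples fall into the same cell, so their energies add up. The total energy is preserved as long as every sample lies inside the grid.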

It is a good building block to compute chromagram features (aka pitch class profiles) where pitch is transformed into pitch class by ignoring the octave. See also harmonic pitch class profiles.
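As a rough illustration of "ignoring the octave", the sketch below folds a pitchgram into 12 pitch classes. It assumes one bin per semitone, linear (not decibel) magnitudes so that summing is meaningful, and that the first bin starts a pitch class cycle. The function name `fold_to_pitch_classes` is hypothetical.

```python
import numpy as np

def fold_to_pitch_classes(pitchgram, bins_per_octave=12):
    """Sum bins that are whole octaves apart.

    pitchgram: array of shape (frame_count, bin_count)
    returns:   array of shape (frame_count, bins_per_octave)
    """
    frame_count = pitchgram.shape[0]
    chroma = np.zeros((frame_count, bins_per_octave))
    for pitch_class in range(bins_per_octave):
        # Every bins_per_octave-th bin belongs to the same pitch class.
        chroma[:, pitch_class] = pitchgram[:, pitch_class::bins_per_octave].sum(axis=1)
    return chroma

# Two frames, three octaves of semitone bins.
pitchgram = np.zeros((2, 36))
pitchgram[0, 9] = 1.0    # some pitch
pitchgram[0, 21] = 1.0   # the same pitch one octave higher
pitchgram[1, 4] = 0.5

chroma = fold_to_pitch_classes(pitchgram)
```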

Installation

pip install tfr

Or for development (all code changes will be available):

git clone https://github.com/bzamecnik/tfr.git
pip install -e tfr

Usage

Split audio signal to frames

You can read a time-domain signal from an audio file (using the soundfile library) and split it into frames for spectral processing.

import tfr
signal_frames = tfr.SignalFrames('audio.flac')

A SignalFrames instance contains the signal split into frames and some metadata useful for further processing.

The signal values are normalized to [0.0, 1.0] and the channels are converted to mono.

It is also possible to provide the signal as a numpy array.

import numpy as np
import tfr
x = np.sin(2 * np.pi * 10 * np.linspace(0, 1, 1000))
signal_frames = tfr.SignalFrames(x)

Minimal example - pitchgram from audio file

import tfr
x_pitchgram = tfr.pitchgram(tfr.SignalFrames('audio.flac'))

From audio frames it computes a reassigned pitchgram of shape (frame_count, bin_count) with values being log-magnitudes in dBFS [-120.0, 0.0]. Sensible parameters are used by default, but you can change them if you wish.

Reassigned spectrogram

Like a normal spectrogram, but sharper and requantized.

import tfr
x_spectrogram = tfr.reassigned_spectrogram(tfr.SignalFrames('audio.flac'))

Signal frames with specific parameters

frame_size - affects the FFT size; a trade-off between frequency and time resolution. Powers of two work well, e.g. 4096.
hop_size - affects the overlap between frames, which is needed because the window edges fall to zero, e.g. half of frame_size (2048).

import tfr
signal_frames = tfr.SignalFrames('audio.flac', frame_size=1024, hop_size=256)
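To show how frame_size and hop_size interact, here is a small NumPy sketch of splitting a signal into overlapping frames. It is only an illustration: the helper name `frame_signal` and the zero-padding of the tail are assumptions, and SignalFrames may handle the signal edges differently.

```python
import numpy as np

def frame_signal(x, frame_size, hop_size):
    """Split a 1D signal into frames of frame_size samples whose starts
    are hop_size samples apart; the tail is zero-padded (an assumption)."""
    frame_count = max(1, int(np.ceil((len(x) - frame_size) / hop_size)) + 1)
    padded_length = (frame_count - 1) * hop_size + frame_size
    padded = np.zeros(padded_length, dtype=float)
    padded[:len(x)] = x
    starts = np.arange(frame_count) * hop_size
    return np.stack([padded[s:s + frame_size] for s in starts])

x = np.arange(10, dtype=float)
frames = frame_signal(x, frame_size=4, hop_size=2)
# frames has shape (4, 4); consecutive frames share frame_size - hop_size = 2 samples
```

With hop_size equal to half of frame_size, every sample away from the edges is covered by two frames, which compensates for the window tapering to zero.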

General spectrogram API

The pitchgram and reassigned_spectrogram functions are just syntactic sugar for the Spectrogram class. You can use it directly to gain more control.

General usage:

x_spectrogram = tfr.Spectrogram(signal_frames).reassigned()

From one Spectrogram instance you can efficiently compute reassigned spectrograms with various parameters.

s = tfr.Spectrogram(signal_frames)
x_spectrogram_tf = s.reassigned(output_frame_size=4096)
x_spectrogram_f = s.reassigned(output_frame_size=512)

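Different window function (by default we use the Hann window):

import numpy as np
# np.blackman replaces scipy.blackman, a NumPy alias that recent SciPy releases no longer provide
x_spectrogram = tfr.Spectrogram(signal_frames, window=np.blackman).reassigned()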

Different output frame size (by default we make it the same as input hop size):

x_spectrogram = tfr.Spectrogram(signal_frames).reassigned(output_frame_size=512)

Disable reassignment of time and frequency separately:

s = tfr.Spectrogram(signal_frames)
x_spectrogram = s.reassigned(reassign_time=False, reassign_frequency=False)
x_spectrogram_t = s.reassigned(reassign_frequency=False)
x_spectrogram_f = s.reassigned(reassign_time=False)
x_spectrogram_tf = s.reassigned()

Disable decibel transform of output values:

x_spectrogram = tfr.Spectrogram(signal_frames).reassigned(magnitudes='power')

Magnitudes in the spectrogram can be transformed at the end in multiple ways given by the magnitudes parameter:

linear - energy spectrum
power - power spectrum
power_db - power spectrum in decibels, range: [-120, 0]
power_db_normalized - power spectrum in decibels normalized to range: [0, 1] (useful as a feature)
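The relation between the power, power_db and power_db_normalized options can be sketched as follows. This is an assumption-laden illustration rather than the library's code: it takes full-scale power to be 1.0 and uses hypothetical helper names.

```python
import numpy as np

def power_to_db(power, min_db=-120.0):
    """Convert power values to decibels, clipped to [min_db, 0]."""
    floor = 10 ** (min_db / 10)  # avoids log10(0)
    db = 10 * np.log10(np.maximum(power, floor))
    return np.clip(db, min_db, 0.0)

def normalize_db(db, min_db=-120.0):
    """Map decibels from [min_db, 0] linearly to [0, 1]."""
    return (db - min_db) / (0.0 - min_db)

power = np.array([1.0, 0.1, 1e-6, 0.0])
db = power_to_db(power)       # roughly [0, -10, -60, -120]
features = normalize_db(db)   # roughly [1, 0.917, 0.5, 0]
```

The normalized form keeps values in a fixed, bounded range, which is convenient as input to machine learning models.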

Use some specific transformation of the output values. LinearTransform (default) is just for normal spectrogram, PitchTransform is for pitchgram. Or you can write your own.

x_spectrogram = tfr.Spectrogram(signal_frames).reassigned(transform=LinearTransform())

x_pitchgram = tfr.Spectrogram(signal_frames).reassigned(transform=PitchTransform())

import numpy as np

class LogTransform():
    def __init__(self, bin_count=100):
        self.bin_count = bin_count

    def transform_freqs(self, X_inst_freqs, sample_rate):
        # small positive floor to avoid log10(0); the value is an assumption,
        # eps was left undefined in the original snippet
        eps = np.finfo(np.float32).eps
        X_y = np.log10(np.maximum(sample_rate * X_inst_freqs, eps))
        bin_range = (0, np.log10(sample_rate))
        return X_y, self.bin_count, bin_range

x_log_spectrogram = tfr.Spectrogram(signal_frames).reassigned(transform=LogTransform())

Pitchgram parameters

In a pitchgram the frequencies are transformed into pitches in some tuning and then quantized to bins. You can specify the tuning, the range of pitch bins and their subdivision.

tuning - instance of the Tuning class, transforms between pitch and frequency
bin_range - range of bins in pitches, where 0 = 440 Hz (A4), 12 is A5, -12 is A3, etc.
bin_division - number of bins per pitch
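The pitch convention above (0 = A4 = 440 Hz, 12 steps per octave) corresponds to a base-2 logarithm of the frequency ratio. The sketch below assumes 12-tone equal temperament and rounding to the nearest bin. The function names are hypothetical and the Tuning class may behave differently.

```python
import numpy as np

def freq_to_pitch(freq_hz, base_freq=440.0, steps_per_octave=12):
    """Pitch in steps relative to base_freq (0 = A4 when base_freq is 440 Hz)."""
    return steps_per_octave * np.log2(np.asarray(freq_hz, dtype=float) / base_freq)

def pitch_to_bin(pitch, bin_range, bin_division=1):
    """Index of the nearest bin, counted from the lower end of bin_range."""
    return np.rint((np.asarray(pitch) - bin_range[0]) * bin_division).astype(int)

pitches = freq_to_pitch([220.0, 440.0, 880.0])                   # A3, A4, A5
bins = pitch_to_bin(pitches, bin_range=(-12, 12))                # one bin per pitch
fine_bins = pitch_to_bin(pitches, bin_range=(-12, 12), bin_division=3)
```

A larger bin_division gives finer resolution within each pitch at the cost of more output bins.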

Extract features via CLI

# basic STFT spectrogram
python -m tfr.spectrogram_features audio.flac spectrogram.npz
# reassigned STFT spectrogram
python -m tfr.spectrogram_features audio.flac -t reassigned reassigned_spectrogram.npz
# reassigned pitchgram
python -m tfr.spectrogram_features audio.flac -t pitchgram pitchgram.npz

Look for other options:

python -m tfr.spectrogram_features --help
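The resulting .npz files are NumPy archives, so they can be inspected with `np.load`. The array names stored by the CLI are not documented here, so this sketch makes no assumption about them and simply lists whatever the archive contains. The helper name `describe_npz` is hypothetical, and the demo uses an in-memory archive instead of a real feature file.

```python
import io
import numpy as np

def describe_npz(source):
    """Return {array_name: shape} for every array in an .npz archive.

    source may be a file name or a seekable file-like object."""
    with np.load(source) as archive:
        return {name: archive[name].shape for name in archive.files}

# Demo with an in-memory archive; with real output you would pass
# a file name such as 'pitchgram.npz' instead.
buffer = io.BytesIO()
np.savez(buffer, demo=np.zeros((3, 5)))
buffer.seek(0)
summary = describe_npz(buffer)
```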

scikit-learn transformer

In order to extract pitchgram features within a sklearn pipeline, we can use PitchgramTransformer:

import soundfile as sf
x, fs = sf.read('audio.flac')

from tfr.signal import to_mono
from tfr.sklearn import PitchgramTransformer
ct = PitchgramTransformer(sample_rate=fs)
x_pitchgram = ct.transform(x)

# output:
#  - shape: (frame_count, bin_count)
#  - values in dBFS normalized to [0.0, 1.0]

Status

Currently it’s alpha. I’m happy to extract it from some other project into a separate repo and package it. However, the API must be completely redone to be more practical and obvious.

About

Author: Bohumír Zámečník ([@bzamecnik](http://twitter.com/bzamecnik))
License: MIT

Support the project

Need some consulting or coding work regarding audio processing, machine learning or big data? Drop me a message via email or LinkedIn. Or just say hello :).

Literature

A Unified Theory of Time-Frequency Reassignment - Kelly R. Fitz, Sean A. Fulop, Digital Signal Processing 30 September 2005
Algorithms for computing the time-corrected instantaneous frequency (reassigned) spectrogram, with applications - Sean A. Fulop, Kelly Fitz, Journal of Acoustical Society of America, Jan 2006
Time Frequency Reassignment: A Review and Analysis - Stephen W. Hainsworth, Malcolm D. Macleod, Technical Report, Cambridge University Engineering Dept.
Improving the Readability of Time-Frequency and Time-Scale Representations by the Reassignment Method - Francois Auger, Patrick Flandrin, IEEE Transactions on Signal Processing, vol. 43, no. 5, May 1995
Time–frequency reassignment: from principles to algorithms - P. Flandrin, F. Auger, E. Chassande-Mottin, CRC Press 2003
Time-frequency toolbox for Matlab, user’s guide and reference guide - F. Auger, P. Flandrin, P. Goncalves, O. Lemoine


Release history

0.2.4 (this version) - Sep 26, 2018
0.2.3 - Nov 19, 2017
0.2.2 - Nov 2, 2016
0.2.1 - Nov 2, 2016
0.2 - Oct 31, 2016
0.1 - Oct 18, 2016

