TensorFlow generation of SoX-style spectrograms on the GPU

These details have not been verified by PyPI

Project description

sox_tensorflow

TensorFlow implementation of SoX-style spectrogram generation that uses TensorFlow operations for GPU acceleration.

Our analysis shows:

On average, 99.81% of pixels in sox_tensorflow spectrograms match the SoX output exactly.
In every segment, pixel values stay within ±2 of the SoX output.
The small residual error is concentrated in the darkest pixels (0–10% brightness decile: ~99.3% accuracy) and almost vanishes in the brighter regions where the signals live.
Top-5 class ranks agree 100% of the time when the spectrograms are passed through the PNW-Cnet v4 model.
The model output classes with the largest mean absolute difference are BUVI and PSFL (around 0.0004).
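
For readers who want to reproduce this kind of pixel comparison on their own images, here is a minimal sketch. It assumes both spectrograms are already loaded as uint8 NumPy arrays of the same shape. The function name `pixel_agreement` and the toy arrays are illustrative and are not part of sox_tensorflow.

```python
import numpy as np


def pixel_agreement(reference: np.ndarray, candidate: np.ndarray) -> tuple[float, int]:
    """Return (fraction of exactly matching pixels, largest absolute difference)."""
    if reference.shape != candidate.shape:
        raise ValueError("images must have the same shape")
    # Widen before subtracting: uint8 arithmetic would wrap around (3 - 5 == 254).
    diff = np.abs(reference.astype(np.int16) - candidate.astype(np.int16))
    return float(np.mean(diff == 0)), int(diff.max())


# Toy 2x3 "spectrograms": one of the six pixels differs, by 2.
sox_png = np.array([[0, 10, 200], [5, 128, 255]], dtype=np.uint8)
tf_png = np.array([[0, 12, 200], [5, 128, 255]], dtype=np.uint8)
match, worst = pixel_agreement(sox_png, tf_png)
```

Casting to a signed type first matters here, because a difference of -2 on uint8 data would otherwise show up as 254.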

Pixel accuracy by brightness decile
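
A per-decile breakdown of this kind could be computed roughly as follows. This is a sketch under one assumption: that pixels are binned by the brightness of the reference (SoX) image, with decile 0 covering the darkest tenth of the 0..255 range. The binning actually used for the figure is not stated here, and `accuracy_by_decile` is a hypothetical helper.

```python
import numpy as np


def accuracy_by_decile(reference, candidate):
    """Exact-match rate per brightness decile of the reference image."""
    ref = np.asarray(reference, dtype=np.uint8)
    cand = np.asarray(candidate, dtype=np.uint8)
    # Map 0..255 onto deciles 0..9 (0 = darkest tenth of the brightness range).
    decile = (ref.astype(np.int32) * 10) // 256
    exact = ref == cand
    result = {}
    for d in range(10):
        mask = decile == d
        if mask.any():  # skip deciles with no pixels
            result[d] = float(exact[mask].mean())
    return result


# Three dark pixels (one mismatch) and two bright pixels (no mismatch).
decile_accuracy = accuracy_by_decile([0, 10, 20, 200, 250], [0, 11, 20, 200, 250])
```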

For more details see the scripts and notebooks found in the comparison folder.
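
As a rough illustration of the model-level metrics mentioned above, the sketch below computes top-5 rank agreement and the per-class mean absolute difference from two score matrices (rows are segments, columns are classes). It assumes "agreement" means the same five classes in the same order. It is not the code from the comparison folder, and the function names and scores are made up for the example.

```python
import numpy as np


def top_k_agreement(scores_a, scores_b, k=5):
    """Fraction of rows whose k highest-scoring classes match in the same order."""
    a = np.asarray(scores_a, dtype=np.float64)
    b = np.asarray(scores_b, dtype=np.float64)
    # argsort on negated scores lists class indices from highest to lowest score.
    top_a = np.argsort(-a, axis=1, kind="stable")[:, :k]
    top_b = np.argsort(-b, axis=1, kind="stable")[:, :k]
    return float(np.mean(np.all(top_a == top_b, axis=1)))


def per_class_mean_abs_diff(scores_a, scores_b):
    """Mean absolute score difference for each class, averaged over rows."""
    a = np.asarray(scores_a, dtype=np.float64)
    b = np.asarray(scores_b, dtype=np.float64)
    return np.mean(np.abs(a - b), axis=0)


# Two segments, six classes. The second row swaps the two top classes.
sox_scores = np.array([[0.9, 0.8, 0.7, 0.6, 0.5, 0.1],
                       [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]])
tf_scores = np.array([[0.9004, 0.8, 0.7, 0.6, 0.5, 0.1],
                      [0.1, 0.2, 0.3, 0.4, 0.6, 0.5]])
agreement = top_k_agreement(sox_scores, tf_scores)
class_diff = per_class_mean_abs_diff(sox_scores, tf_scores)
```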

QUICK START

import soundfile as sf
import tensorflow as tf
from sox_tensorflow import spectrogram, spectrogram_from_flac

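# From a NumPy array
samples, sr = sf.read('audio.flac', dtype='float64', always_2d=True)
samples = samples[:, 0]  # keep the first channel (mono)
# Because dest is given, the return value is the path to the saved PNG;
# pass dest=None to get the pixel tensor back instead.
png_path = spectrogram(
    audio_array=tf.constant(samples, dtype=tf.float64),
    shape=(257, 1000),
    sample_rate=sr,
    dest='spectrogram.png'
)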


# From a FLAC file directly
path = spectrogram_from_flac(
    flac_path='audio.flac',
    shape=(257, 1000),
    duration=12.0,
    segment=0,
    dest='spectrogram.png'
)

API

`spectrogram(audio_array, shape, dest, ...)`

Generates a SoX-matching spectrogram from an audio array or TensorFlow tensor.

Argument	Type	Description
`audio_array`	`tf.Tensor` or `np.ndarray`	Audio samples, float32/float64 in [-1, 1]
`shape`	`(int, int)`	Output shape as `(height, width)`. Height determines frequency resolution: DFT size = 2 × (height − 1)
`dest`	`str` or `Path`, optional	Output PNG path. If `None`, returns a `tf.Tensor`
`segment`	`int`, optional	Segment index (0-based) to extract from the audio
`segment_duration`	`float`, optional	Duration of each segment in seconds
`segment_overlap`	`float`, optional	Overlap between segments in seconds
`sample_rate`	`int`	Sample rate of the input audio in Hz
`output_sample_rate`	`int`	Sample rate for spectrogram generation (default: 8000)
`db_range`	`int`	Dynamic range in dB (default: 90)

Returns a `tf.Tensor` of pixel values (uint8, shape `(height, width)`) if `dest` is `None`, otherwise the path to the saved PNG.
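
To make the relation between `shape` and frequency resolution concrete, here is a small worked example of the formula DFT size = 2 × (height − 1). The helper names are illustrative, and the bin spacing assumes the rows span 0 Hz up to the Nyquist frequency of `output_sample_rate`.

```python
def dft_size(height: int) -> int:
    """DFT length implied by the spectrogram height, per the table above."""
    return 2 * (height - 1)


def bin_spacing_hz(height: int, output_sample_rate: int = 8000) -> float:
    """Frequency step between adjacent pixel rows (assumes rows cover 0..Nyquist)."""
    return output_sample_rate / dft_size(height)


n_fft = dft_size(257)           # 512-point DFT for a 257-pixel-high image
spacing = bin_spacing_hz(257)   # 15.625 Hz per row at the default 8000 Hz
```

With the default 8000 Hz output rate, a height of 257 therefore covers 0 to 4000 Hz in steps of 15.625 Hz.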

`spectrogram_from_flac(flac_path, shape, dest, ...)`

Convenience wrapper that loads a FLAC file and generates a spectrogram in one call. Accepts the same `shape`/`segment`/`dest` arguments as `spectrogram()`, plus:

Argument	Type	Description
`flac_path`	`str`	Path to FLAC file
`start_time`	`float`, optional	Start time in seconds
`duration`	`float`	Duration in seconds (default: 12)
`channel`	`int`	Channel to extract (default: 0)
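
The segment arguments can be pictured as a sliding window over the recording. The sketch below shows one plausible mapping from a segment index to a time range. It assumes consecutive segments advance by `segment_duration - segment_overlap` seconds, which the tables above do not confirm. `segment_bounds` and `to_sample_range` are hypothetical helpers, not library functions.

```python
def segment_bounds(segment: int, segment_duration: float, segment_overlap: float = 0.0):
    """Return (start, end) in seconds for a 0-based segment index.

    Assumption: each segment starts (duration - overlap) seconds after the previous one.
    """
    if segment < 0:
        raise ValueError("segment must be non-negative")
    if not 0 <= segment_overlap < segment_duration:
        raise ValueError("overlap must be in [0, segment_duration)")
    hop = segment_duration - segment_overlap
    start = segment * hop
    return start, start + segment_duration


def to_sample_range(start_s: float, end_s: float, sample_rate: int):
    """Convert a time range in seconds to (first, last-exclusive) sample indices."""
    return round(start_s * sample_rate), round(end_s * sample_rate)


# Third segment (index 2) of 12-second windows that overlap by 2 seconds.
start, end = segment_bounds(2, 12.0, 2.0)
first, last = to_sample_range(start, end, 8000)
```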

`load_audio(flac_path, start_time, segment, duration, channel)`

Reads a FLAC file into a `tf.Tensor`. Returns `(tensor, sample_rate)`.

NOTES

PNW-Cnet compatibility

When loading H5 models saved with older TensorFlow/Keras versions, set:

export TF_USE_LEGACY_KERAS=1

This forces TensorFlow 2.16+ to use the legacy Keras implementation, which maintains compatibility with older H5 model files.
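
If exporting the variable in the shell is inconvenient (for example in a notebook), the same flag can be set from Python. This is a sketch: the assignment has to run before TensorFlow is imported, and to our knowledge the legacy implementation also requires the separate `tf_keras` package to be installed. The model path in the comment is a placeholder.

```python
import os

# Set the flag before the first `import tensorflow` so it is in place
# when TensorFlow decides which Keras implementation to use.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

# Only import TensorFlow after this point, for example:
#   import tensorflow as tf
#   model = tf.keras.models.load_model("path/to/model.h5")  # placeholder path
```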

SoX accuracy

The implementation is designed to replicate SoX's spectrogram algorithm step by step:

Hann window with SoX-specific normalization
FFT with SoX edge handling (partial windows at start/end of signal)
dB conversion and pixel rendering matching SoX's palette
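
The three stages listed above can be illustrated with a generic NumPy sketch. It is deliberately not SoX-exact: it uses a standard symmetric Hann window, simple zero-padding for partial windows, a plain amplitude normalization, and a grayscale mapping instead of SoX's palette. The function `sketch_spectrogram` and its parameters are assumptions made for this example.

```python
import numpy as np


def sketch_spectrogram(samples, height=257, width=100, db_range=90):
    """Generic Hann/FFT/dB/pixel pipeline; NOT SoX's exact normalization or palette."""
    samples = np.asarray(samples, dtype=np.float64)
    n_fft = 2 * (height - 1)
    window = np.hanning(n_fft)  # standard symmetric Hann window
    # Evenly spaced frame starts so the frames fill `width` columns.
    starts = np.linspace(0, max(len(samples) - n_fft, 0), width).astype(int)
    columns = []
    for start in starts:
        frame = samples[start:start + n_fft]
        # Zero-pad a partial window (only happens for very short signals here).
        frame = np.pad(frame, (0, n_fft - len(frame)))
        spectrum = np.fft.rfft(frame * window)
        # Scale so a full-scale sine at a bin centre has magnitude close to 1.
        columns.append(np.abs(spectrum) * 2.0 / window.sum())
    magnitude = np.stack(columns, axis=1)  # shape (height, width)
    db = 20.0 * np.log10(np.maximum(magnitude, 1e-12))
    # Clip to [-db_range, 0] dB and scale to 0..255 grayscale.
    scaled = (np.clip(db, -db_range, 0.0) + db_range) / db_range
    pixels = np.round(scaled * 255).astype(np.uint8)
    return pixels[::-1]  # flip rows so low frequencies sit at the bottom


# One second of a 1 kHz sine at 8000 Hz.
sr = 8000
t = np.arange(sr) / sr
pixels = sketch_spectrogram(np.sin(2 * np.pi * 1000 * t))
```

With these settings the 1 kHz tone falls on DFT bin 64, so the brightest row of the flipped image is row 256 − 64 = 192.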

Resampling uses soxr (the SoX Resampler library) at its HQ quality setting; the resulting spectrograms match the SoX binary's output on 99.8%+ of pixels.

STYLE-GUIDE

The code follows PEP 8; see setup.cfg for exceptions. Compliance is checked with `pycodestyle .`

Project details

Development Status
- 3 - Alpha
Intended Audience
- Science/Research
Programming Language
Topic
- Scientific/Engineering

Release history

0.1.2 (this version): Mar 31, 2026
0.1.1: Mar 16, 2026
0.1.0: Mar 16, 2026
0.0.1: Feb 16, 2026

Download files

Source Distribution

sox_tensorflow-0.1.2.tar.gz (22.7 kB), uploaded Mar 31, 2026 (Source)

Built Distribution

sox_tensorflow-0.1.2-py3-none-any.whl (23.2 kB), uploaded Mar 31, 2026 (Python 3)

File details

Details for the file sox_tensorflow-0.1.2.tar.gz.

File metadata

Download URL: sox_tensorflow-0.1.2.tar.gz
Upload date: Mar 31, 2026
Size: 22.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for sox_tensorflow-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`1275650744889fcbe51189d3a14f245da72dee0d26eb51eee23016374cec355f`
MD5	`0e2ef1678d651f7f3d75cb44ff076d45`
BLAKE2b-256	`f0f8dda21b8b75e9df5f3eeb1dcc0b683ed951d5ebbeb81ba849b7987f80c648`

File details

Details for the file sox_tensorflow-0.1.2-py3-none-any.whl.

File metadata

Download URL: sox_tensorflow-0.1.2-py3-none-any.whl
Upload date: Mar 31, 2026
Size: 23.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for sox_tensorflow-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ddd4e6f3bd5dd5e355435929c2fcee01f54436a47f54687038231e4e013862ec`
MD5	`dd8dd89c33e5d387705843d1bfaf1c98`
BLAKE2b-256	`f286379d7bd3d72bb788a0ab8194bc87e79319f18c3b2a8b78e9fef530c3e992`
