Phase-preserving spectrogram encoder/decoder for high-quality audio reconstruction

phase-spectrogram

This Python package converts audio waveforms to spectrograms and back to audio. Because the spectrogram stores phase alongside magnitude, the audio can be reconstructed directly from it at high quality.

Installation

pip install phase-spectrogram

Quick Start

from phase import Phase

# Initialize with sample rate
phase = Phase(sample_rate=44100)

# Convert audio file to spectrogram image
phase.to_phase_wav('input.wav', 'output.png')

# Convert spectrogram image back to audio
phase.to_wav_png('output.png', 'reconstructed.wav')

Features

  • Phase-Preserving: Retains both magnitude and phase information, so reconstruction does not have to estimate the phase
  • High-Quality Audio: Near-lossless audio reconstruction without iterative phase-estimation algorithms
  • Multiple Sample Rates: Support for 8kHz, 11.025kHz, 16kHz, 22.05kHz, 24kHz, 32kHz, 44.1kHz, and 48kHz
  • Flexible Formats: WAV and FLAC input support
  • PNG Export: Save spectrograms as images for visualization or ML applications
  • HDR Support: Optional 16-bit per channel PNG for higher dynamic range
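
To illustrate why keeping the phase matters, here is a minimal standard-library sketch. It does not use this package: the `dft` and `idft` helpers are hypothetical, naive transforms written only for this example. The same magnitudes are inverted once with their phase and once with the phase discarded.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)).real / n
        for t in range(n)
    ]


signal = [0.0, 1.0, 0.5, -0.25, 0.0, 0.0, 0.0, 0.0]
spectrum = dft(signal)

# Split every bin into magnitude and phase, then rebuild the complex value.
magnitudes = [abs(c) for c in spectrum]
phases = [cmath.phase(c) for c in spectrum]
with_phase = idft([cmath.rect(m, p) for m, p in zip(magnitudes, phases)])

# Discard the phase (treat every bin as phase 0) and invert again.
without_phase = idft([complex(m, 0.0) for m in magnitudes])

error_with_phase = max(abs(a - b) for a, b in zip(signal, with_phase))
error_without_phase = max(abs(a - b) for a, b in zip(signal, without_phase))
print(f"max error with phase:    {error_with_phase:.2e}")
print(f"max error without phase: {error_without_phase:.2e}")
```

With the phase kept, the error stays at floating-point rounding level. With the phase dropped, the waveform is visibly different even though every magnitude is unchanged.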

Usage Examples

Basic Audio Processing

import numpy as np
from phase import Phase

# Create Phase encoder/decoder
phase = Phase(sample_rate=44100)

# Generate test audio (1 second of 440Hz sine wave)
# endpoint=False keeps the sample spacing at exactly 1/44100 s
t = np.linspace(0, 1.0, 44100, endpoint=False)
audio = np.sin(2 * np.pi * 440 * t)

# Convert to spectrogram
spectrogram = phase.to_phase(audio)

# Reconstruct audio
reconstructed = phase.from_phase(spectrogram)
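
One way to check how close `reconstructed` is to `audio` is a signal-to-noise ratio. The sketch below is not part of the package: `snr_db` is a hypothetical helper. It assumes both buffers are time-aligned and compares only their overlapping samples.

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB between a reference and its reconstruction."""
    # Assumption: the buffers are time-aligned; compare only the overlap.
    n = min(len(reference), len(estimate))
    signal_power = sum(float(r) ** 2 for r in reference[:n])
    noise_power = sum(
        (float(r) - float(e)) ** 2 for r, e in zip(reference[:n], estimate[:n])
    )
    if noise_power == 0.0:
        return math.inf  # identical buffers
    if signal_power == 0.0:
        return -math.inf  # silent reference, non-silent estimate
    return 10.0 * math.log10(signal_power / noise_power)


# Toy data standing in for an original buffer and its reconstruction.
original = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(4410)]
slightly_off = [0.99 * s for s in original]
print(f"SNR: {snr_db(original, slightly_off):.1f} dB")
```

The helper accepts any indexable sequence of numbers, so `snr_db(audio, reconstructed)` should also work on the NumPy arrays above. If the decoder pads or delays its output, align the buffers first, otherwise the figure will understate the quality.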

File Conversion

from phase import Phase

phase = Phase(sample_rate=44100)

# WAV to PNG
phase.to_phase_wav('input.wav', 'spectrogram.png')

# FLAC to PNG
phase.to_phase_flac('input.flac', 'spectrogram.png')

# PNG back to WAV
phase.to_wav_png('spectrogram.png', 'output.wav')
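
When a folder mixes WAV and FLAC files, a small dispatcher can choose between the two documented methods. This is a sketch, not part of the package: `convert_to_png` is a hypothetical helper that calls only `to_phase_wav` and `to_phase_flac` as described here.

```python
from pathlib import Path


def convert_to_png(phase, input_path, output_path=None):
    """Convert a WAV or FLAC file to a PNG spectrogram and return the PNG path."""
    source = Path(input_path)
    # Default output: same name as the input, with a .png suffix.
    target = Path(output_path) if output_path else source.with_suffix(".png")
    suffix = source.suffix.lower()
    if suffix == ".wav":
        phase.to_phase_wav(str(source), str(target))
    elif suffix == ".flac":
        phase.to_phase_flac(str(source), str(target))
    else:
        raise ValueError(f"unsupported input format: {suffix!r}")
    return str(target)
```

With a `Phase` instance, `convert_to_png(phase, 'input.flac')` would write `input.png` next to the source file.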

Advanced Configuration

from phase import Phase

# High Dynamic Range (16-bit per channel)
phase_hdr = Phase(
    sample_rate=48000,
    HDR=True,
    volume_boost=2.0,
    y_reverse=False
)

# Custom window and FFT resolution
phase_custom = Phase(
    sample_rate=44100,
    window=2560,
    resolut=8192
)

# With inverse hyperbolic sine compression
phase_ihs = Phase(
    sample_rate=44100,
    IHS=True
)
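
For background on the `IHS` option, the sketch below shows only the mathematical function, not how the package applies it internally, which is not documented here. `ihs_compress`, `ihs_expand`, and the `scale` parameter are hypothetical names for this illustration.

```python
import math


def ihs_compress(x, scale=1.0):
    """Inverse hyperbolic sine: near-linear around 0, logarithmic for large |x|."""
    return math.asinh(x / scale)


def ihs_expand(y, scale=1.0):
    """Exact inverse of ihs_compress."""
    return math.sinh(y) * scale


for value in (-1000.0, -1.0, 0.0, 0.001, 1.0, 1000.0):
    compressed = ihs_compress(value)
    restored = ihs_expand(compressed)
    print(f"{value:>10.3f} -> {compressed:>8.4f} -> {restored:>10.3f}")
```

Unlike a plain logarithm, asinh is defined at zero and for negative inputs, so signed values can be compressed without a separate sign channel.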

API Reference

Phase Class

Phase(sample_rate=None, num_freqs=None, window=1280, 
      resolut=4096, y_reverse=True, volume_boost=0.0, 
      HDR=False, IHS=False)

Parameters:

  • sample_rate (int): Audio sample rate (8000, 11025, 16000, 22050, 24000, 32000, 44100, or 48000)
  • num_freqs (int): Number of frequency bins (auto-set based on sample_rate if not provided)
  • window (int): STFT window size (default: 1280)
  • resolut (int): FFT resolution (default: 4096)
  • y_reverse (bool): Flip Y-axis in PNG images (default: True)
  • volume_boost (float): Volume boost applied during reconstruction (default: 0.0, meaning no boost)
  • HDR (bool): Use 16 bits per channel PNG (default: False = 8 bits per channel)
  • IHS (bool): Enable inverse hyperbolic sine compression (default: False)
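
To give a feel for what the `HDR` bit depth buys, the sketch below compares the worst-case rounding error of 8-bit and 16-bit codes for a value normalized to [0.0, 1.0]. It is a generic quantization illustration with hypothetical helper names. The package's actual pixel mapping is not documented here and may differ.

```python
def quantize(value, bits):
    """Map a value in [0.0, 1.0] to an integer code of the given bit depth."""
    levels = (1 << bits) - 1
    clipped = min(max(value, 0.0), 1.0)
    return round(clipped * levels)


def dequantize(code, bits):
    """Map an integer code back to a value in [0.0, 1.0]."""
    levels = (1 << bits) - 1
    return code / levels


def worst_case_error(bits, steps=10001):
    """Largest round-trip error over an evenly spaced grid of test values."""
    grid = (i / (steps - 1) for i in range(steps))
    return max(abs(dequantize(quantize(v, bits), bits) - v) for v in grid)


for bits in (8, 16):
    print(f"{bits}-bit worst-case error: {worst_case_error(bits):.2e}")
```

The error bound is half a quantization step, so moving from 8 to 16 bits per channel shrinks it by a factor of about 257.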

Methods

to_phase(audio_buffer)

Convert audio buffer to phase-preserving spectrogram.

Parameters:

  • audio_buffer (numpy.ndarray): 1D array of float64 audio samples

Returns:

  • numpy.ndarray: 2D array of shape (time_frames * num_freqs, 2)

from_phase(spectrogram)

Reconstruct audio from phase-preserving spectrogram.

Parameters:

  • spectrogram (numpy.ndarray): 2D array of shape (time_frames * num_freqs, 2)

Returns:

  • numpy.ndarray: 1D array of float64 audio samples

to_phase_wav(input_file, output_file)

Convert WAV file to PNG spectrogram.

Parameters:

  • input_file (str): Path to input WAV file
  • output_file (str): Path to output PNG file

to_phase_flac(input_file, output_file)

Convert FLAC file to PNG spectrogram.

Parameters:

  • input_file (str): Path to input FLAC file
  • output_file (str): Path to output PNG file

to_wav_png(input_file, output_file)

Convert PNG spectrogram to WAV file.

Parameters:

  • input_file (str): Path to input PNG file
  • output_file (str): Path to output WAV file

Returns:

  • int: Detected sample rate from the spectrogram

Supported Sample Rates

Sample Rate   Frequency Bins   Family
8000 Hz       768              48k
16000 Hz      768              48k
24000 Hz      768              48k
32000 Hz      768              48k
48000 Hz      768              48k
11025 Hz      836              44.1k
22050 Hz      836              44.1k
44100 Hz      836              44.1k

Note: HDR mode doubles the frequency bin count.
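
The table and the HDR note can be mirrored in a small lookup, for example to validate a sample rate before constructing `Phase`. This is a sketch with hypothetical names (`FREQUENCY_BINS`, `expected_num_freqs`, `time_frames`), none of which are part of the package. `time_frames` relies only on the documented `(time_frames * num_freqs, 2)` array shape.

```python
# Values copied from the table above.
FREQUENCY_BINS = {
    8000: 768, 16000: 768, 24000: 768, 32000: 768, 48000: 768,
    11025: 836, 22050: 836, 44100: 836,
}


def expected_num_freqs(sample_rate, hdr=False):
    """Frequency bin count for a supported sample rate, doubled in HDR mode."""
    try:
        bins = FREQUENCY_BINS[sample_rate]
    except KeyError:
        raise ValueError(f"unsupported sample rate: {sample_rate}") from None
    return bins * 2 if hdr else bins


def time_frames(num_rows, num_freqs):
    """Number of time frames in an array with time_frames * num_freqs rows."""
    frames, remainder = divmod(num_rows, num_freqs)
    if remainder:
        raise ValueError("row count is not a multiple of num_freqs")
    return frames


print(expected_num_freqs(44100))            # 836
print(expected_num_freqs(48000, hdr=True))  # 1536
print(time_frames(8360, 836))               # 10
```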

Requirements

  • Python >= 3.7
  • numpy >= 1.20.0
  • scipy >= 1.7.0
  • soundfile >= 0.10.0
  • Pillow >= 8.0.0
  • pypng >= 0.20220715.0

License

See LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Related Projects

This is a Python implementation based on the Go package gomel.
