Phase-preserving spectrogram encoder/decoder for high-quality audio reconstruction

phase-spectrogram

This Python package converts audio waveforms to spectrograms and back while keeping the phase information, so audio can be reconstructed at high quality without having to estimate phase.
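To illustrate why keeping phase matters, the sketch below runs a naive DFT on a short signal and reconstructs it twice: once from magnitude and phase, once from magnitude alone. It uses only the standard library and is not the package's own transform; the helper names (`dft`, `idft`, `max_error`) are made up for this example.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def idft(bins):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(bins)
    return [
        (sum(b * cmath.exp(2j * math.pi * k * i / n) for k, b in enumerate(bins)) / n).real
        for i in range(n)
    ]


def max_error(a, b):
    """Largest absolute difference between two equally long sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


# A short test signal: two sinusoids with different starting phases.
n = 64
signal = [
    math.sin(2 * math.pi * 3 * i / n + 0.7)
    + 0.5 * math.cos(2 * math.pi * 10 * i / n - 1.2)
    for i in range(n)
]

bins = dft(signal)

# Keep magnitude and phase: the complex bins are rebuilt from both parts.
with_phase = [cmath.rect(abs(b), cmath.phase(b)) for b in bins]
# Keep magnitude only: every phase is replaced by zero.
magnitude_only = [complex(abs(b), 0.0) for b in bins]

error_with_phase = max_error(signal, idft(with_phase))
error_magnitude_only = max_error(signal, idft(magnitude_only))
```

With phase kept, the error stays at floating-point rounding level. With phase dropped, the sinusoids come back with the wrong alignment and the error is large, which is the gap that iterative phase-recovery methods normally try to close.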

Installation

pip install phase-spectrogram

Quick Start

from phase import Phase

# Initialize with sample rate
phase = Phase(sample_rate=44100)

# Convert audio file to spectrogram image
phase.to_phase_wav('input.wav', 'output.png')

# Convert spectrogram image back to audio
phase.to_wav_png('output.png', 'reconstructed.wav')

Features

  • Phase-Preserving: Retains both magnitude and phase information, so audio is reconstructed directly rather than estimated
  • High-Quality Audio: Near-lossless audio reconstruction without iterative phase-recovery algorithms
  • Multiple Sample Rates: Support for 8kHz, 11.025kHz, 16kHz, 22.05kHz, 24kHz, 32kHz, 44.1kHz, and 48kHz
  • Flexible Formats: WAV and FLAC input support
  • PNG Export: Save spectrograms as images for visualization or ML applications
  • HDR Support: Optional 16-bit per channel PNG for higher dynamic range
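As a rough illustration of what the extra bit depth in HDR mode buys, the sketch below quantizes a value to 8 and 16 bits and compares the rounding error. How the package actually maps spectrogram values to pixel values is not described here, so `quantize` and `dynamic_range_db` are hypothetical helpers showing only the general effect of bit depth.

```python
import math


def quantize(value, bits):
    """Round a value in [0.0, 1.0] to the nearest level of a `bits`-bit channel."""
    levels = (1 << bits) - 1
    return round(value * levels) / levels


def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in decibels."""
    return 20 * math.log10(1 << bits)


value = 0.123456789
error_8 = abs(quantize(value, 8) - value)
error_16 = abs(quantize(value, 16) - value)

range_8 = dynamic_range_db(8)    # about 48 dB
range_16 = dynamic_range_db(16)  # about 96 dB
```

Each extra bit adds about 6 dB, so moving from 8 to 16 bits per channel roughly doubles the dynamic range available to each stored value.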

Usage Examples

Basic Audio Processing

import numpy as np
from phase import Phase

# Create Phase encoder/decoder
phase = Phase(sample_rate=44100)

# Generate test audio (1 second of 440Hz sine wave)
# endpoint=False keeps the sample spacing at exactly 1/44100 s
t = np.linspace(0, 1.0, 44100, endpoint=False)
audio = np.sin(2 * np.pi * 440 * t)

# Convert to spectrogram
spectrogram = phase.to_phase(audio)

# Reconstruct audio
reconstructed = phase.from_phase(spectrogram)
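One way to check a round trip like the one above is to compute a signal-to-noise ratio between the input and the reconstruction. The helper below is a minimal sketch and not part of the package. It assumes the two signals are time-aligned and compares only the overlapping samples, in case the decoder pads or trims the output.

```python
import math


def snr_db(original, reconstructed):
    """Signal-to-noise ratio of a reconstruction, in decibels.

    Compares only the overlapping samples, since a decoder may pad or trim.
    """
    n = min(len(original), len(reconstructed))
    signal_power = sum(x * x for x in original[:n])
    noise_power = sum((x - y) ** 2 for x, y in zip(original[:n], reconstructed[:n]))
    if noise_power == 0:
        return math.inf
    if signal_power == 0:
        return -math.inf
    return 10 * math.log10(signal_power / noise_power)


# Stand-in data: a sine wave and a copy with a small constant error.
original = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(4410)]
noisy = [x + 0.001 for x in original]

exact_snr = snr_db(original, original)  # math.inf, the signals are identical
noisy_snr = snr_db(original, noisy)
```

The same function accepts NumPy arrays, so `snr_db(audio, reconstructed)` would work on the buffers from the example above. Any delay between the two signals lowers the result, so a low value does not by itself mean the reconstruction sounds bad.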

File Conversion

from phase import Phase

phase = Phase(sample_rate=44100)

# WAV to PNG
phase.to_phase_wav('input.wav', 'spectrogram.png')

# FLAC to PNG
phase.to_phase_flac('input.flac', 'spectrogram.png')

# PNG back to WAV
phase.to_wav_png('spectrogram.png', 'output.wav')
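When converting a mix of WAV and FLAC files, the encoder method can be chosen from the file extension. The sketch below shows one way to do that. `ENCODERS`, `encoder_name`, and `encode` are hypothetical helpers; only the method names `to_phase_wav` and `to_phase_flac` come from the package.

```python
from pathlib import Path

# Method names come from the API reference; the mapping itself is illustrative.
ENCODERS = {
    ".wav": "to_phase_wav",
    ".flac": "to_phase_flac",
}


def encoder_name(input_file):
    """Return the name of the Phase method that handles this file type."""
    suffix = Path(input_file).suffix.lower()
    try:
        return ENCODERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported input format: {input_file!r}") from None


def encode(phase, input_file, output_file):
    """Call the matching encoder on a Phase instance (hypothetical helper)."""
    return getattr(phase, encoder_name(input_file))(input_file, output_file)
```

With a `Phase` instance in hand, `encode(phase, 'input.flac', 'spectrogram.png')` would dispatch to `phase.to_phase_flac`.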

Advanced Configuration

from phase import Phase

# High Dynamic Range (16-bit per channel)
phase_hdr = Phase(
    sample_rate=48000,
    HDR=True,
    volume_boost=2.0,
    y_reverse=False
)

# Custom window and FFT resolution
phase_custom = Phase(
    sample_rate=44100,
    window=2560,
    resolut=8192
)

# With inverse hyperbolic sine compression
phase_ihs = Phase(
    sample_rate=44100,
    IHS=True
)
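To get a feel for what `window` and `resolut` mean in practice, the sketch below derives a few figures from them. It assumes both values are lengths in samples and that `resolut` is the FFT length, with each windowed frame zero-padded up to it. Those are assumptions about the package, and `stft_geometry` is a hypothetical helper.

```python
def stft_geometry(sample_rate, window=1280, resolut=4096):
    """Derive basic STFT figures, assuming both sizes are given in samples."""
    return {
        "window_ms": 1000.0 * window / sample_rate,  # time span of one window
        "bin_spacing_hz": sample_rate / resolut,     # distance between FFT bins
        "zero_padding": resolut - window,            # samples padded per frame
    }


defaults = stft_geometry(44100)
custom = stft_geometry(44100, window=2560, resolut=8192)
```

Under these assumptions, the defaults at 44.1 kHz give a window of about 29 ms and a bin spacing of about 10.8 Hz. Doubling both values, as in `phase_custom` above, doubles the window length and halves the bin spacing.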

API Reference

Phase Class

Phase(sample_rate=None, num_freqs=None, window=1280, 
      resolut=4096, y_reverse=True, volume_boost=0.0, 
      HDR=False, IHS=False)

Parameters:

  • sample_rate (int): Audio sample rate (8000, 11025, 16000, 22050, 24000, 32000, 44100, or 48000)
  • num_freqs (int): Number of frequency bins (auto-set based on sample_rate if not provided)
  • window (int): STFT window size (default: 1280)
  • resolut (int): FFT resolution (default: 4096)
  • y_reverse (bool): Flip Y-axis in PNG images (default: True)
  • volume_boost (float): Volume multiplier for reconstruction (default: 0.0 = no boost)
  • HDR (bool): Use 16 bits per channel PNG (default: False = 8 bits per channel)
  • IHS (bool): Enable inverse hyperbolic sine compression (default: False)
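For readers unfamiliar with inverse hyperbolic sine compression, the sketch below shows the general idea: `asinh` is close to linear for small values and close to logarithmic for large ones, and `sinh` undoes it exactly. Where and how the package applies it is not documented here, so the function names and the `scale` parameter are illustrative assumptions.

```python
import math


def ihs_compress(value, scale=1.0):
    """Compress a value with asinh; `scale` is an illustrative knob."""
    return math.asinh(value * scale)


def ihs_expand(compressed, scale=1.0):
    """Exact inverse of ihs_compress."""
    return math.sinh(compressed) / scale


magnitudes = [0.001, 0.1, 10.0, 1000.0]
compressed = [ihs_compress(m) for m in magnitudes]
restored = [ihs_expand(c) for c in compressed]
```

The inputs span six orders of magnitude, while the compressed values run from about 0.001 to about 7.6. That narrower spread is easier to fit into a fixed number of bits per pixel.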

Methods

to_phase(audio_buffer)

Convert audio buffer to phase-preserving spectrogram.

Parameters:

  • audio_buffer (numpy.ndarray): 1D array of float64 audio samples

Returns:

  • numpy.ndarray: 2D array of shape (time_frames * num_freqs, 2)
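Because the first dimension is `time_frames * num_freqs`, the number of time frames can be recovered by dividing the row count by the bin count. The sketch below does only that. The row ordering and the meaning of the two columns are not documented here, so it makes no attempt to reshape the data. `frame_count` is a hypothetical helper.

```python
def frame_count(rows, num_freqs):
    """Number of time frames in an array with `rows` = time_frames * num_freqs."""
    frames, remainder = divmod(rows, num_freqs)
    if remainder:
        raise ValueError(f"{rows} rows is not a multiple of {num_freqs} bins")
    return frames


frames = frame_count(836 * 120, 836)
```

In practice the call would look like `frame_count(spectrogram.shape[0], num_freqs)`, with `num_freqs` matching the value the encoder was configured with.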

from_phase(spectrogram)

Reconstruct audio from phase-preserving spectrogram.

Parameters:

  • spectrogram (numpy.ndarray): 2D array of shape (time_frames * num_freqs, 2)

Returns:

  • numpy.ndarray: 1D array of float64 audio samples

to_phase_wav(input_file, output_file)

Convert WAV file to PNG spectrogram.

Parameters:

  • input_file (str): Path to input WAV file
  • output_file (str): Path to output PNG file

to_phase_flac(input_file, output_file)

Convert FLAC file to PNG spectrogram.

Parameters:

  • input_file (str): Path to input FLAC file
  • output_file (str): Path to output PNG file

to_wav_png(input_file, output_file)

Convert PNG spectrogram to WAV file.

Parameters:

  • input_file (str): Path to input PNG file
  • output_file (str): Path to output WAV file

Returns:

  • int: Detected sample rate from the spectrogram

Supported Sample Rates

Sample Rate   Frequency Bins   Family
8000 Hz       768              48k
16000 Hz      768              48k
24000 Hz      768              48k
32000 Hz      768              48k
48000 Hz      768              48k
11025 Hz      836              44.1k
22050 Hz      836              44.1k
44100 Hz      836              44.1k

Note: HDR mode doubles the frequency bin count.
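The table and the HDR note can be turned into a small lookup, for example to validate a sample rate before building an encoder. The sketch below copies its values from the table above. `FREQ_BINS` and `expected_num_freqs` are hypothetical names, not part of the package.

```python
# Values copied from the table above.
FREQ_BINS = {
    8000: 768, 16000: 768, 24000: 768, 32000: 768, 48000: 768,
    11025: 836, 22050: 836, 44100: 836,
}


def expected_num_freqs(sample_rate, hdr=False):
    """Look up the bin count for a supported rate; HDR doubles it."""
    try:
        bins = FREQ_BINS[sample_rate]
    except KeyError:
        supported = ", ".join(str(rate) for rate in sorted(FREQ_BINS))
        raise ValueError(
            f"unsupported sample rate {sample_rate}; expected one of {supported}"
        ) from None
    return bins * 2 if hdr else bins
```

For example, `expected_num_freqs(48000, hdr=True)` gives 1536, and an unsupported rate such as 12345 raises a `ValueError` listing the supported rates.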

Requirements

  • Python >= 3.7
  • numpy >= 1.20.0
  • scipy >= 1.7.0
  • soundfile >= 0.10.0
  • Pillow >= 8.0.0
  • pypng >= 0.20220715.0

License

See LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Related Projects

This is a Python implementation based on the Go package gomel.
