AudioStretchy float32 is a Python library and CLI tool for high-quality time-stretching of audio files without changing pitch. Uses David Bryant's audio-stretch C library with Pedalboard for versatile audio I/O.

AudioStretchy float32

This is a fork of twardoch/audiostretchy and adds support for float32 audio.

AudioStretchy is a Python library and command-line interface (CLI) tool designed for high-quality time-stretching of audio files without altering their pitch.

It leverages David Bryant’s robust audio-stretch C library, which implements the Time-Domain Harmonic Scaling (TDHS) algorithm, particularly effective for speech. For versatile audio file handling (WAV, MP3, FLAC, OGG, etc.) and resampling, AudioStretchy integrates Spotify's Pedalboard library.

Who is this for?

AudioStretchy is aimed at:

  • Musicians and Music Producers: To adjust the tempo of backing tracks, samples, or entire songs.
  • Audio Engineers: For post-production tasks requiring timing adjustments without pitch artifacts.
  • Podcast and Video Editors: To fit voiceovers or audio segments into specific time slots.
  • Software Developers: Who need to integrate audio time-stretching capabilities into their Python applications.
  • Researchers and Hobbyists: Exploring audio processing techniques.

Why is it useful?

Time-stretching audio without affecting pitch is a common need in audio production. AudioStretchy provides:

  • High-Quality Results: The TDHS algorithm is known for producing natural-sounding results, especially with speech.
  • Ease of Use: Simple CLI and Python API make it accessible for various workflows.
  • Format Flexibility: Supports a wide range of common audio formats thanks to Pedalboard.
  • Cross-Platform Compatibility: Works on Windows, macOS, and Linux.

Features

  • High-Quality Time Stretching: Utilizes David Bryant's audio-stretch C library (TDHS algorithm).
  • Silence-Aware Stretching: Supports separate stretching ratios for gaps/silence via the gap_ratio parameter (Note: The Python wrapper currently passes this to the C library; however, effective silence-specific stretching relies on the C library's internal segmentation or future Python-side pre-segmentation logic).
  • Broad Audio Format Support: Reads and writes numerous audio formats (WAV, MP3, FLAC, OGG, AIFF, etc.) using the Pedalboard library.
  • Resampling: Supports audio resampling, also via Pedalboard.
  • Adjustable Parameters: Fine-tune stretching with parameters like frequency limits for period detection, buffer sizes, and silence thresholds.
  • Cross-Platform: Includes pre-compiled C libraries for Windows, macOS (x86_64, arm64), and Linux.
  • Simple CLI and Python API.

Demo

Below are links to a short audio file (as WAV and MP3), together with the same file stretched at a ratio of 1.2 (20% longer, so playback sounds slower):

Input        Stretched (Ratio 1.2)
audio.wav    audio-1.2.wav
audio.mp3    audio-1.2.mp3

Installation

Standard Installation

AudioStretchy includes a C extension that provides the core TDHS algorithm. Pre-compiled wheels are provided for Windows, macOS, and Linux, making installation straightforward via pip:

python3 -m pip install audiostretchy-f32

This command installs audiostretchy-f32 along with its key dependencies:

  • numpy: For numerical operations.
  • pedalboard: For reading/writing various audio formats and for resampling.
  • fire: For the command-line interface.

Compatibility note: This fork intentionally keeps the Python import path as audiostretchy for compatibility with existing code. Installing both upstream audiostretchy and this fork (audiostretchy-f32) into the same environment is not recommended because they share the same import package name.

Python version support note: Python 3.14 is temporarily excluded from this fork's official support matrix and release validation. On GitHub-hosted Ubuntu runners, importing pedalboard under Python 3.14 currently crashes with Illegal instruction (SIGILL) before AudioStretchy's own code runs. Until that upstream binary compatibility issue is resolved, this project officially supports Python 3.11 to 3.13 for automated testing and release publishing.

Note on Pedalboard Dependencies (FFmpeg): For pedalboard to support a wide range of audio formats (especially compressed ones like MP3, M4A, OGG), it relies on system libraries like FFmpeg. If you encounter issues opening or saving specific file types, ensure FFmpeg is installed and accessible in your system's PATH.

  • macOS (using Homebrew): brew install ffmpeg
  • Linux (Debian/Ubuntu): sudo apt-get install ffmpeg
  • Windows: Download FFmpeg from the official website, extract it, and add its bin directory to your system's PATH environment variable.
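
Before debugging a file that fails to open or save, it can help to confirm that an ffmpeg executable is actually visible on PATH. The following is a minimal sketch using only the standard library; the helper name ffmpeg_on_path is hypothetical and not part of AudioStretchy or Pedalboard.

```python
import shutil


def ffmpeg_on_path() -> bool:
    """Return True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_on_path():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found on PATH; some compressed formats may fail")
```

This only checks that the executable is discoverable; it does not verify which codecs that FFmpeg build supports.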

Development Installation

To install the development version from the repository for contributing or testing:

git clone https://github.com/twardoch/audiostretchy.git
cd audiostretchy
git submodule update --init --recursive # To fetch the audio-stretch C library source
python3 -m pip install -e ".[testing]" # Installs in editable mode with testing dependencies (quotes keep shells such as zsh from expanding the brackets)

If you modify the C code in vendors/stretch/stretch.c, you will need to recompile the C library. This typically involves:

  1. Having a C compiler installed (GCC/Clang on Linux/macOS, MSVC on Windows).
  2. Manually compiling vendors/stretch/stretch.c into the appropriate shared library format (_stretch.so on Linux, _stretch.dylib on macOS, _stretch.dll on Windows).
  3. Placing the compiled library into the correct directory within src/audiostretchy/interface/ (e.g., src/audiostretchy/interface/linux/). The CI workflow (.github/workflows/ci.yaml) handles this for official releases.

Usage

Command-Line Interface (CLI)

The audiostretchy command allows you to stretch audio files directly from your terminal.

Syntax:

audiostretchy INPUT_FILE OUTPUT_FILE [FLAGS]

Positional Arguments:

  • INPUT_FILE: Path to the input audio file (e.g., input.wav, song.mp3).
  • OUTPUT_FILE: Path to save the processed audio file (e.g., output_stretched.wav).

Optional Flags (TDHS Parameters):

  • -r, --ratio=FLOAT: The stretch ratio. Values > 1.0 extend audio (slower playback), < 1.0 shorten audio (faster playback). Default: 1.0 (no change).
  • -g, --gap_ratio=FLOAT: Stretch ratio specifically for silence or gaps in the audio. Default: 0.0 (which means use the main ratio for gaps). Note: The underlying C library may use this if it performs internal silence detection, but the Python wrapper does not currently implement pre-segmentation based on this.
  • -u, --upper_freq=INT: Upper frequency limit for period detection in Hz. Affects how the algorithm identifies fundamental frequencies. Default: 333.
  • -l, --lower_freq=INT: Lower frequency limit for period detection in Hz. Default: 55.
  • -b, --buffer_ms=FLOAT: Buffer size in milliseconds, potentially for silence detection logic. Default: 25. (Note: Primarily relevant if advanced gap handling is implemented/utilized).
  • -t, --threshold_gap_db=FLOAT: Silence threshold in dB for gap detection. Default: -40. (Note: Primarily relevant if advanced gap handling is implemented/utilized).
  • -d, --double_range=BOOL: Use an extended ratio range (0.25-4.0 instead of the default 0.5-2.0). Set to True or False. Default: False.
  • -f, --fast_detection=BOOL: Enable a faster (but potentially lower quality) period detection method. Set to True or False. Default: False.
  • -n, --normal_detection=BOOL: Force normal period detection (this can override fast detection if the sample rate is high, depending on C library logic). Set to True or False. Default: False.
  • -s, --sample_rate=INT: Target sample rate in Hz for resampling the output. If 0 or omitted, the output keeps the sample rate of the input (time-stretching itself does not change the sample rate). Default: 0 (no resampling).

Example: To stretch input.mp3 to 1.2 times its original duration (slower playback) and save it as output_slow.wav with a 44100 Hz sample rate:

audiostretchy input.mp3 output_slow.wav --ratio 1.2 --sample_rate 44100
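
To make the meaning of --ratio and --double_range concrete, here is a small arithmetic sketch. It assumes the output duration is approximately the input duration multiplied by the ratio, and it uses the ranges listed above (0.5-2.0 by default, 0.25-4.0 with the extended range). The function names are hypothetical, and raising an error for out-of-range ratios is a choice made for this sketch, not a description of how the tool itself reacts.

```python
def ratio_limits(double_range: bool = False) -> tuple[float, float]:
    """Return the (min, max) stretch ratio for the chosen range."""
    return (0.25, 4.0) if double_range else (0.5, 2.0)


def expected_duration(
    input_seconds: float, ratio: float, double_range: bool = False
) -> float:
    """Estimate the output duration in seconds for a given stretch ratio."""
    low, high = ratio_limits(double_range)
    if not low <= ratio <= high:
        raise ValueError(f"ratio {ratio} is outside the range {low}-{high}")
    return input_seconds * ratio


if __name__ == "__main__":
    # A 10-second clip stretched with --ratio 1.2 becomes about 12 seconds.
    print(expected_duration(10.0, 1.2))
    # Ratios above 2.0 need the extended range in this sketch.
    print(expected_duration(10.0, 3.0, double_range=True))
```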

Python API

AudioStretchy can be used programmatically within your Python scripts.

Simple Function Call: The stretch_audio function provides a quick way to process files.

from audiostretchy.stretch import stretch_audio

stretch_audio(
    input_path="path/to/your/input.mp3",
    output_path="path/to/your/output_stretched.wav",
    ratio=0.8,  # Shorten audio to 80% of its original duration (faster playback)
    sample_rate=22050, # Resample output to 22050 Hz
    upper_freq=300, # Adjust upper frequency for period detection
    fast_detection=True # Use faster algorithm
)

print("Audio stretching complete!")
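
For processing several files with the same ratio, the call above can be wrapped in a loop. The sketch below is one possible way to do that. The helper batch_stretch and its output naming scheme (borrowed from the demo files, e.g. audio-1.2.wav) are hypothetical, and the stretch function is passed in as an argument so the helper does not depend on AudioStretchy being installed. It assumes the callable accepts the input_path, output_path, and ratio keywords shown above, and that the output directory already exists.

```python
from pathlib import Path
from typing import Callable, Iterable


def batch_stretch(
    inputs: Iterable[str],
    output_dir: str,
    ratio: float,
    stretch_fn: Callable[..., object],
) -> list[Path]:
    """Call stretch_fn once per input file and return the output paths."""
    out_dir = Path(output_dir)
    written = []
    for src in inputs:
        src_path = Path(src)
        # Name outputs like the demo files: "audio.wav" -> "audio-1.2.wav".
        dst = out_dir / f"{src_path.stem}-{ratio}{src_path.suffix}"
        stretch_fn(input_path=str(src_path), output_path=str(dst), ratio=ratio)
        written.append(dst)
    return written


# Usage (requires the package to be installed):
#   from audiostretchy.stretch import stretch_audio
#   batch_stretch(["a.wav", "b.mp3"], "stretched", 1.2, stretch_audio)
```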

Using the AudioStretch Class: For more control, or if working with audio data in memory (e.g., from BytesIO objects), use the AudioStretch class.

from audiostretchy.stretch import AudioStretch
from io import BytesIO

# Initialize the processor
processor = AudioStretch()

# --- Example 1: Processing files ---
processor.open("input.flac") # Pedalboard handles opening various formats

processor.stretch(
    ratio=1.1,          # Make 10% longer (slower playback)
    # gap_ratio=1.5,    # Stretch silence even more (see note on gap_ratio effectiveness)
    upper_freq=350,
    lower_freq=60,
    double_range=True   # Allow ratios like 0.25 or 4.0
)

# Optional: Resample the processed audio
processor.resample(target_framerate=48000)

processor.save("processed_output.ogg") # Pedalboard infers format from extension
# OR: processor.save("processed_output.custom", output_format="ogg")


# --- Example 2: Processing from BytesIO ---
# Assuming `input_audio_bytes` is a bytes object containing audio data (e.g., read from a stream)
# and `input_format` is known (e.g., 'wav', 'mp3')

# input_audio_bytes = b"..." # Your audio data
# input_format = "wav"
#
# input_file_like_object = BytesIO(input_audio_bytes)
# processor.open(file=input_file_like_object, format=input_format) # `format` might be needed if not inferable
#
# processor.stretch(ratio=1.5)
#
# output_file_like_object = BytesIO()
# processor.save(file=output_file_like_object, format="mp3") # Specify output format
#
# stretched_audio_bytes = output_file_like_object.getvalue()
# with open("output_from_bytesio.mp3", "wb") as f:
#     f.write(stretched_audio_bytes)

Technical Details

How it Works

AudioStretchy operates through several key stages:

  1. Audio Input/Output & Resampling (via Pedalboard):

    • When an input file is provided (e.g., MP3, WAV, FLAC), pedalboard is used to decode it into raw audio samples (specifically, a NumPy array of float32 values). It also reads metadata like sample rate and channel count.
    • If resampling is requested, pedalboard handles this efficiently.
    • After processing, pedalboard encodes the modified audio samples back into the desired output format.
  2. Core Time-Stretching (via audio-stretch C library & TDHSAudioStretch wrapper):

    • The raw audio samples (float32) obtained from pedalboard are converted to 16-bit integers (int16), as the C library expects this format. If the audio is stereo, channels are interleaved (L, R, L, R...).
    • The TDHSAudioStretch class in src/audiostretchy/interface/tdhs.py uses ctypes to call functions from the pre-compiled audio-stretch shared library (e.g., _stretch.so, _stretch.dylib, _stretch.dll).
    • The C library implements Time-Domain Harmonic Scaling (TDHS). This algorithm works by:
      • Analyzing the input audio signal in the time domain.
      • Identifying periodic segments (related to pitch) within the audio.
      • To slow down audio (stretch ratio > 1.0), it intelligently repeats these small segments.
      • To speed up audio (stretch ratio < 1.0), it removes some of these segments.
      • This is done in a way that aims to preserve the harmonic structure and formants, thus maintaining the original pitch and timbre.
    • Parameters like upper_freq and lower_freq guide the C library in its period detection, defining the expected range of fundamental frequencies in the audio.
    • Flags like STRETCH_FAST_FLAG (for --fast_detection) and STRETCH_DUAL_FLAG (for --double_range) modify the C library's behavior.
    • After the C library processes the audio, the resulting int16 samples are converted back to float32 for pedalboard to handle.
  3. Python Orchestration (AudioStretch class):

    • The AudioStretch class in src/audiostretchy/stretch.py manages the overall process:
      • Initializes and uses PedalboardAudioFile for I/O.
      • Prepares data for the TDHSAudioStretch wrapper (data type conversion, interleaving).
      • Calls the stretch method of TDHSAudioStretch.
      • Handles data conversion back from the wrapper.
      • Coordinates resampling if requested.
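
The data preparation described in stage 2 (float to 16-bit integer conversion and stereo interleaving) can be illustrated with a short standard-library sketch. This is not the wrapper's actual code: the function names are hypothetical, the scale factor 32767 and the clipping behaviour are assumptions, and a real implementation would most likely use vectorized NumPy operations rather than Python lists.

```python
def float_to_int16(samples):
    """Map floats in [-1.0, 1.0] to 16-bit integers, clipping out-of-range values."""
    out = []
    for x in samples:
        clipped = max(-1.0, min(1.0, x))
        out.append(int(round(clipped * 32767)))  # assumed scale factor
    return out


def int16_to_float(samples):
    """Map 16-bit integers back to floats in roughly [-1.0, 1.0]."""
    return [s / 32767 for s in samples]


def interleave(left, right):
    """Merge two channels into one L, R, L, R, ... sequence."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    out = []
    for l_sample, r_sample in zip(left, right):
        out.extend((l_sample, r_sample))
    return out


def deinterleave(frames):
    """Split an L, R, L, R, ... sequence back into two channels."""
    return frames[0::2], frames[1::2]


if __name__ == "__main__":
    left = float_to_int16([0.0, 1.0])
    right = float_to_int16([-1.0, 2.0])  # 2.0 is clipped to 1.0
    print(interleave(left, right))
```

Because the 16-bit step quantizes the signal, a float-to-int16-to-float round trip is only approximately lossless.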

The gap_ratio parameter is intended for applying a different stretch ratio to silent portions of the audio. While the Python wrapper passes this to the C library initialization, the current Python implementation of AudioStretch.stretch() processes the entire audio with the primary ratio. Effective utilization of gap_ratio would typically require the Python code to segment the audio into speech/silence parts first (e.g., based on RMS levels and threshold_gap_db), and then apply different ratios to these segments, or for the C library to have internal advanced segmentation logic that uses this parameter. The original C CLI application (main.c in dbry/audio-stretch) contains such segmentation logic, which is not fully replicated in the current Python bindings.
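
As an illustration of what Python-side pre-segmentation could look like, the sketch below labels fixed-size windows as gap or non-gap by comparing their RMS level in dB against a threshold, then merges adjacent windows with the same label. It is a generic approach and not the logic from main.c. The function names are hypothetical, the parameter names buffer_ms and threshold_gap_db are reused from the CLI flags only for readability, and samples are assumed to be mono floats in [-1.0, 1.0].

```python
import math


def rms_db(window):
    """Return the RMS level of a window in dB relative to full scale."""
    if not window:
        return float("-inf")
    mean_square = sum(x * x for x in window) / len(window)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)  # same as 20 * log10(rms)


def label_windows(samples, sample_rate, buffer_ms=25.0, threshold_gap_db=-40.0):
    """Split samples into windows and mark each as gap (True) or not (False)."""
    window_len = max(1, int(sample_rate * buffer_ms / 1000))
    labels = []
    for start in range(0, len(samples), window_len):
        window = samples[start:start + window_len]
        is_gap = rms_db(window) < threshold_gap_db
        labels.append((start, start + len(window), is_gap))
    return labels


def merge_segments(labels):
    """Merge consecutive windows that share the same gap label."""
    merged = []
    for start, end, is_gap in labels:
        if merged and merged[-1][2] == is_gap:
            merged[-1] = (merged[-1][0], end, is_gap)
        else:
            merged.append((start, end, is_gap))
    return merged


if __name__ == "__main__":
    signal = [0.5] * 100 + [0.0] * 100 + [0.5] * 100
    for segment in merge_segments(label_windows(signal, sample_rate=1000)):
        print(segment)
```

Each resulting segment could then be stretched separately, with the main ratio for non-gap segments and gap_ratio for gap segments, before concatenating the results.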

Core Modules

  • src/audiostretchy/__main__.py:
    • Provides the command-line interface using the fire library.
    • It calls the stretch_audio function from stretch.py.
  • src/audiostretchy/stretch.py:
    • Contains the main AudioStretch class that orchestrates the audio processing.
    • Implements methods for opening, stretching, resampling, and saving audio.
    • Includes the stretch_audio convenience function used by the CLI.
    • Relies on pedalboard for I/O and resampling, and TDHSAudioStretch for the core algorithm.
  • src/audiostretchy/interface/tdhs.py:
    • Defines the TDHSAudioStretch class, which is a Python ctypes wrapper around the pre-compiled audio-stretch C library.
    • Loads the shared library (.so, .dylib, .dll) based on the operating system.
    • Defines argument types and return types for the C functions (stretch_init, stretch_samples, stretch_flush, etc.).
  • src/audiostretchy/interface/{win,mac,linux}/:
    • These directories contain the pre-compiled shared C libraries (_stretch.dll, _stretch.dylib, _stretch.so) for different platforms and architectures.
  • vendors/stretch/:
    • Contains the source code of David Bryant's audio-stretch C library as a Git submodule. This is used for compiling the shared libraries.
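
The platform-dependent library lookup described for tdhs.py can be sketched as follows. This is an assumption-laden illustration, not the project's code: the function library_path is hypothetical, and the mapping assumes exactly one library file per platform directory, using the directory and file names listed above. Architecture-specific variants (for example separate macOS x86_64 and arm64 builds) are not handled.

```python
import platform
from pathlib import Path

# platform.system() value -> (subdirectory, shared library file name)
_LIBRARIES = {
    "Linux": ("linux", "_stretch.so"),
    "Darwin": ("mac", "_stretch.dylib"),
    "Windows": ("win", "_stretch.dll"),
}


def library_path(interface_dir, system=None):
    """Return the expected shared-library path for the given platform."""
    system = system or platform.system()
    try:
        subdir, filename = _LIBRARIES[system]
    except KeyError:
        raise OSError(f"unsupported platform: {system}") from None
    return Path(interface_dir) / subdir / filename


if __name__ == "__main__":
    path = library_path("src/audiostretchy/interface")
    print(path)
    # Loading would then be done with ctypes, for example:
    #   import ctypes
    #   lib = ctypes.CDLL(str(path))
```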

Coding Conventions

This project adheres to standard Python coding practices and uses tools to maintain code quality:

  • Formatting: Code is formatted using Black.
  • Import Sorting: Imports are sorted using isort. Configuration is in .isort.cfg.
  • Linting: Code is linted using Flake8. Configuration is in pyproject.toml ([tool.flake8]).
  • Pre-commit Hooks: The .pre-commit-config.yaml file defines hooks that run these tools automatically before commits, ensuring consistency. This includes checks for trailing whitespace, large files, valid syntax, etc.

Contributing

Contributions are welcome! If you'd like to contribute, please follow these general guidelines:

  1. Fork the Repository: Create your own fork of the audiostretchy repository on GitHub.
  2. Clone Your Fork:
    git clone https://github.com/YOUR_USERNAME/audiostretchy.git
    cd audiostretchy
    git submodule update --init --recursive
    
  3. Create a Branch: Create a new branch for your feature or bug fix:
    git checkout -b my-new-feature
    
  4. Set Up Development Environment:
    • It's recommended to use a virtual environment:
      python3 -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate
      
    • Install the project in editable mode with testing dependencies:
      python3 -m pip install -e ".[testing]"
      
    • Install pre-commit hooks:
      pre-commit install
      
  5. Make Your Changes: Implement your feature or bug fix.
    • Adhere to the coding conventions (Black, isort, Flake8). The pre-commit hooks will help with this.
    • Write tests for any new functionality in the tests/ directory.
  6. Run Tests: Ensure all tests pass:
    pytest
    
    Check test coverage:
    pytest --cov src/audiostretchy --cov-report term-missing
    
  7. Commit Your Changes:
    git add .
    git commit -m "feat: Add new feature X" # Or "fix: Resolve bug Y"
    
  8. Push to Your Fork:
    git push origin my-new-feature
    
  9. Submit a Pull Request: Open a pull request from your branch to the main branch of the original twardoch/audiostretchy repository. Provide a clear description of your changes.

If you plan to modify the C library code in vendors/stretch/, you will also need to recompile it for your platform and potentially update the CI workflows if changes are significant.

Development and Release Process

AudioStretchy uses git-tag-based semantic versioning with an automated CI/CD pipeline.

Quick Start for Development

# Set up development environment
make dev

# Run tests
make test

# Run linting and formatting
make lint
make format

# Build package
make build

Release Process

# Create a new release (requires main branch)
make release VERSION=1.2.3

This will:

  • Run all tests
  • Build the package
  • Create a git tag
  • Trigger automated CI/CD pipeline
  • Publish to PyPI automatically

CI/CD Pipeline

  • Multi-platform testing: Ubuntu, Windows, macOS
  • Python versions: 3.11, 3.12, 3.13
  • Temporary exclusion: Python 3.14 is currently excluded from release validation because importing pedalboard on GitHub-hosted Ubuntu runners crashes with Illegal instruction (SIGILL)
  • Automatic wheel building: Binary wheels for all platforms
  • Automated PyPI publishing: On git tag creation

See SEMVER_GUIDE.md for detailed release documentation.

License

  • The Python wrapper code for AudioStretchy (this project) is licensed under the BSD-3-Clause License. See LICENSE.txt. Copyright (c) 2023-2024 Adam Twardoch.
  • The vendored core C library (vendors/stretch/stretch.c, vendors/stretch/stretch.h) originates from ogra/audio-stretch, branch ogra/feat-implement-float32, and is distributed under its original BSD-style license. See vendors/stretch/license.txt.
  • Third-party attribution details are summarized in THIRD_PARTY_NOTICES.md.
  • Audio I/O and Resampling functionalities are provided by Spotify's Pedalboard library, which is licensed under the Apache License 2.0. Pedalboard itself may utilize other libraries with their own respective licenses (e.g., libsndfile, Rubber Band library).
  • Some Python code may have been written with assistance from AI language models.

Current Version: (to be updated by release process)
