FFT-based channel vocoder that applies spectral envelope transfer

FFT Channel Vocoder

Built by a blind programmer and musician, for musicians

A Python package that applies spectral envelope transfer using FFT-based processing. The vocoder takes a modulator (voice) signal and imposes its spectral envelope onto a carrier signal.

Designed with accessibility as a core principle: CLI first, no GUI, cross-platform, with automated batch processing. Built by a blind developer for everyone who deserves equal access to audio tools.

Now on PyPI: pip install fft_channel_vocoder

Demonstration

Hear the FFT Vocoder in action:

FFT Vocoder Demo Video — Original voice, vocoded track, and vocoded track with music + code displayed on screen

Documentation

Documentation — Installation, tutorials, configuration, and troubleshooting
Design Philosophy — Why we built it this way, design decisions, and algorithm choices
Algorithm Deep Dive — Technical details for audio engineers and researchers

Features

Spectral Envelope Transfer: Extracts formant information from voice and applies it to carrier signals
Multiple Input Formats: Supports voice files, MIDI files (synthesized to carrier waves), pre-generated synth wave files, and scale files for pitch correction
Pitch Correction: Optional automatic pitch detection and correction to user-defined musical scales with noise gate
Batch Processing: Automatically processes multiple input files with consistent naming patterns
Generator-based Design: Uses Python generators for efficient iteration through numbered file sequences
Accessibility First: Command-line interface, fully accessible to screen readers, works across platforms

Installation

From Python Package Index (PyPI)

The easiest way to install is from PyPI:

pip install fft_channel_vocoder

Then run the vocoder with:

vocode

From Source

Clone the repo and install in development mode:

git clone https://github.com/lawrenceper/fft_channel_vocoder.git
cd fft_channel_vocoder
pip install -e .

Usage

Command Line

Run the vocoder with files in the input folder:

vocode

Show help and usage information:

vocode -h
vocode --help

Open the configuration menu to adjust settings:

vocode -c
vocode --config

Or run using Python module syntax:

python3 -m fft_channel_vocoder

Input Structure

The vocoder expects files organized in an input/ directory:

input/
    voice1.wav          # Modulator signals
    voice2.wav
    melody1.mid         # MIDI files to synthesize as carrier
    melody2.mid
    synth1.wav          # Pre-generated synth wave files as carrier
    synth2.wav
    synth3.wav
    scale1.txt          # Scale files for pitch correction (one note per line)
    scale2.txt
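
The numbered layout above lends itself to generator-based iteration. The following is a minimal sketch of that idea, not the package's own code: the function name `numbered_files`, the prefixes taken from the example listing, and the rule that a sequence ends at the first missing number are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterator
import tempfile


def numbered_files(directory: Path, prefix: str, suffix: str) -> Iterator[Path]:
    """Yield prefix1suffix, prefix2suffix, ... lazily, in numeric order."""
    index = 1
    while True:
        candidate = directory / f"{prefix}{index}{suffix}"
        if not candidate.is_file():
            return  # assumed stopping rule: the sequence ends at the first gap
        yield candidate
        index += 1


# Demo in a throwaway directory that mirrors part of the layout above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("voice1.wav", "voice2.wav", "voice4.wav", "melody1.mid"):
        (root / name).touch()
    found_voices = [p.name for p in numbered_files(root, "voice", ".wav")]
    found_midis = [p.name for p in numbered_files(root, "melody", ".mid")]

print(found_voices)  # ['voice1.wav', 'voice2.wav'] (voice4.wav sits after a gap)
print(found_midis)   # ['melody1.mid']
```

Because the generator checks one path at a time, it never has to list or sort the whole directory.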

Scale File Format: Each scale file contains one note class per line. Supported note classes are: c, c#, d, d#, e, f, f#, g, g#, a, a#, b. Comments (lines starting with #) and blank lines are ignored.

Example scale1.txt (C Major scale):

# Major scale
c
d
e
f
g
a
b
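
A scale file in this format can be read with a few lines of Python. The sketch below is an illustration under stated assumptions rather than the package's parser: the name `parse_scale`, the case-insensitive matching, and the choice to raise on an unknown note are all hypothetical. It maps each note class to a pitch class number from 0 (c) to 11 (b).

```python
NOTE_CLASSES = ("c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b")


def parse_scale(text: str) -> list[int]:
    """Return the sorted pitch classes (0 = c ... 11 = b) named in a scale file."""
    pitch_classes = set()
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip().lower()  # assumption: matching ignores case
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line not in NOTE_CLASSES:
            raise ValueError(f"line {line_number}: unknown note class {raw!r}")
        pitch_classes.add(NOTE_CLASSES.index(line))
    return sorted(pitch_classes)


c_major = parse_scale("# Major scale\nc\nd\ne\nf\ng\na\nb\n")
print(c_major)  # [0, 2, 4, 5, 7, 9, 11]
```

A note such as c# is not mistaken for a comment, because only lines that start with # are skipped.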

Processing Flow

For each voice file, the vocoder runs four stages:

MIDI Processing: Synthesizes each MIDI file into a carrier wave and vocodes with the voice
Synth Wave Processing: Loads each pre-generated synth wave file and vocodes with the voice
Pitch Correction: Detects pitch from the voice and snaps to a user-defined scale, synthesizes a carrier wave, and vocodes with the voice
Whisper Generation: Creates a stereo whisper track by vocoding the voice with white noise

Pitch Correction Details:

Analyzes the voice for dominant frequencies in the range 50-2000 Hz
Snaps detected pitches to a defined musical scale (octave-independent)
Uses a noise gate (-40 dB by default) to prevent unwanted tuning during silence or low-amplitude content
Maintains the last detected note when below the noise gate threshold
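
The snapping and gating rules above can be sketched as follows. This is an illustration, not the code in pitch_corrector.py: the function names, the A4 = 440 Hz equal-temperament reference, and amplitudes measured against a full scale of 1.0 are assumptions. Pitch detection itself is left out; each frame arrives as a (detected frequency, amplitude) pair.

```python
import math

A4_HZ = 440.0     # assumed tuning reference (equal temperament)
GATE_DB = -40.0   # default gate level quoted in the text
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]  # pitch classes, 0 = c


def db_to_amplitude(db: float) -> float:
    """Convert decibels relative to full scale into a linear amplitude."""
    return 10.0 ** (db / 20.0)


def snap_to_scale(freq_hz: float, pitch_classes: list[int]) -> float:
    """Return the frequency of the nearest scale note, in any octave."""
    if not pitch_classes:
        raise ValueError("scale is empty")
    midi = 69.0 + 12.0 * math.log2(freq_hz / A4_HZ)  # fractional MIDI note
    center = round(midi)
    # Any non-empty scale has a note within one octave of the detected pitch.
    candidates = [n for n in range(center - 12, center + 13)
                  if n % 12 in pitch_classes]
    nearest = min(candidates, key=lambda n: abs(n - midi))
    return A4_HZ * 2.0 ** ((nearest - 69) / 12.0)


def correct_track(frames, pitch_classes):
    """Map (detected_hz, amplitude) frames to one target frequency per frame."""
    threshold = db_to_amplitude(GATE_DB)  # -40 dB is an amplitude of 0.01
    last_note = None                      # nothing to hold before the first note
    targets = []
    for detected_hz, amplitude in frames:
        if amplitude >= threshold:
            last_note = snap_to_scale(detected_hz, pitch_classes)
        targets.append(last_note)         # below the gate: keep the last note
    return targets


track = correct_track([(450.0, 0.5), (470.0, 0.001), (470.0, 0.5)], C_MAJOR)
```

In this example the first frame snaps to A (440 Hz), the second frame is below the gate and holds that note, and the third snaps to B (about 493.88 Hz).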

Output Structure

Processed files are saved to output/:

output/
    voice1_melody1.wav       # Voice + MIDI synthesis
    voice1_melody2.wav
    voice1_synth1.wav        # Voice + Synth wave 1
    voice1_synth2.wav
    voice1_synth3.wav
    voice1_scale1.wav        # Voice + Pitch-corrected carrier (scale 1)
    voice1_scale2.wav        # Voice + Pitch-corrected carrier (scale 2)
    voice1_whisper.wav       # Stereo whisper track
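
The naming pattern in this listing can be expressed as a small generator. This is a sketch inferred from the example file names only; `output_names` is a hypothetical helper, not a function exported by the package.

```python
from pathlib import Path
from typing import Iterable, Iterator


def output_names(voice: str, carriers: Iterable[str]) -> Iterator[str]:
    """Yield one output name per carrier source, then the whisper track."""
    voice_stem = Path(voice).stem
    for carrier in carriers:
        # The carrier's extension (.mid, .wav, .txt) is dropped; output is .wav.
        yield f"{voice_stem}_{Path(carrier).stem}.wav"
    yield f"{voice_stem}_whisper.wav"


names = list(output_names("voice1.wav", ["melody1.mid", "synth1.wav", "scale1.txt"]))
print(names)
# ['voice1_melody1.wav', 'voice1_synth1.wav', 'voice1_scale1.wav', 'voice1_whisper.wav']
```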

Configuration

Configuration is stored in fft_channel_vocoder/config.json. You can edit it directly or use the configuration menu:

vocode -c

Configuration Options

sample_rate: Audio sample rate in Hz (default: 96,000)
vocoder_fft_size: Vocoder FFT window size, given as an exponent of 2 (default: 12, meaning 2^12 = 4096 samples)
vocoder_hop: Hop size divisor for vocoder FFTs (default: 4, so the hop is fft_size / 4, or 1024 samples at the default FFT size)
pitch_correct_fft_size: Pitch correction FFT size, given as an exponent of 2 (default: 11, meaning 2^11 = 2048 samples)
pitch_correcter_hop: Hop size divisor for pitch correction FFTs (default: 4, or 512 samples at the default FFT size)
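
To make the exponent and divisor settings concrete, the sketch below derives the actual frame sizes from the default values. The flat dictionary layout and the helper `frame_params` are assumptions for illustration; the real config.json may be organized differently.

```python
# Default values as listed above; the flat key layout is an assumption.
config = {
    "sample_rate": 96000,
    "vocoder_fft_size": 12,
    "vocoder_hop": 4,
    "pitch_correct_fft_size": 11,
    "pitch_correcter_hop": 4,
}


def frame_params(sample_rate: int, size_exponent: int, hop_divisor: int) -> dict:
    """Turn an FFT size exponent and hop divisor into concrete frame figures."""
    fft_size = 2 ** size_exponent
    hop = fft_size // hop_divisor
    return {
        "fft_size": fft_size,                      # samples per frame
        "hop": hop,                                # samples between frame starts
        "overlap": 1 - hop / fft_size,             # fraction shared by neighbours
        "bin_hz": sample_rate / fft_size,          # frequency spacing of FFT bins
        "frame_ms": 1000 * fft_size / sample_rate, # frame duration
    }


vocoder = frame_params(config["sample_rate"],
                       config["vocoder_fft_size"], config["vocoder_hop"])
pitch = frame_params(config["sample_rate"],
                     config["pitch_correct_fft_size"], config["pitch_correcter_hop"])
print(vocoder["fft_size"], vocoder["hop"], vocoder["bin_hz"])  # 4096 1024 23.4375
print(pitch["fft_size"], pitch["hop"], pitch["bin_hz"])        # 2048 512 46.875
```

With the defaults, both analyses overlap adjacent frames by 75 percent, and the vocoder frame lasts about 42.7 ms at 96 kHz.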

Algorithm

The vocoder works in five steps:

STFT Analysis: Compute Short-Time Fourier Transform for both voice and carrier
Spectral Smoothing: Apply frequency-dependent Gaussian blur to extract formant envelopes
Temporal Envelope: Apply asymmetric attack/release smoothing across time to preserve consonant transients and sustain vowels naturally
Envelope Transfer: Apply spectral whitening to the carrier, then scale by voice envelope
Reconstruction: Inverse STFT with original carrier phase to recover time-domain signal
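
The five steps can be sketched with NumPy as below. This is a simplified illustration, not the code in fft.py. The Hann window, the linear law that widens the blur with frequency, the attack and release coefficients, and the choice to smooth only the voice envelope over time are all assumptions. The demo uses a smaller FFT than the package defaults to keep the blur matrix small.

```python
import numpy as np


def stft(signal, n_fft, hop):
    """Step 1: Hann-windowed STFT, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    starts = range(0, len(signal) - n_fft + 1, hop)
    return np.array([np.fft.rfft(window * signal[s:s + n_fft]) for s in starts])


def istft(spec, n_fft, hop):
    """Step 5: weighted overlap-add inverse of stft()."""
    window = np.hanning(n_fft)
    length = n_fft + hop * (len(spec) - 1)
    out, norm = np.zeros(length), np.zeros(length)
    for i, frame in enumerate(spec):
        s = i * hop
        out[s:s + n_fft] += window * np.fft.irfft(frame, n_fft)
        norm[s:s + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)


def blur_matrix(n_bins, base_sigma=2.0, slope=0.02):
    """Step 2: one Gaussian row per bin, wider at higher bins (assumed law)."""
    k = np.arange(n_bins)
    sigma = base_sigma + slope * k
    kernel = np.exp(-0.5 * ((k[None, :] - k[:, None]) / sigma[:, None]) ** 2)
    return kernel / kernel.sum(axis=1, keepdims=True)


def attack_release(env, attack=0.6, release=0.1):
    """Step 3: rise quickly, fall slowly, frame by frame (values illustrative)."""
    out = np.empty_like(env)
    out[0] = env[0]
    for t in range(1, len(env)):
        coeff = np.where(env[t] > out[t - 1], attack, release)
        out[t] = out[t - 1] + coeff * (env[t] - out[t - 1])
    return out


def vocode(voice, carrier, n_fft=512, hop=128, eps=1e-6):
    n = min(len(voice), len(carrier))
    voice_spec = stft(voice[:n], n_fft, hop)
    carrier_spec = stft(carrier[:n], n_fft, hop)
    blur = blur_matrix(voice_spec.shape[1])
    voice_env = attack_release(np.abs(voice_spec) @ blur.T)
    carrier_env = np.abs(carrier_spec) @ blur.T
    # Step 4: whiten the carrier, then scale by the voice envelope.
    # Dividing and multiplying by real envelopes leaves the carrier phase intact.
    shaped = carrier_spec / (carrier_env + eps) * voice_env
    return istft(shaped, n_fft, hop)


rate = 16000
t = np.arange(rate) / rate
voice = np.sin(2 * np.pi * 220 * t) * np.linspace(0, 1, rate)  # stand-in "voice"
carrier = np.random.default_rng(0).standard_normal(rate)       # white-noise carrier
result = vocode(voice, carrier)
```

With a white-noise carrier, this corresponds to the whisper case from the processing flow: the output keeps the noise's texture while its spectrum follows the stand-in voice.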

Module Reference

main.py: Core pipeline and file iteration
fft.py: FFT vocoding algorithm
clean_io.py: Audio file I/O with resampling
clean_audio.py: Audio preprocessing and validation
config.py: Global configuration parameters
midi_synth.py: MIDI to audio synthesis
pitch_corrector.py: Pitch detection and scale-based note snapping
scale_synth.py: Pitch-corrected carrier synthesis
noise_generators.py: Noise generation utilities
buffers.py: Buffer management utilities

Disclaimer

This project was developed with AI assistance. While some parts of the code were built with AI assistance, the program ideas, architecture, and design philosophy are original.

Project details

Release history

1.1.0 (this version): May 10, 2026
1.0.1.post1: May 4, 2026
1.0.1: May 3, 2026
1.0.0: May 2, 2026

Download files

Source Distribution

fft_channel_vocoder-1.1.0.tar.gz (22.7 kB)

Uploaded May 10, 2026 Source

Built Distribution

fft_channel_vocoder-1.1.0-py3-none-any.whl (25.3 kB)

Uploaded May 10, 2026 Python 3

File details

Details for the file fft_channel_vocoder-1.1.0.tar.gz.

File metadata

Download URL: fft_channel_vocoder-1.1.0.tar.gz
Upload date: May 10, 2026
Size: 22.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fft_channel_vocoder-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`276cf641d60dbed35300d87cd0bcd1a756751e202e238986ed3ba8ffb39a70f2`
MD5	`dfda1c13fa9a0fb5164fac02e9bf0e02`
BLAKE2b-256	`244dbcdb78c393454f68beeaa76cc80a06c741b961c9ce44da4a2056b981305f`

Provenance

The following attestation bundles were made for fft_channel_vocoder-1.1.0.tar.gz:

Publisher: python-publish.yml on lawrenceper/fft_channel_vocoder

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fft_channel_vocoder-1.1.0.tar.gz
- Subject digest: 276cf641d60dbed35300d87cd0bcd1a756751e202e238986ed3ba8ffb39a70f2
- Sigstore transparency entry: 1494446180
- Sigstore integration time: May 10, 2026
Source repository:
- Permalink: lawrenceper/fft_channel_vocoder@633a5474a85554910fbbbc67ee47221bf7bab92a
- Branch / Tag: refs/tags/1.1.0
- Owner: https://github.com/lawrenceper
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@633a5474a85554910fbbbc67ee47221bf7bab92a
- Trigger Event: release

File details

Details for the file fft_channel_vocoder-1.1.0-py3-none-any.whl.

File metadata

Download URL: fft_channel_vocoder-1.1.0-py3-none-any.whl
Upload date: May 10, 2026
Size: 25.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fft_channel_vocoder-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`45a21216db384b79bd722e1ddd0b8d0cb68e0cc457b24bf8549e0dbd7f021379`
MD5	`04364d1202c918d1e8a1adfdbb06f97f`
BLAKE2b-256	`d40245aa26dd48e2ae268ca983dfad0038064dbd8b5492ce7570b215b8025cf2`

Provenance

The following attestation bundles were made for fft_channel_vocoder-1.1.0-py3-none-any.whl:

Publisher: python-publish.yml on lawrenceper/fft_channel_vocoder

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fft_channel_vocoder-1.1.0-py3-none-any.whl
- Subject digest: 45a21216db384b79bd722e1ddd0b8d0cb68e0cc457b24bf8549e0dbd7f021379
- Sigstore transparency entry: 1494446295
- Sigstore integration time: May 10, 2026
Source repository:
- Permalink: lawrenceper/fft_channel_vocoder@633a5474a85554910fbbbc67ee47221bf7bab92a
- Branch / Tag: refs/tags/1.1.0
- Owner: https://github.com/lawrenceper
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@633a5474a85554910fbbbc67ee47221bf7bab92a
- Trigger Event: release
