FFT-based channel vocoder that applies spectral envelope transfer
FFT Channel Vocoder
Built by a blind programmer and musician, for musicians
A Python package that applies spectral envelope transfer using FFT-based processing. The vocoder takes a modulator (voice) signal and imposes its spectral envelope onto a carrier signal.
Designed with accessibility as a core principle: CLI first, no GUI, cross-platform, automated batch processing—built by a blind developer for everyone who deserves equal access to audio tools.
Now on PyPI: pip install fft_channel_vocoder
Demonstration
Hear the FFT Vocoder in action:
FFT Vocoder Demo Video — Original voice, vocoded track, and vocoded track with music + code displayed on screen
Documentation
- Documentation — Installation, tutorials, configuration, and troubleshooting
- Design Philosophy — Why we built it this way, design decisions, and algorithm choices
- Algorithm Deep Dive — Technical details for audio engineers and researchers
Features
- Spectral Envelope Transfer: Extracts formant information from voice and applies it to carrier signals
- Multiple Input Formats: Supports voice files, MIDI files (synthesized to carrier waves), pre-generated synth wave files, and scale files for pitch correction
- Pitch Correction: Optional automatic pitch detection and correction to user-defined musical scales with noise gate
- Batch Processing: Automatically processes multiple input files with consistent naming patterns
- Generator-based Design: Uses Python generators for efficient iteration through numbered file sequences
- Accessibility First: Command-line interface, fully accessible to screen readers, works across platforms
Installation
From Python Package Index (PyPI)
The easiest way to install is from PyPI:
pip install fft_channel_vocoder
Then run the vocoder with:
vocode
From Source
Clone the repo and install in development mode:
git clone https://github.com/lawrenceper/fft_channel_vocoder.git
cd fft_channel_vocoder
pip install -e .
Usage
Command Line
Run the vocoder with files in the input folder:
vocode
Show help and usage information:
vocode -h
vocode --help
Open the configuration menu to adjust settings:
vocode -c
vocode --config
Or run using Python module syntax:
python3 -m fft_channel_vocoder
Input Structure
The vocoder expects files organized in an input/ directory:
input/
voice1.wav # Modulator signals
voice2.wav
melody1.mid # MIDI files to synthesize as carrier
melody2.mid
synth1.wav # Pre-generated synth wave files as carrier
synth2.wav
synth3.wav
scale1.txt # Scale files for pitch correction (one note per line)
scale2.txt
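The numbered naming in the listing above (voice1.wav, voice2.wav, and so on) lends itself to a generator. The sketch below is illustrative only: `numbered_files` is a hypothetical helper, not a function from the package, and stopping at the first missing number is an assumption about how gaps are handled.

```python
from pathlib import Path
from typing import Iterator


def numbered_files(folder: Path, stem: str, suffix: str) -> Iterator[Path]:
    """Yield folder/stem1suffix, folder/stem2suffix, ... until one is missing."""
    index = 1
    while True:
        candidate = folder / f"{stem}{index}{suffix}"
        if not candidate.is_file():
            return  # assumed behavior: the sequence ends at the first gap
        yield candidate
        index += 1
```

A caller could then loop lazily over the voices with `for voice in numbered_files(Path("input"), "voice", ".wav"):` without listing the whole directory first.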
Scale File Format:
Each scale file contains one note class per line. Supported note classes are: c, c#, d, d#, e, f, f#, g, g#, a, a#, b. Comments (lines starting with #) and blank lines are ignored.
Example scale1.txt (C Major scale):
# Major scale
c
d
e
f
g
a
b
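As a rough illustration of the format, a scale file could be parsed as follows. `parse_scale` and `NOTE_CLASSES` are hypothetical names, not part of the package, and accepting upper-case notes is an assumption.

```python
NOTE_CLASSES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]


def parse_scale(text: str) -> list:
    """Return the pitch classes (0 = c ... 11 = b) listed in a scale file."""
    pitch_classes = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip().lower()  # assumption: case-insensitive matching
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line not in NOTE_CLASSES:
            raise ValueError(f"line {line_number}: unknown note class {raw!r}")
        pitch_classes.append(NOTE_CLASSES.index(line))
    return pitch_classes
```

Note that a sharp such as `c#` is not mistaken for a comment, because only lines that start with `#` are skipped.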
Processing Flow
For each voice file, the vocoder:
- MIDI Processing: Synthesizes each MIDI file into a carrier wave and vocodes with the voice
- Synth Wave Processing: Loads each pre-generated synth wave file and vocodes with the voice
- Pitch Correction: Detects the pitch of the voice, snaps it to a user-defined scale, synthesizes a carrier wave from the result, and vocodes it with the voice
- Whisper Generation: Creates a stereo whisper track by vocoding the voice with white noise
Pitch Correction Details:
- Analyzes the voice for dominant frequencies in the range 50-2000 Hz
- Snaps detected pitches to a defined musical scale (octave-independent)
- Uses a noise gate (-40 dB by default) to prevent unwanted tuning during silence or low-amplitude content
- Maintains the last detected note when below the noise gate threshold
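The snapping and gating behavior described above can be sketched in a few lines. This is not the package's implementation: `snap_to_scale` and `track_notes` are hypothetical names, the A4 = 440 Hz reference is an assumption, and the gate is assumed to compare an RMS level in dB relative to full scale (amplitude 1.0).

```python
import math

A4_HZ = 440.0  # assumed tuning reference; not stated in this README


def snap_to_scale(freq_hz: float, pitch_classes: set) -> float:
    """Return the frequency of the nearest note whose pitch class is allowed."""
    midi = 69 + 12 * math.log2(freq_hz / A4_HZ)  # fractional MIDI note number
    nearest = round(midi)
    # Any non-empty scale has a member within 6 semitones of the nearest note.
    candidates = [n for n in range(nearest - 6, nearest + 7) if n % 12 in pitch_classes]
    best = min(candidates, key=lambda n: abs(n - midi))
    return A4_HZ * 2 ** ((best - 69) / 12)


def track_notes(frames, pitch_classes, gate_db=-40.0):
    """frames: iterable of (rms_amplitude, detected_hz). Yields one note per frame."""
    last = None
    for rms, hz in frames:
        level_db = 20 * math.log10(rms) if rms > 0 else -math.inf
        if level_db >= gate_db and 50.0 <= hz <= 2000.0:
            last = snap_to_scale(hz, pitch_classes)
        yield last  # below the gate: hold the previously detected note
```

Working with `n % 12` is what makes the scale octave-independent: MIDI note 60 and note 72 are both pitch class 0 (c).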
Output Structure
Processed files are saved to output/:
output/
voice1_melody1.wav # Voice + MIDI synthesis
voice1_melody2.wav
voice1_synth1.wav # Voice + Synth wave 1
voice1_synth2.wav
voice1_synth3.wav
voice1_scale1.wav # Voice + Pitch-corrected carrier (scale 1)
voice1_scale2.wav # Voice + Pitch-corrected carrier (scale 2)
voice1_whisper.wav # Stereo whisper track
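The naming pattern in this listing is the voice file's stem, an underscore, then the carrier's stem (or `whisper`). A minimal sketch of that rule, using a hypothetical `output_path` helper that is not taken from the package:

```python
from pathlib import Path
from typing import Optional


def output_path(voice: Path, carrier: Optional[Path], out_dir: Path = Path("output")) -> Path:
    """voice1.wav + melody1.mid -> output/voice1_melody1.wav; no carrier -> whisper."""
    tag = carrier.stem if carrier is not None else "whisper"
    return out_dir / f"{voice.stem}_{tag}.wav"
```

For example, `output_path(Path("input/voice1.wav"), Path("input/scale2.txt"))` gives `output/voice1_scale2.wav`.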
Configuration
Configuration is stored in fft_channel_vocoder/config.json. You can edit it directly or use the configuration menu:
vocode -c
Configuration Options
- sample_rate: Audio sample rate in Hz (default: 96,000)
- vocoder_fft_size: FFT window size as a power of 2 (default: 12, which equals 2^12 = 4096 samples)
- vocoder_hop: Hop size divisor for vocoder FFTs (default: 4, calculates as fft_size / 4)
- pitch_correct_fft_size: FFT size for pitch correction as a power of 2 (default: 11, which equals 2^11 = 2048 samples)
- pitch_correcter_hop: Hop size divisor for pitch correction FFTs (default: 4)
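To make the exponent and divisor settings concrete, the sketch below derives the frame size, hop, frame duration, and bin spacing from the options above. `frame_params` is a hypothetical helper written for this illustration; only the option meanings and default values come from this README.

```python
def frame_params(sample_rate: int, fft_size_exp: int, hop_divisor: int) -> dict:
    """Turn the exponent/divisor style settings into concrete frame parameters."""
    fft_size = 2 ** fft_size_exp      # e.g. 12 -> 4096 samples
    hop = fft_size // hop_divisor     # e.g. 4096 / 4 -> 1024 samples
    return {
        "fft_size": fft_size,
        "hop": hop,
        "frame_ms": 1000 * fft_size / sample_rate,  # window length in milliseconds
        "bin_hz": sample_rate / fft_size,           # spacing between FFT bins
    }


vocoder = frame_params(96_000, 12, 4)
pitch = frame_params(96_000, 11, 4)
```

With the defaults, the vocoder uses 4096-sample frames with a 1024-sample hop (about 42.7 ms per frame, 23.4375 Hz per bin), and pitch correction uses 2048-sample frames with a 512-sample hop (46.875 Hz per bin).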
Algorithm
The vocoder works in 4 steps:
- STFT Analysis: Compute Short-Time Fourier Transform for both voice and carrier
- Spectral Smoothing: Apply Gaussian blur to extract formant envelopes
- Envelope Transfer: Apply spectral whitening to the carrier, then scale by voice envelope
- Reconstruction: Inverse STFT with original carrier phase to recover time-domain signal
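Steps 2 and 3 can be sketched for a single STFT frame using only the standard library. This is an illustration under stated assumptions, not the package's code: the function names, the Gaussian width `sigma_bins`, the edge handling, and the `eps` guard are all hypothetical, and the STFT and inverse STFT around it are omitted.

```python
import math


def gaussian_smooth(values, sigma_bins=2.0):
    """Blur a magnitude spectrum along frequency with a normalized Gaussian kernel."""
    radius = int(3 * sigma_bins)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in offsets]
    total = sum(kernel)
    last = len(values) - 1
    smoothed = []
    for i in range(len(values)):
        acc = 0.0
        for offset, weight in zip(offsets, kernel):
            j = min(max(i + offset, 0), last)  # clamp at the spectrum edges
            acc += weight * values[j]
        smoothed.append(acc / total)
    return smoothed


def transfer_envelope(voice_bins, carrier_bins, sigma_bins=2.0, eps=1e-9):
    """Whiten one carrier frame, then scale it by the voice envelope.

    Both inputs are equal-length lists of complex STFT bins for the same frame.
    """
    voice_env = gaussian_smooth([abs(b) for b in voice_bins], sigma_bins)
    carrier_env = gaussian_smooth([abs(b) for b in carrier_bins], sigma_bins)
    # The gain is real and non-negative, so each output bin keeps the carrier's phase.
    return [c * (v / (ce + eps)) for c, v, ce in zip(carrier_bins, voice_env, carrier_env)]
```

Because each carrier bin is only multiplied by a real gain, running an inverse STFT on the result reconstructs the signal with the original carrier phase, which matches step 4.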
Module Reference
- main.py: Core pipeline and file iteration
- fft.py: FFT vocoding algorithm
- clean_io.py: Audio file I/O with resampling
- clean_audio.py: Audio preprocessing and validation
- config.py: Global configuration parameters
- midi_synth.py: MIDI to audio synthesis
- pitch_corrector.py: Pitch detection and scale-based note snapping
- scale_synth.py: Pitch-corrected carrier synthesis
- noise_generators.py: Noise generation utilities
- buffers.py: Buffer management utilities
Disclaimer
This project was developed with AI assistance. While some parts of the code were written with AI help, the program ideas, architecture, and design philosophy are original.