Normalize WAV voice recordings to a consistent target dB level using AGC, VAD, and limiting

voxlevel

Normalize WAV voice recordings so all speakers sound at the same volume level (-6 dB by default), regardless of their distance from the microphone.

Handles real-world scenarios: background noise, wind, echo, multiple speakers, and speakers moving during recording.

Installation

pip install voxlevel

Usage

Python API

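import numpy as np
import voxlevel

# From a WAV file
voxlevel.normalize("input.wav", "output.wav")

# From a numpy array: audio_array is a placeholder for mono samples you
# have already loaded, and sample_rate must match that data
result = voxlevel.normalize(audio_array, sample_rate=16000)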

# With custom parameters
voxlevel.normalize(
    "input.wav",
    "output.wav",
    target_db=-6.0,
    max_gain_db=30.0,
    rms_window_ms=400.0,
    smooth_window_ms=200.0,
)
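
The target_db and max_gain_db values above are decibel figures. As a rough illustration of what they mean for sample amplitudes, the sketch below converts between dB and linear gain. It assumes levels are measured relative to a full-scale amplitude of 1.0 and that max_gain_db caps how much boost may be applied; neither detail is confirmed by the voxlevel documentation, and the helper names are hypothetical.

```python
import math


def db_to_amplitude(db: float) -> float:
    """Convert a level in dB to a linear amplitude (assumes full scale = 1.0)."""
    return 10.0 ** (db / 20.0)


def amplitude_to_db(amplitude: float) -> float:
    """Convert a positive linear amplitude back to dB."""
    return 20.0 * math.log10(amplitude)


def gain_db_needed(current_db: float, target_db: float = -6.0,
                   max_gain_db: float = 30.0) -> float:
    """Boost (in dB) needed to reach the target, capped at max_gain_db.

    Assumption: max_gain_db is an upper bound on boost, which stops very
    quiet passages (often just noise) from being amplified without limit.
    """
    return min(target_db - current_db, max_gain_db)
```

Under these assumptions, a passage measured at -30 dB would need 24 dB of boost to reach the default target, while a passage at -50 dB would be held to the 30 dB cap.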

CLI

# Single file
voxlevel input.wav -o output.wav

# Batch processing
voxlevel *.wav -o normalized/

# Custom target level
voxlevel input.wav -o output.wav --target-db -3.0
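
The batch call can be approximated from Python by looping over a directory and calling the two-argument form of voxlevel.normalize shown in the Python API section. This is a minimal sketch, not the CLI's actual implementation: batch_normalize is a hypothetical helper, and the normalizing function is passed in as a parameter so the loop can be tried without voxlevel installed.

```python
from pathlib import Path
from typing import Callable, List


def batch_normalize(src_dir, dst_dir,
                    normalize_fn: Callable[[str, str], object]) -> List[Path]:
    """Apply normalize_fn(input_path, output_path) to every .wav in src_dir.

    Output files keep their original names and are written to dst_dir,
    which is created if it does not exist yet.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    written = []
    for wav_path in sorted(src.glob("*.wav")):
        out_path = dst / wav_path.name
        normalize_fn(str(wav_path), str(out_path))
        written.append(out_path)
    return written
```

With voxlevel installed, the call would look like batch_normalize("recordings", "normalized", voxlevel.normalize), where both directory names are placeholders.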

How it works

voxlevel uses a two-pass offline approach (not real-time compression) made up of four stages:

  1. Preprocessing -- DC removal + 80 Hz high-pass filter to cut wind noise, handling noise, and plosives
  2. Voice Activity Detection -- Silero-VAD (ONNX) identifies speech vs. silence segments
  3. Automatic Gain Control -- Sliding RMS envelope computes the gain needed at each sample to reach the target level, with interpolation across silence gaps and bidirectional smoothing
  4. Lookahead limiter -- 5 ms lookahead prevents peaks from exceeding the target, reducing transient distortion compared to brick-wall clipping
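
Steps 3 and 4 can be sketched in plain Python as follows. This is an illustrative approximation under stated assumptions, not voxlevel's actual code: the speech mask stands in for the Silero-VAD output, the target is treated as an RMS level relative to a full scale of 1.0, silence gaps are filled by linear interpolation, and the limiter ramps its gain with a simple moving average. All function names are hypothetical.

```python
import math


def sliding_rms(samples, window):
    """Centered sliding RMS, computed from a prefix sum of squares."""
    prefix = [0.0]
    for s in samples:
        prefix.append(prefix[-1] + s * s)
    half = window // 2
    rms = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        rms.append(math.sqrt(max(0.0, prefix[hi] - prefix[lo]) / (hi - lo)))
    return rms


def interpolate_gaps(gains):
    """Fill None entries linearly between known gains; hold the edge values."""
    known = [i for i, g in enumerate(gains) if g is not None]
    if not known:
        return [1.0] * len(gains)  # no speech at all: leave the audio as is
    out = list(gains)
    for i in range(known[0]):
        out[i] = gains[known[0]]
    for i in range(known[-1] + 1, len(gains)):
        out[i] = gains[known[-1]]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = gains[a] + t * (gains[b] - gains[a])
    return out


def gain_curve(samples, speech_mask, window, target_db=-6.0, max_gain_db=30.0):
    """Per-sample linear gain that brings speech RMS to the target level."""
    target = 10.0 ** (target_db / 20.0)
    max_gain = 10.0 ** (max_gain_db / 20.0)
    gains = [None] * len(samples)
    for i, level in enumerate(sliding_rms(samples, window)):
        if speech_mask[i] and level > 0.0:
            gains[i] = min(target / level, max_gain)
    return interpolate_gaps(gains)


def lookahead_limit(samples, ceiling, lookahead):
    """Keep |sample| <= ceiling, easing the gain down before each peak."""
    n = len(samples)
    # Gain each sample would need on its own to stay under the ceiling.
    need = [min(1.0, ceiling / abs(s)) if s != 0.0 else 1.0 for s in samples]
    # Lowest gain required by any sample within the lookahead window.
    ahead = [min(need[i:i + lookahead + 1]) for i in range(n)]
    out = []
    for i in range(n):
        lo = max(0, i - lookahead)
        # Averaging the last lookahead + 1 values ramps the gain gradually.
        gain = sum(ahead[lo:i + 1]) / (i + 1 - lo)
        out.append(samples[i] * gain)
    return out
```

At 16 kHz, the 5 ms lookahead mentioned above corresponds to 80 samples. Every value averaged into the limiter gain at a given sample already accounts for that sample's own peak, so the ceiling holds while the gain reduction starts ahead of the transient.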

The two-pass design means gain is correct from sample 0 -- none of the lag or adaptation artifacts that real-time compressors exhibit.
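
One way to smooth a gain curve without introducing lag is to run the same filter forward and then backward over it, which is only possible offline because the whole recording must be available. The sketch below assumes that this is what "bidirectional smoothing" refers to (the description above does not spell it out) and uses a hypothetical one-pole smoother for illustration.

```python
def one_pole(values, alpha):
    """Causal one-pole low-pass: each output moves a fraction alpha toward the input."""
    if not values:
        return []
    out = []
    state = values[0]
    for v in values:
        state += alpha * (v - state)
        out.append(state)
    return out


def smooth_bidirectional(values, alpha):
    """Forward pass followed by a backward pass; the two delays cancel out."""
    forward = one_pole(values, alpha)
    backward = one_pole(forward[::-1], alpha)
    return backward[::-1]


# A gain curve that steps from 0 to 1 halfway through.
step = [0.0] * 50 + [1.0] * 50
causal = one_pole(step, 0.2)
zero_lag = smooth_bidirectional(step, 0.2)
```

On this step, the forward-only smoother has only reached 0.2 at the first sample after the change, whereas the bidirectional result is already past the halfway point there and is symmetric around the step.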

Constraints

  • 16-bit mono WAV at 8 kHz or 16 kHz
  • Offline processing only (no streaming)
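
Before processing, a file can be checked against these constraints with Python's standard wave module. This is a small sketch with a hypothetical helper name (check_wav); it is not part of voxlevel and only inspects the WAV header.

```python
import wave

SUPPORTED_RATES = (8000, 16000)  # sample rates listed in the constraints, in Hz


def check_wav(path):
    """Return a list of reasons why the file falls outside the constraints.

    An empty list means the header describes 16-bit mono audio at a
    supported sample rate.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() not in SUPPORTED_RATES:
            problems.append(f"unsupported sample rate {wav.getframerate()} Hz")
    return problems
```

Files that fail the check would need to be resampled or downmixed with another tool first.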

License

MIT
