Normalize WAV voice recordings to a consistent target dB level using AGC, VAD, and limiting

voxlevel

Normalize WAV voice recordings so all speakers sound at the same volume level (-6 dB by default), regardless of their distance from the microphone.

Handles real-world scenarios: background noise, wind, echo, multiple speakers, and speakers moving during recording.
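To make the -6 dB default concrete, here is a minimal sketch of the decibel arithmetic. It assumes the level is measured relative to full scale (dBFS), which this description does not state explicitly. The helper names are illustrative and are not part of voxlevel.

```python
import math

def db_to_amplitude(db: float) -> float:
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def amplitude_to_db(ratio: float) -> float:
    """Convert a linear amplitude ratio (must be > 0) to dB."""
    return 20.0 * math.log10(ratio)

# -6 dB is roughly half of full-scale amplitude.
half_scale = db_to_amplitude(-6.0)
```

Under that assumption, a 16-bit sample at -6 dB peaks near half of the int16 range.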

Installation

pip install voxlevel

Usage

Python API

import voxlevel

# From a WAV file
voxlevel.normalize("input.wav", "output.wav")

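# From a numpy array (audio_array is a placeholder for your own mono samples)
import numpy as np
audio_array = np.zeros(16000, dtype=np.int16)  # placeholder: 1 s of silence; int16 dtype is an assumption
result = voxlevel.normalize(audio_array, sample_rate=16000)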

# With custom parameters
voxlevel.normalize(
    "input.wav",
    "output.wav",
    target_db=-6.0,
    max_gain_db=30.0,
    rms_window_ms=400.0,
    smooth_window_ms=200.0,
)

CLI

# Single file
voxlevel input.wav -o output.wav

# Batch processing
voxlevel *.wav -o normalized/

# Custom target level
voxlevel input.wav -o output.wav --target-db -3.0

How it works

voxlevel uses a two-pass offline approach rather than real-time compression. The pipeline has four stages:

  1. Preprocessing -- DC removal + 80 Hz high-pass filter to cut wind noise, handling noise, and plosives
  2. Voice Activity Detection -- Silero-VAD (ONNX) identifies speech vs. silence segments
  3. Automatic Gain Control -- Sliding RMS envelope computes the gain needed at each sample to reach the target level, with interpolation across silence gaps and bidirectional smoothing
  4. Lookahead limiter -- 5 ms lookahead prevents peaks from exceeding the target, reducing transient distortion compared to brick-wall clipping
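The lookahead idea in step 4 can be sketched in plain Python. This is an illustration of the general technique, not voxlevel's implementation: `lookahead_limit` and its `ceiling` parameter are hypothetical names, and a production limiter would use vectorized code and a tuned release curve.

```python
def lookahead_limit(samples, ceiling, lookahead):
    """Scale samples so that no peak exceeds `ceiling`.

    The gain at sample i depends on peaks up to `lookahead` samples ahead,
    so the gain starts dropping before a transient arrives.
    """
    n = len(samples)
    # Gain each sample would need, on its own, to stay under the ceiling.
    need = [min(1.0, ceiling / abs(s)) if s != 0 else 1.0 for s in samples]
    # Look ahead: take the most restrictive gain in the upcoming window.
    ahead = [min(need[i:i + lookahead + 1]) for i in range(n)]
    # Average over the trailing window so the gain ramps instead of stepping.
    # Every averaged value is <= need[i], so the ceiling still holds.
    gain = []
    for i in range(n):
        window = ahead[max(0, i - lookahead):i + 1]
        gain.append(sum(window) / len(window))
    return [s * g for s, g in zip(samples, gain)]

sample_rate = 16000
lookahead = sample_rate * 5 // 1000  # 5 ms of lookahead in samples
signal = [0.1] * 200 + [0.9] + [0.1] * 200  # quiet signal with one loud peak
limited = lookahead_limit(signal, ceiling=0.5, lookahead=lookahead)
```

In this toy signal the gain begins to fall within the 5 ms before the peak, which is what distinguishes a lookahead limiter from hard clipping at the ceiling.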

Because the design is two-pass, the gain is already correct at sample 0, with none of the lag or adaptation artifacts that real-time compressors exhibit.
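One way to see why offline processing helps: with the whole file available, the RMS window around each sample can include audio that comes after it. The sketch below compares a centered (offline) window with a causal (past-only) window on a steady tone. It is a simplified illustration, not voxlevel's code: `rms_gain` is a hypothetical helper, the RMS target of 0.25 is arbitrary, and VAD, gap interpolation, and smoothing are left out. The 400 ms window and 30 dB cap mirror the values shown in the usage example.

```python
import math

def rms_gain(samples, target_rms, window, max_gain, centered):
    """Per-sample linear gain that would bring the local RMS to target_rms."""
    # Prefix sums of squared samples make every window energy an O(1) lookup.
    prefix = [0.0]
    for s in samples:
        prefix.append(prefix[-1] + s * s)

    n = len(samples)
    half = window // 2
    gains = []
    for i in range(n):
        if centered:
            lo, hi = max(0, i - half), min(n, i + half + 1)  # offline: sees audio ahead
        else:
            lo, hi = max(0, i - window + 1), i + 1           # causal: past samples only
        rms = math.sqrt((prefix[hi] - prefix[lo]) / (hi - lo))
        # Silence simply hits the gain cap in this sketch.
        gains.append(min(max_gain, target_rms / rms) if rms > 0 else max_gain)
    return gains

sample_rate = 8000
window = sample_rate * 400 // 1000   # 400 ms window in samples
max_gain = 10.0 ** (30.0 / 20.0)     # 30 dB cap as a linear ratio
tone = [0.05 * math.sin(2 * math.pi * 200 * t / sample_rate)
        for t in range(sample_rate)]  # 1 s, 200 Hz, amplitude 0.05

ideal = 0.25 / (0.05 / math.sqrt(2))  # gain a steady tone of this amplitude needs
offline = rms_gain(tone, 0.25, window, max_gain, centered=True)
causal = rms_gain(tone, 0.25, window, max_gain, centered=False)
```

Here `offline[0]` is already close to `ideal`, while `causal[0]` has seen only a single sample and is far off; the causal estimate converges only after its window has filled.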

Constraints

  • 16-bit mono WAV at 8 kHz or 16 kHz
  • Offline processing only (no streaming)
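Input files can be checked against the format constraint above before processing. The sketch below uses only Python's standard `wave` module; `check_wav` is a hypothetical helper, not part of voxlevel, and how voxlevel itself reports unsupported files is not described here.

```python
import wave

def check_wav(path):
    """Return a list of problems; the list is empty if the file is
    16-bit mono PCM at 8 kHz or 16 kHz."""
    problems = []
    with wave.open(path, "rb") as wav:
        bits = 8 * wav.getsampwidth()
        if bits != 16:
            problems.append(f"expected 16-bit samples, got {bits}-bit")
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getframerate() not in (8000, 16000):
            problems.append(f"expected 8000 or 16000 Hz, got {wav.getframerate()} Hz")
    return problems
```

A file that fails the check would need to be resampled or downmixed with a separate tool first.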

License

MIT
