MLX implementation of BigVGAN

MLX BigVGAN

An MLX-adapted implementation of BigVGAN.

Features

  • BigVGAN Integration: Ports the original BigVGAN model to MLX so it runs natively on Apple silicon.
  • Flexible Conversion: Includes tools to convert the original BigVGAN PyTorch weights to MLX format.
  • Customizable Configurations: Supports various configurations for kernel sizes, dilation rates, and activation functions (e.g., snake, snakebeta).
  • Pretrained Models: Easily load pretrained BigVGAN models from the Hugging Face Hub.
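
For readers unfamiliar with the two activation names mentioned above, here is a minimal scalar sketch of the formulas as they are usually written: Snake is x + (1/alpha) * sin^2(alpha * x), and SnakeBeta uses a separate beta for the magnitude term. The function names and the scalar, pure-Python form are assumptions made for illustration; this package's own modules presumably operate on MLX arrays with learnable per-channel parameters, and may differ in details.

```python
import math


def snake(x, alpha=1.0):
    """Snake activation: x + (1/alpha) * sin^2(alpha * x)."""
    return x + (math.sin(alpha * x) ** 2) / alpha


def snakebeta(x, alpha=1.0, beta=1.0):
    """SnakeBeta activation: alpha sets the frequency, beta the magnitude."""
    return x + (math.sin(alpha * x) ** 2) / beta


if __name__ == "__main__":
    # With beta == alpha, SnakeBeta reduces to Snake.
    for x in (-1.0, 0.0, 0.5, 2.0):
        print(x, snake(x, alpha=2.0), snakebeta(x, alpha=2.0, beta=2.0))
```

Both functions keep the identity term x, so they stay monotonic on average while adding a periodic component.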

Installation

  1. Clone the repository:

    git clone https://github.com/your-username/mlx-BigVGAN.git
    cd mlx-BigVGAN
    
  2. Install dependencies:

    uv sync --no-dev
    

Usage

1. Load Pretrained Model

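import mlx.core as mx
from mlx_bigvgan import BigVGAN

# Download the converted weights from the Hugging Face Hub
model = BigVGAN.from_pretrained("wyrom/mlx-bigvgan_v2_24khz_100band_256x")
model.eval()
# MLX is lazy: force the parameters to be materialized
mx.eval(model.parameters())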

2. Generate Audio

import mlx.core as mx
import numpy as np
import soundfile as sf
from mlx_bigvgan import log_mel_spectrogram, load_audio

# `model` is the BigVGAN instance loaded in step 1
# Load audio file
audio = load_audio("path/to/audio.wav")
h = model.config

# Compute log-mel spectrogram
mel_spec = log_mel_spectrogram(
    audio,
    n_fft=h.n_fft,
    n_mels=h.num_mels,
    sample_rate=h.sampling_rate,
    hop_length=h.hop_size,
    fmin=h.fmin,
    fmax=h.fmax,
    padding=(h.n_fft - h.hop_size) // 2,
    mel_norm="slaney",
    mel_scale="slaney",
    power=1.0,
)

# Add a batch dimension: [B(1), T, C_mels]
mel_spec = mx.expand_dims(mel_spec, 0)

# Generate waveform
waveform = model(mel_spec)  # [B(1), T, 1]

# Drop the batch dimension: [T, 1]
waveform_float = waveform.squeeze(0)

# Scale to the int16 range, clip, and convert
waveform_int16 = mx.clip(waveform_float * 32767, -32768, 32767).astype(mx.int16)

# Save to WAV (converted to a NumPy array first, which soundfile expects)
sf.write("output.wav", np.array(waveform_int16), h.sampling_rate, "PCM_16")
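
The padding value `(n_fft - hop_size) // 2` used above is worth a quick check. Assuming the spectrogram uses plain, non-centered STFT framing on the padded signal (an assumption about `log_mel_spectrogram`, not something stated here), this padding yields one frame per `hop_size` input samples, so a vocoder that upsamples each frame by `hop_size` returns roughly the original length. The values `n_fft=1024` and `hop_size=256` below are illustrative assumptions, and `num_frames` is a hypothetical helper.

```python
def num_frames(num_samples, n_fft, hop_size):
    """Frame count for a non-centered STFT after symmetric padding."""
    padding = (n_fft - hop_size) // 2
    padded = num_samples + 2 * padding
    return 1 + (padded - n_fft) // hop_size


if __name__ == "__main__":
    n_fft, hop_size = 1024, 256  # assumed values, for illustration only
    samples = 24000              # one second at 24 kHz
    frames = num_frames(samples, n_fft, hop_size)
    print(frames)                # mel frames fed to the vocoder
    print(frames * hop_size)     # samples produced if each frame is upsampled by hop_size
```

Under these assumptions the frame count equals `num_samples // hop_size`, so the generated waveform is at most `hop_size - 1` samples shorter than the input.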

3. Convert Original BigVGAN Weights to MLX Format

You can convert the original BigVGAN weights to MLX format using the provided script.

repo_id is the Hugging Face model ID of the original BigVGAN model you want to convert.

See nvidia/BigVGAN for more pretrained models.

python -m mlx_bigvgan.convert --repo_id nvidia/bigvgan_v2_xxx --output_dir mlx_models
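
To convert several checkpoints in one go, the command can be assembled from Python. The sketch below only builds and prints the command lines; it uses just the two flags shown above. The helper name `build_convert_command` is hypothetical, and the repo ID in the list is an example inferred from the model name used in step 1, so substitute the IDs you actually need.

```python
import shlex


def build_convert_command(repo_id, output_dir="mlx_models"):
    """Return the argument list for the conversion command."""
    return [
        "python", "-m", "mlx_bigvgan.convert",
        "--repo_id", repo_id,
        "--output_dir", output_dir,
    ]


if __name__ == "__main__":
    # Example ID (assumed); replace with the models you want to convert.
    repo_ids = ["nvidia/bigvgan_v2_24khz_100band_256x"]
    for repo_id in repo_ids:
        cmd = build_convert_command(repo_id)
        print(shlex.join(cmd))
        # To run it for real: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if an output path contains spaces.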

License

This project is licensed under the MIT License. See the LICENSE file for details.
