Vocos - MLX
An implementation of the Vocos neural vocoder with the MLX framework.
Paper: Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Installation
To use Vocos in inference mode, install it using:
pip install vocos-mlx
Usage
Mel Spectrogram
from vocos_mlx import Vocos, load_audio, log_mel_spectrogram
vocos = Vocos.from_pretrained("lucasnewman/vocos-mel-24khz")
# Reconstruct: load a waveform at 24 kHz and pass it through the full model
audio = load_audio("audio.wav", 24_000)
reconstructed_audio = vocos(audio)
# Decode from a precomputed 100-band log-mel spectrogram
mel_spec = log_mel_spectrogram(audio, n_mels=100)
decoded_audio = vocos.decode(mel_spec)
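The snippet above stops at an in-memory array, so here is a minimal sketch of one way to save decoded audio to disk using only the Python standard library. The helper `write_wav` and the file name `decoded.wav` are hypothetical and not part of `vocos_mlx`. The sketch assumes the audio is a flat sequence of mono float samples in [-1.0, 1.0] at 24 kHz, and it uses a synthetic tone as a stand-in for the model output.

```python
import math
import struct
import wave


def write_wav(path, samples, sample_rate=24_000):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file.

    Hypothetical helper for illustration; not part of vocos_mlx.
    """
    # Clip each sample to the valid range, then scale to signed 16-bit.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


# Stand-in for decoded audio: one second of a 440 Hz tone at 24 kHz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 24_000) for n in range(24_000)]
write_wav("decoded.wav", tone)
```

To use this with real model output, the array would first need to be converted to a flat list of Python floats. The exact shape of `decoded_audio` is not stated here, so check it and drop any batch dimension before writing.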
Encodec
from vocos_mlx import Vocos, load_audio
vocos = Vocos.from_pretrained("lucasnewman/vocos-encodec-24khz")
# Reconstruct: load a waveform at 24 kHz and pass it through the full model
audio = load_audio("audio.wav", 24_000)
reconstructed_audio = vocos(audio, bandwidth_id=3)
# Extract EnCodec codes, then decode them with the same bandwidth_id
codes = vocos.feature_extractor.get_encodec_codes(audio, bandwidth_id=3)
decoded_audio = vocos.decode_from_codes(codes, bandwidth_id=3)
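The meaning of `bandwidth_id` is not spelled out above. The sketch below shows how the index would map to a bitrate, assuming this port follows the original Vocos EnCodec model, where `bandwidth_id` indexes the list `[1.5, 3.0, 6.0, 12.0]` kbps. That mapping is an assumption about `vocos_mlx`, not something this page confirms. The codebook count follows from the 24 kHz EnCodec design: 75 frames per second and 10 bits per codebook, so 750 bits per second per codebook. The names `ENCODEC_BANDWIDTHS_KBPS`, `bandwidth_id_for`, and `describe_bandwidth` are hypothetical.

```python
# Assumed table, taken from the original Vocos EnCodec model.
# Not confirmed for vocos_mlx; verify against the package source.
ENCODEC_BANDWIDTHS_KBPS = [1.5, 3.0, 6.0, 12.0]

# 24 kHz EnCodec: 75 frames/s * 10 bits per codebook = 750 bits/s each.
BITS_PER_SECOND_PER_CODEBOOK = 75 * 10


def bandwidth_id_for(kbps):
    """Return the index to pass as bandwidth_id for a target bitrate."""
    return ENCODEC_BANDWIDTHS_KBPS.index(kbps)


def describe_bandwidth(bandwidth_id):
    """Return (bitrate in kbps, number of codebooks) for a bandwidth_id."""
    kbps = ENCODEC_BANDWIDTHS_KBPS[bandwidth_id]
    codebooks = round(kbps * 1000 / BITS_PER_SECOND_PER_CODEBOOK)
    return kbps, codebooks


for idx in range(len(ENCODEC_BANDWIDTHS_KBPS)):
    kbps, codebooks = describe_bandwidth(idx)
    print(f"bandwidth_id={idx}: {kbps} kbps, {codebooks} codebooks")
```

Under this assumption, `bandwidth_id = 3` in the example above would select the highest listed bitrate. Whichever value is used, the same `bandwidth_id` should be passed when extracting codes and when decoding them.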
Citations
@article{siuzdak2023vocos,
title={Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis},
author={Siuzdak, Hubert},
journal={arXiv preprint arXiv:2306.00814},
year={2023}
}
License
The code in this repository is released under the MIT license as found in the LICENSE file.