Vocos - MLX

An implementation of Vocos in the MLX framework. Vocos reconstructs high-quality audio from mel spectrograms or EnCodec tokens.

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Paper [abs] [pdf]

Installation

To use Vocos in inference mode, install it using:

pip install vocos-mlx
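
After installing, it can help to confirm that the package and MLX itself are importable in the current environment before running the examples. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and is not part of vocos-mlx.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "mlx" and "vocos_mlx" are the import names used by the examples in this README.
    for name in ("mlx", "vocos_mlx"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If either module is reported as missing, re-run the install command in the same interpreter environment that runs your script.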

Usage

Mel Spectrogram

from vocos_mlx import Vocos, load_audio, log_mel_spectrogram

vocos = Vocos.from_pretrained("lucasnewman/vocos-mel-24khz")

# Reconstruct: pass the waveform through the full model
audio = load_audio("audio.wav", 24_000)
reconstructed_audio = vocos(audio)

# Decode from a precomputed log-mel spectrogram (100 mel bands)
mel_spec = log_mel_spectrogram(audio, n_mels=100)
decoded_audio = vocos.decode(mel_spec)
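
The example above expects a file named `audio.wav` sampled at 24 kHz. If you have no such file at hand, the sketch below writes a short test tone using only the Python standard library. The function name `write_sine_wav` is hypothetical, and the sketch assumes `load_audio` accepts an ordinary mono 16-bit PCM WAV file, which this README does not state explicitly.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # matches the rate passed to load_audio in the example above


def write_sine_wav(path: str, seconds: float = 1.0, freq_hz: float = 440.0) -> int:
    """Write a mono 16-bit PCM sine tone and return the number of samples written."""
    n_samples = int(SAMPLE_RATE * seconds)
    frames = bytearray()
    for n in range(n_samples):
        # Half-scale amplitude leaves headroom and avoids clipping.
        sample = 0.5 * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE)
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return n_samples


if __name__ == "__main__":
    write_sine_wav("audio.wav")
```

A pure tone is only a smoke test; use recorded speech or music to judge reconstruction quality.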

EnCodec

from vocos_mlx import Vocos, load_audio

vocos = Vocos.from_pretrained("lucasnewman/vocos-encodec-24khz")

# Reconstruct: encode the waveform to EnCodec features, then resynthesize
audio = load_audio("audio.wav", 24_000)
reconstructed_audio = vocos(audio, bandwidth_id=3)

# Decode from EnCodec codes (use the same bandwidth_id for both calls)
codes = vocos.get_encodec_codes(audio, bandwidth_id=3)
decoded_audio = vocos.decode_from_codes(codes, bandwidth_id=3)
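
The meaning of `bandwidth_id` is not spelled out here. In the original Vocos and EnCodec releases, the 24 kHz model supports target bandwidths of 1.5, 3, 6, and 12 kbps, and `bandwidth_id` is an index into that list. The sketch below assumes vocos-mlx follows the same convention and shows the arithmetic for how many codebooks and code frames to expect. The constants and the helper names `num_codebooks` and `num_frames` are illustrative assumptions, not part of the vocos-mlx API.

```python
import math

# Assumed to match the original Vocos EnCodec model; not confirmed by this README.
BANDWIDTHS_KBPS = [1.5, 3.0, 6.0, 12.0]
FRAME_RATE_HZ = 75    # EnCodec 24 kHz emits 75 code frames per second
CODEBOOK_BITS = 10    # each codebook has 1024 entries, i.e. 10 bits per code


def num_codebooks(bandwidth_id: int) -> int:
    """Number of residual codebooks needed to reach the target bandwidth."""
    bits_per_second = int(BANDWIDTHS_KBPS[bandwidth_id] * 1000)
    bits_per_codebook_per_second = FRAME_RATE_HZ * CODEBOOK_BITS  # 750
    return bits_per_second // bits_per_codebook_per_second


def num_frames(num_samples: int, sample_rate: int = 24_000) -> int:
    """Approximate number of code frames for a waveform of the given length."""
    hop = sample_rate // FRAME_RATE_HZ  # 320 samples per frame
    return math.ceil(num_samples / hop)


if __name__ == "__main__":
    for i, kbps in enumerate(BANDWIDTHS_KBPS):
        print(f"bandwidth_id={i}: {kbps} kbps, {num_codebooks(i)} codebooks")
```

Under these assumptions, `bandwidth_id = 3` corresponds to 12 kbps and 16 codebooks, so one second of audio would yield a 16 by 75 grid of codes.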

Appreciation

Thanks to Awni Hannun for the reference EnCodec implementation for MLX.

Citations

@article{siuzdak2023vocos,
  title={Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis},
  author={Siuzdak, Hubert},
  journal={arXiv preprint arXiv:2306.00814},
  year={2023}
}

License

The code in this repository is released under the MIT license as found in the LICENSE file.
