Descript Audio Codec - MLX
Implementation of the Descript Audio Codec, with the MLX framework.
The Descript Audio Codec (DAC) compresses 44.1 kHz audio into discrete codes at 8 kbps, roughly a 90:1 compression ratio relative to the raw audio, while still producing high-quality reconstructions.
This repository is based on the original PyTorch implementation.
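As a rough sanity check on the compression figure above, the sketch below computes the bitrate of uncompressed PCM audio and compares it to 8 kbps. It assumes 16-bit mono PCM as the "raw audio" baseline, which this README does not state. The second helper estimates the bitrate of a residual vector quantizer; the hop length of 512, the 9 codebooks, and the 1024 entries per codebook are assumptions taken from the DAC paper's 44.1 kHz configuration, not from this package. All function names are hypothetical and are not part of `descript_mlx`.

```python
import math


def pcm_bitrate_bps(sample_rate_hz, bit_depth=16, channels=1):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def compression_ratio(raw_bps, coded_bps):
    """How many times smaller the coded stream is than the raw stream."""
    return raw_bps / coded_bps


def rvq_bitrate_bps(sample_rate_hz, hop_length, n_codebooks, codebook_size):
    """Bitrate of a residual vector quantizer (hypothetical helper).

    Each frame covers `hop_length` samples and emits one index per
    codebook, costing log2(codebook_size) bits per index.
    """
    frames_per_second = sample_rate_hz / hop_length
    bits_per_frame = n_codebooks * math.log2(codebook_size)
    return frames_per_second * bits_per_frame


# Assumed baseline: 44.1 kHz, 16-bit, mono PCM.
raw = pcm_bitrate_bps(44_100)
print(f"raw PCM: {raw / 1000:.1f} kbps")              # raw PCM: 705.6 kbps
print(f"ratio:   {compression_ratio(raw, 8_000):.1f}:1")  # ratio:   88.2:1

# Assumed codec settings (from the paper, not from this package).
coded = rvq_bitrate_bps(44_100, hop_length=512, n_codebooks=9, codebook_size=1024)
print(f"codes:   {coded / 1000:.2f} kbps")            # codes:   7.75 kbps
```

Under these assumptions the ratio comes out near 88:1, which is consistent with the approximate 90:1 figure quoted above.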
Installation
pip install descript-mlx
Usage
You can load a pretrained model from Python like this:
import mlx.core as mx
from descript_mlx import DAC
dac = DAC.from_pretrained("44khz") # or "24khz" / "16khz"
audio = mx.array(...)
# encode into latents and codes
z, codes, latents, commitment_loss, codebook_loss = dac.encode(audio)
# reconstruct from latents/codes to audio
reconstructed_audio = dac.decode(z)
# compress audio to a DAC file
dac_file = dac.compress(audio)
dac_file.save("/path/to/file.dac")
# decompress audio from a DAC file
reconstructed_audio = dac.decompress("/path/to/file.dac")
Citations
@misc{kumar2023highfidelityaudiocompressionimproved,
title={High-Fidelity Audio Compression with Improved RVQGAN},
author={Rithesh Kumar and Prem Seetharaman and Alejandro Luebs and Ishaan Kumar and Kundan Kumar},
year={2023},
eprint={2306.06546},
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2306.06546},
}
License
The code in this repository is released under the MIT license as found in the LICENSE file.