Audiogen Codec
Project description
Audiogen Codec (agc)
We are announcing the open source release of Audiogen Codec (agc) 🎉. A low compression 48khz stereo neural audio codec for general audio, optimizing for audio fidelity 🎵.
It comes in two flavors:
- agc-continuous 🔄 KL regularized, 32 channels, 100hz.
- agc-discrete 🔢 24 stages of residual vector quantization, 50hz.
AGC (Audiogen Codec) is a convolutional autoencoder based on the DAC architecture, which holds SOTA 🏆. We found that training with EMA and adding a perceptual loss term with CLAP features improved performance. These codecs, being low compression, outperform Meta's EnCodec and DAC on general audio as validated from internal blind ELO games 🎲.
We trained (relatively) very low compression codecs in the pursuit of solving a core issue regarding general music and audio generation, low acoustic quality and audible artifacts, which hinder industry use for these models 🚫🎶. Our hope is to encourage researchers to build hierarchical generative audio models that can efficiently use high sequence length representations without sacrificing semantic abilities 🧠.
This codec will power Audiogen's upcoming models. Stay tuned! 🚀
Installation
pip install audiogen-agc
Usage
from agc import AGC
agc = AGC.from_pretrained("Audiogen/agc-continuous") # or "agc-discrete"
audio = torch.randn(1, 2, 480000) # 48khz stereo
z = agc.encode(audio) # (1, 32, 6000) or (1, 24, 3000)
reconstructed_audio = agc.decode(z) # (1, 2, 480000)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for audiogen_agc-0.0.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | eaa8caf079382f14b45179eb7266dc14cd0a66874522416b3ca0ce5d2fe64e63 |
|
MD5 | a194c1e4960ca49da3b364d6ba562ec6 |
|
BLAKE2b-256 | 16e42e4e22f7184f1f2a559c04688476ba776ffbdee3187eb42bd9b9a7261f04 |