Audiogen Codec
Audiogen Codec (agc)
We are announcing the open source release of Audiogen Codec (agc) 🎉, a low-compression 48 kHz stereo neural audio codec for general audio that optimizes for audio fidelity 🎵.
It comes in two flavors:
- agc-continuous 🔄 KL-regularized, 32 channels, 100 Hz.
- agc-discrete 🔢 24 stages of residual vector quantization, 50 Hz.
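To make "24 stages of residual vector quantization" concrete, here is a minimal pure-Python sketch of the general technique. It is not the agc implementation: the codebooks are random toy values, and the names `rvq_encode`, `rvq_decode`, `DIM`, and `ENTRIES` are hypothetical. Each stage picks the nearest entry in its own codebook and passes the leftover residual to the next stage, so one latent frame becomes one integer code per stage.

```python
import random

def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )

def rvq_encode(codebooks, vec):
    """Quantize vec stage by stage; return one code per stage plus the final residual."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen entry; the next stage quantizes what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, residual

def rvq_decode(codebooks, codes):
    """Reconstruct by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

# Toy setup (assumed values): 24 stages, 16 entries per codebook, 4-dim vectors.
rng = random.Random(0)
DIM, STAGES, ENTRIES = 4, 24, 16
codebooks = [
    [[rng.gauss(0, 1) * 0.5 ** s for _ in range(DIM)] for _ in range(ENTRIES)]
    for s in range(STAGES)
]

x = [rng.gauss(0, 1) for _ in range(DIM)]
codes, residual = rvq_encode(codebooks, x)
x_hat = rvq_decode(codebooks, codes)
print(len(codes), codes[:4])
```

With 24 stages, every frame is described by 24 codebook indices, which is presumably where the 24 in the discrete latent shape comes from.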
AGC (Audiogen Codec) is a convolutional autoencoder based on the DAC (Descript Audio Codec) architecture, which is the state of the art among neural audio codecs 🏆. We found that training with EMA (exponential moving average) and adding a perceptual loss term computed on CLAP features improved performance. Because they compress less aggressively, these codecs outperform Meta's EnCodec and DAC on general audio, as validated in internal blind pairwise comparisons scored with Elo ratings 🎲.
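As an illustration of the EMA idea mentioned above, the sketch below keeps a slowly moving average of a parameter next to the "live" value. It assumes EMA here refers to averaging model weights, which is the common usage but is not spelled out in this README; the function name `ema_update`, the decay value, and the fake training loop are all hypothetical.

```python
def ema_update(ema_params, params, decay=0.999):
    """Blend the current parameters into the running average, in place."""
    for name, value in params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
    return ema_params

# Toy "training run": a single scalar weight that grows by 1 each step.
params = {"w": 0.0}
ema = dict(params)  # the average starts as a copy of the initial weights

for step in range(1, 101):
    params["w"] = float(step)  # stand-in for an optimizer update
    ema_update(ema, params, decay=0.9)

# The averaged weight trails the live weight and smooths out step-to-step noise.
print(params["w"], ema["w"])
```

In practice the averaged copy, not the live weights, is usually the one used for evaluation and release.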
We trained codecs with (relatively) very low compression to address a core issue in general music and audio generation: low acoustic quality and audible artifacts, which hinder industry adoption of these models 🚫🎶. We hope to encourage researchers to build hierarchical generative audio models that can efficiently use long-sequence representations without sacrificing semantic abilities 🧠.
This codec will power Audiogen's upcoming models. Stay tuned! 🚀
Installation
pip install audiogen-agc
Usage
import torch
from agc import AGC

agc = AGC.from_pretrained("Audiogen/agc-continuous") # or "agc-discrete"
audio = torch.randn(1, 2, 480000) # 10 s of 48 kHz stereo: (batch, channels, samples)
z = agc.encode(audio) # (1, 32, 6000) or (1, 24, 3000)
reconstructed_audio = agc.decode(z) # (1, 2, 480000)
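The shape comments in the usage snippet can be used to estimate how long the latent sequence will be for other clip lengths. The sketch below takes those shapes at face value (480000 samples mapping to 6000 or 3000 frames) and assumes the ratio is constant and that partial frames are dropped; the codec's actual padding behavior is not documented here, and `latent_frames` is a hypothetical helper, not part of the agc package.

```python
REF_SAMPLES = 480000  # input length used in the usage snippet

# Samples per latent frame, derived from the shapes in the usage comments.
SAMPLES_PER_FRAME_CONTINUOUS = REF_SAMPLES // 6000  # 80
SAMPLES_PER_FRAME_DISCRETE = REF_SAMPLES // 3000    # 160

def latent_frames(num_samples, samples_per_frame):
    """Estimated number of latent frames, assuming floor division (no padding)."""
    return num_samples // samples_per_frame

# Example: a 5-second clip at 48 kHz.
clip_samples = 5 * 48000
print(latent_frames(clip_samples, SAMPLES_PER_FRAME_CONTINUOUS))  # 3000
print(latent_frames(clip_samples, SAMPLES_PER_FRAME_DISCRETE))    # 1500
```

Checking the estimate against the shape of a real `agc.encode` output is the safest way to confirm it for a given clip.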
Misc
Example Colab notebook: https://colab.research.google.com/drive/1MXeBYMY-dZ3Yas-5rXzggMONIlDDQ5VG#scrollTo=9mtfSc-r4dkn (credit: Christoph from LAION)
Examples
https://audiogen.notion.site/Audiogen-Codec-Examples-546fe64596f54e20be61deae1c674f20