Audiogen Codec

Audiogen Codec (agc)

We are announcing the open-source release of Audiogen Codec (agc) 🎉, a low-compression 48 kHz stereo neural audio codec for general audio, optimized for audio fidelity 🎵.

It comes in two flavors:

  • agc-continuous 🔄: KL-regularized, 32 channels, 100 Hz.
  • agc-discrete 🔢: 24 stages of residual vector quantization, 50 Hz.
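
To illustrate what "stages of residual vector quantization" means for agc-discrete, here is a minimal sketch in plain Python. It is not the agc implementation: the function names `nearest`, `rvq_encode` and `rvq_decode` and the tiny two-stage codebooks are hypothetical, chosen only to show the idea that each stage quantizes whatever the previous stages left over.

```python
def nearest(codebook, vector):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(vector, codebooks):
    """Quantize `vector` stage by stage; each stage codes the previous residual."""
    residual = list(vector)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen entry so the next stage only sees what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes, codebooks):
    """Rebuild the vector by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy example with 2 stages (illustrative values only; agc-discrete uses 24 stages).
toy_codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],   # stage 1: coarse entries
    [[0.5, 0.0], [0.0, 0.5]],   # stage 2: finer entries for the residual
]
codes = rvq_encode([1.0, 0.5], toy_codebooks)
print(codes)                             # [0, 1]
print(rvq_decode(codes, toy_codebooks))  # [1.0, 0.5]
```

With one code index per stage, a 24-stage quantizer would emit 24 indices per latent frame, which matches the 24-row shape shown for the discrete model in the Usage section.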

AGC (Audiogen Codec) is a convolutional autoencoder based on the DAC architecture, which holds state-of-the-art results 🏆. We found that training with an exponential moving average (EMA) and adding a perceptual loss term based on CLAP features improved performance. Because they use low compression, these codecs outperform Meta's EnCodec and DAC on general audio, as validated in internal blind ELO games 🎲.
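
As a rough illustration of the EMA idea, assuming it refers to the common practice of keeping an exponential moving average of the model weights during training, the update can be sketched as below. The function name `ema_update`, the dictionary-of-floats weight format and the decay value are hypothetical; the actual training setup is not described here.

```python
def ema_update(ema_weights, weights, decay=0.999):
    """Blend the current weights into the running average, in place.

    `decay` close to 1.0 makes the average change slowly (0.999 is an
    assumed, commonly used value, not one taken from the agc release).
    """
    for name, value in weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * value
    return ema_weights


# Toy example with a single scalar "weight" and a fast decay for readability.
ema = {"w": 0.0}
ema_update(ema, {"w": 1.0}, decay=0.5)
print(ema["w"])  # 0.5
ema_update(ema, {"w": 1.0}, decay=0.5)
print(ema["w"])  # 0.75
```

In this setup the averaged copy, rather than the raw training weights, is typically the one used for evaluation and release.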

We trained (relatively) very low compression codecs to address a core issue in general music and audio generation: low acoustic quality and audible artifacts, which hinder industry use of these models 🚫🎶. Our hope is to encourage researchers to build hierarchical generative audio models that can efficiently use high sequence length representations without sacrificing semantic abilities 🧠.

This codec will power Audiogen's upcoming models. Stay tuned! 🚀

ELO Image
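
For readers unfamiliar with how ratings come out of blind pairwise games, the standard Elo update can be sketched as follows. This is the textbook formula, not necessarily the exact procedure used for the internal evaluation; the function names, the starting rating of 1500 and the K-factor of 32 are assumptions made for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one blind comparison.

    `score_a` is 1.0 if the listener preferred A, 0.0 if B, 0.5 for a tie.
    `k` (assumed value) controls how far a single game moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two codecs start level; a listener prefers codec A in one blind game.
print(elo_update(1500.0, 1500.0, 1.0))  # (1516.0, 1484.0)
```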

Installation

pip install audiogen-agc

Usage

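import torch
from agc import AGC

# Load the pretrained codec
agc = AGC.from_pretrained("Audiogen/agc-continuous") # or "agc-discrete"

# Dummy input shaped (batch, channels, samples): 10 s of 48 kHz stereo
audio = torch.randn(1, 2, 480000)

# Encode to the latent representation
z = agc.encode(audio) # (1, 32, 6000) or (1, 24, 3000)

# Decode back to a waveform
reconstructed_audio = agc.decode(z) # (1, 2, 480000)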
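
Before encoding real recordings, it can help to sanity-check that the input matches the layout used in the snippet above. The following is a small sketch, not part of the agc package: `clip_duration_seconds` is a hypothetical helper, and the (batch, channels, samples) layout is inferred from the example tensor rather than from documented requirements.

```python
SAMPLE_RATE = 48_000  # Hz, per the codec description
CHANNELS = 2          # stereo


def clip_duration_seconds(shape, sample_rate=SAMPLE_RATE, channels=CHANNELS):
    """Validate a (batch, channels, samples) shape and return the clip length in seconds."""
    if len(shape) != 3:
        raise ValueError(f"expected (batch, channels, samples), got {tuple(shape)}")
    _, n_channels, n_samples = shape
    if n_channels != channels:
        raise ValueError(f"expected {channels} channels, got {n_channels}")
    return n_samples / sample_rate


# The example tensor from the usage snippet holds ten seconds of audio.
print(clip_duration_seconds((1, 2, 480000)))  # 10.0
```

With a PyTorch tensor, the same check could be called as `clip_duration_seconds(audio.shape)`.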

Misc

Example colab: https://colab.research.google.com/drive/1MXeBYMY-dZ3Yas-5rXzggMONIlDDQ5VG#scrollTo=9mtfSc-r4dkn (credit: Christoph from LAION)

Examples

https://audiogen.notion.site/Audiogen-Codec-Examples-546fe64596f54e20be61deae1c674f20
