The DualCodec neural audio codec.
DualCodec
Installation
pip install dualcodec
Available models
- 12hz_v1: DualCodec model trained with a 12Hz frame rate (code frames per second of audio).
- 25hz_v1: DualCodec model trained with a 25Hz frame rate (code frames per second of audio).
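The model names hint at how many code frames each variant emits per second of audio. As a rough sketch, assuming the rates in the names are the codec frame rates and the input is 24kHz audio, the sequence length T can be estimated as follows. `NOMINAL_FRAME_RATE_HZ` and `approx_num_frames` are illustrative names, not part of the `dualcodec` package, and the real models may pad or round differently.

```python
# Nominal frame rates read off the model names (an assumption, not a dualcodec API).
NOMINAL_FRAME_RATE_HZ = {"12hz_v1": 12, "25hz_v1": 25}


def approx_num_frames(num_samples: int, model_id: str, sample_rate: int = 24000) -> int:
    """Estimate the number of code frames T for a clip, rounding up."""
    frame_rate = NOMINAL_FRAME_RATE_HZ[model_id]
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-num_samples * frame_rate // sample_rate)


if __name__ == "__main__":
    ten_seconds = 10 * 24000  # samples in a 10 s clip at 24kHz
    for model_id in NOMINAL_FRAME_RATE_HZ:
        print(model_id, approx_num_frames(ten_seconds, model_id))
```

A lower frame rate gives shorter code sequences for the same audio, which is usually the reason to pick one variant over the other.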
How to run inference
Download the checkpoints to a local directory:
# export HF_ENDPOINT=https://hf-mirror.com # uncomment this to use huggingface mirror if you're in China
huggingface-cli download facebook/w2v-bert-2.0 --local-dir w2v-bert-2.0
huggingface-cli download amphion/dualcodec --local-dir dualcodec_ckpts
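The same downloads can be scripted from Python. This is a minimal sketch that assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The `CHECKPOINTS` mapping and `download_checkpoints` helper are illustrative names, and the repo IDs and directories simply mirror the commands above.

```python
# Repo ID -> local directory, mirroring the CLI commands.
CHECKPOINTS = {
    "facebook/w2v-bert-2.0": "w2v-bert-2.0",
    "amphion/dualcodec": "dualcodec_ckpts",
}


def download_checkpoints(checkpoints=CHECKPOINTS):
    """Download each repo snapshot into its local directory and return the paths."""
    # Imported lazily so the mapping can be inspected without the dependency.
    from huggingface_hub import snapshot_download

    return [
        snapshot_download(repo_id=repo_id, local_dir=local_dir)
        for repo_id, local_dir in checkpoints.items()
    ]


if __name__ == "__main__":
    for path in download_checkpoints():
        print("downloaded to", path)
```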
To run inference on an audio file in a Python script:
import dualcodec
w2v_path = "./w2v-bert-2.0" # your downloaded path
dualcodec_model_path = "./dualcodec_ckpts" # your downloaded path
model_id = "12hz_v1" # or "25hz_v1"
dualcodec_model = dualcodec.get_model(model_id, dualcodec_model_path)
inference = dualcodec.Inference(dualcodec_model=dualcodec_model, dualcodec_path=dualcodec_model_path, w2v_path=w2v_path, device="cuda")
# do inference for your wav
import torchaudio
audio, sr = torchaudio.load("YOUR_WAV.wav") # path to your input file
# resample to 24kHz
audio = torchaudio.functional.resample(audio, sr, 24000)
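# downmix to mono so multi-channel files are not concatenated, then shape to [1, 1, T]
audio = audio.mean(dim=0, keepdim=True).reshape(1,1,-1)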
# extract codes
semantic_codes, acoustic_codes = inference.encode(audio, n_quantizers=8)
# semantic_codes shape: torch.Size([1, 1, T])
# acoustic_codes shape: torch.Size([1, n_quantizers-1, T])
# produce output audio
out_audio = dualcodec_model.decode_from_codes(semantic_codes, acoustic_codes)
# save output audio
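# torchaudio.save expects a 2-D (channels, time) CPU tensor, so move to CPU and drop the batch dimension
torchaudio.save("out.wav", out_audio.cpu().squeeze(0), 24000)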
See "example.ipynb" for example inference scripts.
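As a quick sanity check on the shapes noted in the comments above, the following sketch validates the code shapes returned by `encode`. It works on plain shape tuples, so it runs without a GPU or checkpoints. `check_code_shapes` is a hypothetical helper, not part of `dualcodec`; it only encodes the layout documented in the example, with the batch size taken from the semantic codes.

```python
def check_code_shapes(semantic_shape, acoustic_shape, n_quantizers):
    """Return the frame count T if the shapes match the documented layout.

    Documented layout: semantic [B, 1, T], acoustic [B, n_quantizers - 1, T].
    """
    if len(semantic_shape) != 3 or len(acoustic_shape) != 3:
        raise ValueError("expected 3-D shapes for both code tensors")
    batch, semantic_layers, frames = semantic_shape
    if semantic_layers != 1:
        raise ValueError(f"expected 1 semantic layer, got {semantic_layers}")
    expected_acoustic = (batch, n_quantizers - 1, frames)
    if tuple(acoustic_shape) != expected_acoustic:
        raise ValueError(
            f"expected acoustic shape {expected_acoustic}, got {tuple(acoustic_shape)}"
        )
    return frames


if __name__ == "__main__":
    # Shapes as they would appear for n_quantizers=8 and T=120 frames.
    print(check_code_shapes((1, 1, 120), (1, 7, 120), n_quantizers=8))
```

With real tensors, the call would look like `check_code_shapes(tuple(semantic_codes.shape), tuple(acoustic_codes.shape), n_quantizers=8)`.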
Training DualCodec
Stay tuned for the training code release!