The DualCodec neural audio codec.

DualCodec

Installation

pip install dualcodec

Available models

  • 12hz_v1: DualCodec model that produces codes at a 12 Hz frame rate (input audio is 24 kHz).
  • 25hz_v1: DualCodec model that produces codes at a 25 Hz frame rate (input audio is 24 kHz).

Running inference

Download the checkpoints to a local directory:

# export HF_ENDPOINT=https://hf-mirror.com      # uncomment to use the Hugging Face mirror if you are in China
huggingface-cli download facebook/w2v-bert-2.0 --local-dir w2v-bert-2.0
huggingface-cli download amphion/dualcodec --local-dir dualcodec_ckpts
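
The same downloads can also be scripted from Python. The following is a minimal sketch, assuming the `huggingface_hub` package (which provides `huggingface-cli`) is installed. The names `CHECKPOINTS`, `download_checkpoints`, and the `downloader` parameter are hypothetical and introduced here only for illustration; they are not part of DualCodec.

```python
from typing import Callable, Dict, Optional

# Repositories and target directories taken from the CLI commands above.
CHECKPOINTS: Dict[str, str] = {
    "facebook/w2v-bert-2.0": "w2v-bert-2.0",
    "amphion/dualcodec": "dualcodec_ckpts",
}


def download_checkpoints(
    downloader: Optional[Callable[..., str]] = None,
) -> Dict[str, str]:
    """Download every repo in CHECKPOINTS and return {repo_id: local_dir}.

    `downloader` is a hypothetical hook (useful for dry runs); when it is
    omitted, huggingface_hub.snapshot_download is used.
    """
    if downloader is None:
        # Imported lazily so this module loads without huggingface_hub.
        from huggingface_hub import snapshot_download

        downloader = snapshot_download

    local_dirs: Dict[str, str] = {}
    for repo_id, local_dir in CHECKPOINTS.items():
        downloader(repo_id=repo_id, local_dir=local_dir)
        local_dirs[repo_id] = local_dir
    return local_dirs
```

Calling `download_checkpoints()` should leave the two directories in the current working directory, matching the paths used in the inference script. To use a mirror, set `HF_ENDPOINT` in the environment before starting Python.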

To run inference on an audio file in a Python script:

import dualcodec

w2v_path = "./w2v-bert-2.0" # your downloaded path
dualcodec_model_path = "./dualcodec_ckpts" # your downloaded path
model_id = "12hz_v1" # or "25hz_v1"

dualcodec_model = dualcodec.get_model(model_id, dualcodec_model_path)
inference = dualcodec.Inference(dualcodec_model=dualcodec_model, dualcodec_path=dualcodec_model_path, w2v_path=w2v_path, device="cuda")

# do inference for your wav
import torchaudio
audio, sr = torchaudio.load("YOUR_WAV.wav")  # path to your input file
# resample to 24kHz
audio = torchaudio.functional.resample(audio, sr, 24000)
audio = audio.reshape(1,1,-1)
# extract codes
semantic_codes, acoustic_codes = inference.encode(audio, n_quantizers=8)
# semantic_codes shape: torch.Size([1, 1, T])
# acoustic_codes shape: torch.Size([1, n_quantizers-1, T])

# produce output audio
out_audio = dualcodec_model.decode_from_codes(semantic_codes, acoustic_codes)

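# save output audio; torchaudio.save expects a 2-D (channels, time) CPU tensor,
# so drop the batch dimension (assuming out_audio has shape (1, 1, T))
torchaudio.save("out.wav", out_audio.detach().cpu().squeeze(0), 24000)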

See "example.ipynb" for example inference scripts.
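
As a sanity check on the shapes noted in the script above, the number of code frames T can be estimated from the clip length. The sketch below is a rough estimate, not DualCodec's own logic: it assumes that "12hz" and "25hz" denote codes per second and that T is the clip duration times the frame rate, rounded up. The helper name `expected_code_shapes` is hypothetical, and the exact rate and rounding should be checked against the tensors the model actually returns.

```python
import math
from typing import Tuple

SAMPLE_RATE = 24_000  # the script above resamples input audio to 24 kHz

Shape = Tuple[int, int, int]


def expected_code_shapes(
    num_samples: int,
    frame_rate_hz: float,
    n_quantizers: int = 8,
) -> Tuple[Shape, Shape]:
    """Estimate (semantic_codes.shape, acoustic_codes.shape) for one clip.

    Assumption: T = ceil(duration_seconds * frame_rate_hz). The real
    codec may pad or round differently, so treat T as approximate.
    """
    if num_samples <= 0 or frame_rate_hz <= 0 or n_quantizers < 1:
        raise ValueError("num_samples, frame_rate_hz and n_quantizers must be positive")

    num_frames = math.ceil(num_samples * frame_rate_hz / SAMPLE_RATE)
    # One semantic codebook plus (n_quantizers - 1) acoustic codebooks,
    # mirroring the shape comments in the inference script.
    return (1, 1, num_frames), (1, n_quantizers - 1, num_frames)
```

For example, under these assumptions a 10-second clip (240,000 samples at 24 kHz) passed to `25hz_v1` with `n_quantizers=8` would give shapes `(1, 1, 250)` and `(1, 7, 250)`.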

Training DualCodec

Stay tuned for the training code release!

Download files

Source Distribution

dualcodec-0.1.1.tar.gz (7.2 MB)

Built Distribution

dualcodec-0.1.1-py3-none-any.whl (26.7 kB)

File details

Details for the file dualcodec-0.1.1.tar.gz.

File metadata

  • Size: 7.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.2

File hashes

Hashes for dualcodec-0.1.1.tar.gz
Algorithm Hash digest
SHA256 b74ed757943b5d4c551d51235cf74482e2cffc2c9f5290ae810718c58320e012
MD5 962cf6caaa53a86277656322b95a8b1c
BLAKE2b-256 019928511545918cc3fdb91e65b989530767e9b4bb958854d965e9c16c7e75b1
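
A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `verify_sha256` are hypothetical and introduced here for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, this would be called as `verify_sha256("dualcodec-0.1.1.tar.gz", "b74ed757943b5d4c551d51235cf74482e2cffc2c9f5290ae810718c58320e012")`, assuming the archive is in the current directory.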

File details

Details for the file dualcodec-0.1.1-py3-none-any.whl.

File metadata

  • Size: 26.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.2

File hashes

Hashes for dualcodec-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 b98dab008c794cf87c981cb49051f28e0a63e545e966db73ad9e38ca3e4ab73c
MD5 9dd9dcfb90551512b4ef0a0879c1fb60
BLAKE2b-256 b82cd0c639e8003f1a9ca5c5f475eda6dbee66150ada5fc9a5e8ec6d75e429f8
