An NVIDIA NeMo NanoCodec implementation using MLX
NanoCodec for Apple Silicon
This is an MLX implementation of NVIDIA NeMo NanoCodec, a lightweight neural audio codec.
Model Description
- Architecture: fully convolutional generator neural network and three discriminators. The generator comprises an encoder, followed by vector quantization, and a HiFi-GAN-based decoder.
- Sample Rate: 22.05 kHz
- Framework: MLX
- Parameters: 105M
Installation
# Install the package together with soundfile, which the usage example needs for audio I/O
pip install nanocodec-mlx soundfile
Usage
from nanocodec_mlx.models.audio_codec import AudioCodecModel
import soundfile as sf
import mlx.core as mx
import numpy as np
# Load model from HuggingFace Hub
model = AudioCodecModel.from_pretrained("nineninesix/nemo-nano-codec-22khz-0.6kbps-12.5fps-MLX")
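# Load audio; the model expects 22.05 kHz mono input
audio, sr = sf.read("input.wav", dtype="float32")
assert sr == 22050, f"expected 22050 Hz audio, got {sr} Hz"
if audio.ndim > 1:
    audio = audio.mean(axis=1)  # downmix multi-channel audio to mono
audio_mlx = mx.array(audio, dtype=mx.float32)[None, None, :]  # shape (1, 1, num_samples)
audio_len = mx.array([len(audio)], dtype=mx.int32)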
# Encode and decode
tokens, tokens_len = model.encode(audio_mlx, audio_len)
reconstructed, recon_len = model.decode(tokens, tokens_len)
# Save output
output = np.array(reconstructed[0, 0, :int(recon_len[0])])
sf.write("output.wav", output, 22050)
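The repository name used above encodes the codec's operating point (0.6 kbps at 12.5 frames per second). The sketch below works out what those figures imply per frame at a 22.05 kHz sample rate. The constants are read off the repository name and the model description, and `estimated_frames` is a hypothetical helper: the real frame count depends on how the model pads its input, so treat the `tokens_len` returned by `model.encode` as authoritative.

```python
import math

SAMPLE_RATE_HZ = 22_050  # from the model description
FRAME_RATE_FPS = 12.5    # assumption: taken from "12.5fps" in the repository name
BITRATE_BPS = 600        # assumption: taken from "0.6kbps" in the repository name

# Audio samples covered by one token frame: 22050 / 12.5 = 1764
samples_per_frame = int(SAMPLE_RATE_HZ / FRAME_RATE_FPS)

# Bits available to describe one frame: 600 / 12.5 = 48
bits_per_frame = int(BITRATE_BPS / FRAME_RATE_FPS)


def estimated_frames(num_samples: int) -> int:
    """Rough frame count for a clip, assuming the last partial frame is padded."""
    return math.ceil(num_samples / samples_per_frame)


print(samples_per_frame, bits_per_frame, estimated_frames(SAMPLE_RATE_HZ))
```

Under these assumptions, one second of audio (22,050 samples) maps to 13 frames, since 12.5 rounds up.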
Input
- Input Type: Audio
- Input Format(s): .wav files
- Input Parameters: One-Dimensional (1D)
- Other Properties Related to Input: 22050 Hz Mono-channel Audio
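Because the model expects 22050 Hz mono audio, it can help to inspect a file's header before encoding it. The following is a minimal sketch using only Python's standard `wave` module; `check_wav` is a hypothetical helper, not part of `nanocodec_mlx`, and `wave` reads PCM-encoded WAV files, so other encodings may need `soundfile` instead.

```python
import wave

EXPECTED_RATE_HZ = 22_050  # sample rate listed in the input requirements
EXPECTED_CHANNELS = 1      # mono


def check_wav(path: str) -> list[str]:
    """Return a list of mismatches; an empty list means the header looks fine."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()

    problems = []
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"file has {channels} channels, expected {EXPECTED_CHANNELS}")
    return problems
```

A call such as `check_wav("input.wav")` returns the reasons a file needs resampling or downmixing before it is passed to the model.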
Output
- Output Type: Audio
- Output Format: .wav files
- Output Parameters: One-Dimensional (1D)
- Other Properties Related to Output: 22050 Hz Mono-channel Audio
License
This code is licensed under the Apache License 2.0. See LICENSE for details.
The original NVIDIA NeMo NanoCodec model weights and architecture are developed by NVIDIA Corporation and are licensed under the NVIDIA Open Model License. See NOTICE for attribution.
When using this project, you must comply with both licenses.
Citation
This is an MLX implementation of NVIDIA NeMo NanoCodec. If you use this work, please cite the original:
Download files
File details
Details for the file nanocodec_mlx-0.1.0.tar.gz.
File metadata
- Download URL: nanocodec_mlx-0.1.0.tar.gz
- Upload date:
- Size: 24.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0fca4e603c66ed651f14178794efd9a2579acd2a60784c3033d456ccb7fc0339 |
| MD5 | f8e1fac9e450a098908b15bed81accb7 |
| BLAKE2b-256 | ea5e235086431bf5bd783b87442a0cf3a2d3162c90bd588530af6222c5fdfd65 |
File details
Details for the file nanocodec_mlx-0.1.0-py3-none-any.whl.
File metadata
- Download URL: nanocodec_mlx-0.1.0-py3-none-any.whl
- Upload date:
- Size: 26.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 410fae5f2c6457f016dfbe157adddf03572f83adc9139bc2c47342b4848b6c90 |
| MD5 | 5602de914d25b573a4b21a000b24ee5d |
| BLAKE2b-256 | 928292cc95e01080439d10b2f506bb3283098c523d3802682feadcea8dfd6877 |
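
The published digests can be used to confirm that a manually downloaded file is intact. The sketch below compares a local file against the SHA256 values listed for release 0.1.0 using the standard `hashlib` module; `sha256_of` and `matches_published` are hypothetical helper names, and the local path is whatever location you saved the file to.

```python
import hashlib

# SHA256 digests copied from the file listings for release 0.1.0.
PUBLISHED_SHA256 = {
    "nanocodec_mlx-0.1.0.tar.gz":
        "0fca4e603c66ed651f14178794efd9a2579acd2a60784c3033d456ccb7fc0339",
    "nanocodec_mlx-0.1.0-py3-none-any.whl":
        "410fae5f2c6457f016dfbe157adddf03572f83adc9139bc2c47342b4848b6c90",
}


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, filename: str) -> bool:
    """Compare a local file's SHA256 with the digest published for `filename`."""
    return sha256_of(path) == PUBLISHED_SHA256[filename]
```

For example, `matches_published("nanocodec_mlx-0.1.0.tar.gz", "nanocodec_mlx-0.1.0.tar.gz")` should return `True` for an unmodified copy of the source distribution.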