An NVIDIA NeMo NanoCodec implementation using MLX

NanoCodec for Apple Silicon

This is an MLX implementation of NVIDIA NeMo NanoCodec, a lightweight neural audio codec.

Model Description

  • Architecture: a fully convolutional generator network trained with three discriminators. The generator comprises an encoder, a vector quantizer, and a HiFi-GAN-based decoder.
  • Sample Rate: 22.05 kHz
  • Framework: MLX
  • Parameters: 105M

Installation

# Install the nanocodec-mlx package, plus soundfile for reading and writing .wav files
pip install nanocodec-mlx soundfile

Usage

from nanocodec_mlx.models.audio_codec import AudioCodecModel
import soundfile as sf
import mlx.core as mx
import numpy as np

# Load model from HuggingFace Hub
model = AudioCodecModel.from_pretrained("nineninesix/nemo-nano-codec-22khz-0.6kbps-12.5fps-MLX")

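# Load audio (the model expects 22050 Hz mono; resample or downmix beforehand if needed)
audio, sr = sf.read("input.wav", dtype="float32")
assert sr == 22050 and audio.ndim == 1, "expected 22050 Hz mono audio"

# Add batch and channel dimensions: (samples,) -> (1, 1, samples)
audio_mlx = mx.array(audio, dtype=mx.float32)[None, None, :]

# Number of valid samples for each item in the batch
audio_len = mx.array([len(audio)], dtype=mx.int32)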

# Encode and decode
tokens, tokens_len = model.encode(audio_mlx, audio_len)
reconstructed, recon_len = model.decode(tokens, tokens_len)

# Save output
output = np.array(reconstructed[0, 0, :int(recon_len[0])])
sf.write("output.wav", output, 22050)
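The checkpoint name used above encodes a nominal frame rate of 12.5 fps and a bitrate of 0.6 kbps. As a rough sketch, assuming those figures are exact and that the encoder emits one frame per 22050 / 12.5 = 1764 input samples, the number of frames for a clip can be estimated as shown below. The helper name `estimate_frames` is hypothetical, and rounding a partial last frame up is an assumption rather than documented behavior of the package.

```python
import math

SAMPLE_RATE = 22050   # Hz, from the model description
FRAME_RATE = 12.5     # frames per second, read off the checkpoint name
BITRATE_BPS = 600     # 0.6 kbps, read off the checkpoint name

# Audio samples covered by one codec frame, and bits spent per frame
SAMPLES_PER_FRAME = int(SAMPLE_RATE / FRAME_RATE)
BITS_PER_FRAME = BITRATE_BPS / FRAME_RATE


def estimate_frames(num_samples: int) -> int:
    """Estimate the frame count for a clip (assumes a partial frame rounds up)."""
    return math.ceil(num_samples / SAMPLES_PER_FRAME)


# One second of audio at 22050 Hz
print(SAMPLES_PER_FRAME, BITS_PER_FRAME, estimate_frames(SAMPLE_RATE))
```

Comparing this estimate with the `tokens_len` value returned by `model.encode` is a quick sanity check that the input was loaded at the expected sample rate.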

Input

  • Input Type: Audio
  • Input Format(s): .wav files
  • Input Parameters: One-Dimensional (1D)
  • Other Properties Related to Input: 22050 Hz Mono-channel Audio
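Files that are not already 22050 Hz mono need to be converted before encoding. The following is a minimal sketch of one way to do that with NumPy alone; `prepare_audio` is a hypothetical helper, not part of nanocodec-mlx. It assumes multi-channel audio arrives in the `(frames, channels)` layout that `soundfile.read` returns, and it resamples by plain linear interpolation, which applies no anti-aliasing filter. A dedicated resampling library would give better quality.

```python
import numpy as np

TARGET_SR = 22050  # Hz, the input rate listed above


def prepare_audio(audio: np.ndarray, sr: int, target_sr: int = TARGET_SR) -> np.ndarray:
    """Downmix to mono and resample by linear interpolation (crude, no anti-aliasing)."""
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 2:
        # Average the channels: (frames, channels) -> (frames,)
        audio = audio.mean(axis=1)
    if sr != target_sr:
        n_out = int(round(len(audio) * target_sr / sr))
        # Positions of the output samples, expressed in input-sample units
        x_new = np.arange(n_out) * (sr / target_sr)
        audio = np.interp(x_new, np.arange(len(audio)), audio)
    return audio.astype(np.float32)
```

The returned array is one-dimensional float32, so it can be passed to `mx.array(...)[None, None, :]` exactly as in the usage example.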

Output

  • Output Type: Audio
  • Output Format: .wav files
  • Output Parameters: One-Dimensional (1D)
  • Other Properties Related to Output: 22050 Hz Mono-channel Audio

License

This code is licensed under the Apache License 2.0. See LICENSE for details.

The original NVIDIA NeMo NanoCodec model weights and architecture are developed by NVIDIA Corporation and are licensed under the NVIDIA Open Model License. See NOTICE for attribution.

When using this project, you must comply with both licenses.

Citation

This is an MLX implementation of NVIDIA NeMo NanoCodec. If you use this work, please cite the original NVIDIA NeMo NanoCodec work.
