MusicGen + AudioGen on Apple Silicon via MLX — full AudioCraft for M-series Macs, no CUDA needed

mlx-audiocraft

MusicGen + AudioGen on Apple Silicon via MLX — the first full AudioCraft port for M-series Macs. No CUDA, no server, no Docker. Just pip install and generate.

# Generate sound effects (NEW — not in any other MLX package)
audiogen-mlx "keyboard typing, office ambience" -d 5 -o sfx.wav

# Generate music
musicgen-mlx "upbeat cinematic tech promo, piano, 120 BPM, no vocals" -d 30 -o music.wav

What's unique about this package

Package	MusicGen	AudioGen (SFX)	MLX (Apple GPU)
`audiocraft` (Meta)	✅	✅	❌
`musicgen-mlx`	✅	❌	✅
`mlx-audiocraft` (this)	✅	✅	✅

AudioGen on MLX is new. This is the first port that brings text-to-sound-effects to Apple Silicon with hardware acceleration.

How it works (for learners)

Here's the mental model you need:

Your text prompt
      ↓
  T5 Encoder          ← reads your text, runs on CPU (PyTorch)
      ↓
  Transformer LM      ← generates audio "tokens", runs on MLX (Apple GPU)
      ↓
  EnCodec Decoder     ← turns tokens into a waveform, runs on MLX
      ↓
  WAV file

MusicGen and AudioGen share the same pipeline; they differ mainly in their training data (music vs. sound effects). This is why porting AudioGen mostly meant adding an audiogen.py that inherits the existing pipeline.
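
As a rough sketch of the data flow in the diagram above, the toy code below wires the three stages together with stand-in functions. Nothing here is the package's real API: encode_text, generate_tokens, decode_tokens, the 50 Hz token frame rate, the 4 codebooks, and the 2048-entry vocabulary are illustrative assumptions. Only the stage order and the two sample rates come from this README.

```python
import random

# Hypothetical constants for illustration only; check the real model configs.
FRAME_RATE_HZ = 50   # assumed token frames per second of audio
NUM_CODEBOOKS = 4    # assumed parallel token streams per frame
VOCAB_SIZE = 2048    # assumed number of distinct token ids per codebook


def encode_text(prompt: str) -> list[int]:
    """Stand-in for the T5 encoder: map each word to a fake integer id."""
    return [sum(ord(c) for c in word) % 1000 for word in prompt.split()]


def generate_tokens(conditioning: list[int], duration_s: float) -> list[list[int]]:
    """Stand-in for the transformer LM: one list of token ids per codebook."""
    rng = random.Random(sum(conditioning))  # deterministic for a given prompt
    steps = int(duration_s * FRAME_RATE_HZ)
    return [[rng.randrange(VOCAB_SIZE) for _ in range(steps)]
            for _ in range(NUM_CODEBOOKS)]


def decode_tokens(tokens: list[list[int]], sample_rate: int) -> list[float]:
    """Stand-in for the EnCodec decoder: expand each frame into silent samples."""
    samples_per_frame = sample_rate // FRAME_RATE_HZ
    steps = len(tokens[0])
    return [0.0] * (steps * samples_per_frame)


def run_pipeline(prompt: str, duration_s: float, sample_rate: int) -> list[float]:
    """Text -> conditioning -> tokens -> waveform, in the order of the diagram."""
    conditioning = encode_text(prompt)
    tokens = generate_tokens(conditioning, duration_s)
    return decode_tokens(tokens, sample_rate)


# Same pipeline, different sample rate: 32 kHz for music, 16 kHz for sound effects.
music = run_pipeline("calm lo-fi beat", duration_s=2, sample_rate=32_000)
sfx = run_pipeline("dog barking", duration_s=2, sample_rate=16_000)
print(len(music), len(sfx))
```

The point of the sketch is that only the inputs change between the two calls; the three stages and their order stay the same.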

MLX is Apple's own ML framework, optimised for the unified memory in M-series chips: the CPU, GPU, and Neural Engine all share the same RAM, so MLX arrays do not need to be copied between the CPU and the GPU.

Install

pip install mlx-audiocraft

Requirements: macOS 13+, Apple Silicon (M1/M2/M3/M4), Python 3.10+

Quick start

Sound effects (AudioGen)

from mlx_audiocraft import AudioGen

model = AudioGen.get_pretrained("facebook/audiogen-medium")
model.set_generation_params(duration=5)

wav = model.generate(["dog barking in a park"])
# wav shape: [batch, channels, samples]

Music (MusicGen)

from mlx_audiocraft import MusicGen

model = MusicGen.get_pretrained("facebook/musicgen-small")
model.set_generation_params(duration=30)

wav = model.generate(["calm lo-fi beat, soft piano, vinyl crackle"])

CLI

# Sound effects
audiogen-mlx "crowd applause, conference room" -d 5
audiogen-mlx "rain on a window, thunder in the distance" -d 8 -o rain.wav

# Music
musicgen-mlx "epic orchestral soundtrack" -m facebook/musicgen-large -d 20
musicgen-mlx "funky disco groove" "ambient pad wide reverb" -d 10  # batch
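
If you want to drive the CLI from a script, one option is to build the argument list in Python first. The helper below is a hypothetical sketch, not part of the package: build_command is an invented name, and it uses only the flags that appear in this README (-d, -o, -m). It builds the command without running it.

```python
import shlex


def build_command(tool, prompts, duration=None, output=None, model=None):
    """Build an argv list for audiogen-mlx or musicgen-mlx.

    Hypothetical helper: only the flags shown in this README are supported.
    """
    if tool not in ("audiogen-mlx", "musicgen-mlx"):
        raise ValueError(f"unknown tool: {tool}")
    if not prompts:
        raise ValueError("at least one prompt is required")

    argv = [tool, *prompts]
    if model is not None:
        argv += ["-m", model]
    if duration is not None:
        argv += ["-d", str(duration)]
    if output is not None:
        argv += ["-o", output]
    return argv


cmd = build_command(
    "audiogen-mlx",
    ["rain on a window, thunder in the distance"],
    duration=8,
    output="rain.wav",
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

On a machine with the package installed, the list can be passed to subprocess.run(cmd, check=True). Passing a list instead of a single string avoids shell-quoting problems with prompts that contain commas and spaces.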

Models

AudioGen

Model	Size	Download	Sample Rate
`facebook/audiogen-medium`	1.5B	~3.6 GB	16 kHz

MusicGen

Model	Size	Download	Sample Rate
`facebook/musicgen-small`	300M	~1.2 GB	32 kHz
`facebook/musicgen-medium`	1.5B	~3.2 GB	32 kHz
`facebook/musicgen-large`	3.3B	~6.5 GB	32 kHz
`facebook/musicgen-stereo-small`	300M	~1.2 GB	32 kHz stereo
`facebook/musicgen-stereo-medium`	1.5B	~3.2 GB	32 kHz stereo

Models download automatically from Hugging Face on first use and are cached in ~/.cache/huggingface/.

Benchmark (M4 Max, 64 GB)

Run python benchmarks/run_benchmarks.py to generate results for your machine.

Model	Audio duration	Generation time	Realtime factor
audiogen-medium	5s	~8s	0.6x
musicgen-small	10s	~8s	1.3x
musicgen-medium	10s	~17s	0.6x
musicgen-large	10s	~35s	0.3x

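The realtime factor (last column) is the audio duration divided by the generation time. Values above 1.0x mean generation is faster than realtime; in this table only musicgen-small is.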
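
The arithmetic behind the last column can be reproduced in a few lines. This is a minimal sketch using the approximate timings from the table above; realtime_factor is a name chosen here for illustration and is not taken from the benchmark script.

```python
def realtime_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Seconds of audio produced per second of wall-clock generation time."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


# (model, audio duration in s, generation time in s), copied from the table
rows = [
    ("audiogen-medium", 5, 8),
    ("musicgen-small", 10, 8),
    ("musicgen-medium", 10, 17),
    ("musicgen-large", 10, 35),
]

for name, audio_s, gen_s in rows:
    factor = realtime_factor(audio_s, gen_s)
    verdict = "faster" if factor > 1.0 else "slower"
    print(f"{name}: {factor:.2f}x ({verdict} than realtime)")
```

Because the table timings are approximate, the computed values can differ slightly from the rounded figures in the table.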

Prompt guide

Sound effects (AudioGen)

Be literal and specific:

"keyboard typing, subtle office background noise"
"notification chime, clean and bright"
"crowd applause, conference room, 3 seconds"
"rain falling on a metal roof, distant thunder"
"coffee machine brewing, kitchen ambience"

Music (MusicGen)

Include style, instrumentation, and BPM, and always end with ", no vocals":

"upbeat cinematic tech promo, clean piano with electronic pads, 120 BPM, no vocals"
"calm educational background, soft piano and ambient pads, 75 BPM, no vocals"
"energetic SaaS launch, modern synths, punchy drums, 120 BPM, no vocals"
"Hindi classical influence, sitar and tabla, meditative, 60 BPM, no vocals"
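
If you generate many music prompts, a small helper can keep them in the shape recommended above. This is a hypothetical convenience function, not part of the package; music_prompt and its parameter names are invented for illustration.

```python
def music_prompt(style: str, instrumentation: str, bpm: int) -> str:
    """Join style, instrumentation, and BPM, and always end with 'no vocals'."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    parts = [style.strip(), instrumentation.strip(), f"{bpm} BPM", "no vocals"]
    return ", ".join(parts)


prompt = music_prompt(
    style="calm educational background",
    instrumentation="soft piano and ambient pads",
    bpm=75,
)
print(prompt)
```

The result can be passed directly to model.generate([prompt]) or to musicgen-mlx on the command line.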

Save output

import numpy as np
import soundfile as sf

# `model` is the AudioGen or MusicGen instance from the Quick start section
wav = model.generate(["your prompt"])
audio = np.array(wav[0]).T          # [channels, samples] → [samples, channels]
if audio.ndim == 2 and audio.shape[1] == 1:
    audio = audio[:, 0]             # single channel: drop the channel axis → [samples]
sf.write("output.wav", audio, model.sample_rate)
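
If soundfile is not available, the standard library's wave module can write 16-bit PCM instead. The sketch below is an alternative to the snippet above, not what the package does internally. write_wav_16bit is a hypothetical helper, and it assumes float samples in the range [-1.0, 1.0] laid out as [channels][samples] (for example np.array(wav[0]).tolist()).

```python
import array
import math
import os
import sys
import tempfile
import wave


def write_wav_16bit(path, channels, sample_rate):
    """Write float samples in [-1.0, 1.0] as interleaved 16-bit PCM.

    `channels` is a list of per-channel sample lists: [channels][samples].
    """
    num_frames = len(channels[0])
    if any(len(ch) != num_frames for ch in channels):
        raise ValueError("all channels must have the same length")

    pcm = array.array("h")  # signed 16-bit integers
    for i in range(num_frames):
        for ch in channels:  # interleave frames: L0 R0 L1 R1 ...
            clipped = max(-1.0, min(1.0, ch[i]))
            pcm.append(int(clipped * 32767))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(len(channels))
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Demo with a one-second 440 Hz test tone instead of model output.
sample_rate = 16_000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
out_path = os.path.join(tempfile.gettempdir(), "tone_demo.wav")
write_wav_16bit(out_path, [tone], sample_rate)

with wave.open(out_path, "rb") as check:
    info = (check.getnchannels(), check.getframerate(), check.getnframes())
print(info)
```

Values outside [-1.0, 1.0] are clipped rather than rescaled, so normalise the audio first if the model output can exceed that range.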

Architecture deep-dive (for learners)

If you want to understand how this works under the hood, here's a reading order:

1. mlx_audiocraft/models/genmodel.py: BaseGenModel, the base class all models inherit. Understand generate(), _prepare_tokens_and_attributes(), and _generate_tokens().
2. mlx_audiocraft/models/audiogen.py: our AudioGen port. It's tiny (~90 lines) because it just inherits BaseGenModel and points at AudioGen's weights. Good first file to read.
3. mlx_audiocraft/models/musicgen.py: MusicGen adds melody conditioning on top of BaseGenModel. Compare with audiogen.py to see the diff.
4. mlx_audiocraft/models/loaders.py: how model weights are downloaded from Hugging Face and converted from PyTorch format to MLX.
5. mlx_audiocraft/modules/transformer.py: the MLX transformer implementation. This is the core of the language model.
6. mlx_audiocraft/models/encodec.py: the audio codec (compress waveform → tokens, decode tokens → waveform).
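
Before opening those files, it may help to see the inheritance idea in miniature. The classes below are toy stand-ins written for this explanation, not the package's code: they borrow the class and method names from this README, but the bodies, the default duration, and generate_with_melody are assumptions made for illustration.

```python
class BaseGenModel:
    """Toy stand-in for the shared pipeline (not the real implementation)."""

    sample_rate = 0          # set by each subclass
    default_checkpoint = ""  # set by each subclass

    def __init__(self, checkpoint):
        self.checkpoint = checkpoint
        self.duration = 10.0  # assumed default, for illustration only

    @classmethod
    def get_pretrained(cls, name=None):
        # The real loader downloads and converts weights; this only records the name.
        return cls(name or cls.default_checkpoint)

    def set_generation_params(self, duration):
        self.duration = duration

    def generate(self, prompts):
        # Shared flow for every subclass: one silent placeholder waveform per
        # prompt, laid out as [batch][channels][samples].
        num_samples = int(self.duration * self.sample_rate)
        return [[[0.0] * num_samples] for _ in prompts]


class AudioGen(BaseGenModel):
    """Inherits everything; only the weights and sample rate differ."""

    sample_rate = 16_000
    default_checkpoint = "facebook/audiogen-medium"


class MusicGen(BaseGenModel):
    """Same base pipeline, plus an extra conditioning path."""

    sample_rate = 32_000
    default_checkpoint = "facebook/musicgen-small"

    def generate_with_melody(self, prompts, melody):
        # Hypothetical placeholder for melody conditioning.
        raise NotImplementedError("illustration only")


sfx_model = AudioGen.get_pretrained()
sfx_model.set_generation_params(duration=5)
batch = sfx_model.generate(["dog barking in a park", "rain on a window"])
print(len(batch), len(batch[0]), len(batch[0][0]))
```

In this toy version AudioGen overrides nothing but two class attributes, which mirrors why the real audiogen.py can stay small.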

Attribution

The MusicGen MLX engine is based on musicgen-mlx by Andrade Olivier. The original AudioCraft library is by Meta AI Research.

The AudioGen MLX port is original work in this repository.

License

MIT — see LICENSE.

The pre-trained model weights (facebook/audiogen-medium, facebook/musicgen-*) are released under the CC-BY-NC 4.0 licence by Meta.

Project details

Maintainers

theashishmaurya

Release history

This version

0.1.0

Apr 24, 2026

Download files

Source Distribution

mlx_audiocraft-0.1.0.tar.gz (53.8 kB view details)

Uploaded Apr 24, 2026 Source

Built Distribution

mlx_audiocraft-0.1.0-py3-none-any.whl (65.4 kB view details)

Uploaded Apr 24, 2026 Python 3

File details

Details for the file mlx_audiocraft-0.1.0.tar.gz.

File metadata

Download URL: mlx_audiocraft-0.1.0.tar.gz
Upload date: Apr 24, 2026
Size: 53.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mlx_audiocraft-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`dc3e7a90b542661a66dc54f38832dc1f6dd07ce6dea42279ae6986cc76144451`
MD5	`7285c68439e00f6cffe325d45a2abb53`
BLAKE2b-256	`5303c80492e75a98a9c8dcc247b8bd5ca32b6ae7db2d0f5849f8b1287c885740`

Provenance

The following attestation bundles were made for mlx_audiocraft-0.1.0.tar.gz:

Publisher: publish.yml on theashishmaurya/mlx-audiocraft

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlx_audiocraft-0.1.0.tar.gz
- Subject digest: dc3e7a90b542661a66dc54f38832dc1f6dd07ce6dea42279ae6986cc76144451
- Sigstore transparency entry: 1371028638
- Sigstore integration time: Apr 24, 2026
Source repository:
- Permalink: theashishmaurya/mlx-audiocraft@70205ef0416706a4f27d0f1c2b10b4f7ea6520d8
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/theashishmaurya
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@70205ef0416706a4f27d0f1c2b10b4f7ea6520d8
- Trigger Event: release

File details

Details for the file mlx_audiocraft-0.1.0-py3-none-any.whl.

File metadata

Download URL: mlx_audiocraft-0.1.0-py3-none-any.whl
Upload date: Apr 24, 2026
Size: 65.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mlx_audiocraft-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e0851dcc63b707ba681a498359d599c8d855ee02e0ee7ebd2e2e2864c204b801`
MD5	`f740fce5505adbf4d583cc58f0e98bfc`
BLAKE2b-256	`0692d11b5f9d06d2209d50665ff3005810a29fb6cdd4d0e5767664c3ab3edf0c`

Provenance

The following attestation bundles were made for mlx_audiocraft-0.1.0-py3-none-any.whl:

Publisher: publish.yml on theashishmaurya/mlx-audiocraft

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlx_audiocraft-0.1.0-py3-none-any.whl
- Subject digest: e0851dcc63b707ba681a498359d599c8d855ee02e0ee7ebd2e2e2864c204b801
- Sigstore transparency entry: 1371028722
- Sigstore integration time: Apr 24, 2026
Source repository:
- Permalink: theashishmaurya/mlx-audiocraft@70205ef0416706a4f27d0f1c2b10b4f7ea6520d8
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/theashishmaurya
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@70205ef0416706a4f27d0f1c2b10b4f7ea6520d8
- Trigger Event: release
