MLX-Audio is a package for inference of text-to-speech (TTS) and speech-to-speech (STS) models locally on your Mac using MLX

MLX-Audio

An audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon.

Features

  • Fast inference optimized for Apple Silicon (M series chips)
  • Multiple model architectures for TTS, STT, and STS
  • Multilingual support across models
  • Voice customization and cloning capabilities
  • Adjustable speech speed control
  • Interactive web interface with 3D audio visualization
  • OpenAI-compatible REST API
  • Quantization support (3-bit, 4-bit, 6-bit, 8-bit, and more) for optimized performance
  • Swift package for iOS/macOS integration

Installation

Using pip

pip install mlx-audio

Using uv to install only the command line tools

Latest release from PyPI:

uv tool install --force mlx-audio --prerelease=allow

Latest code from GitHub:

uv tool install --force git+https://github.com/Blaizzy/mlx-audio.git --prerelease=allow

For development or web interface:

git clone https://github.com/Blaizzy/mlx-audio.git
cd mlx-audio
pip install -e ".[dev, server]"

Quick Start

Command Line

# Basic TTS generation
mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit --text 'Hello, world!' --voice Chelsie

# With a different voice and language hint
mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit --text 'Welcome to MLX-Audio!' --voice Ethan --lang_code English

# Play audio immediately
mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit --text 'Hello!' --voice Chelsie --play

# Save to a specific directory
mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit --text 'Hello!' --voice Chelsie --output_path ./my_audio

# Stream audio during generation
mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit --text 'Hello!' --voice Chelsie --stream

# Stream audio during generation and save it to disk
mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit --text 'Hello!' --voice Chelsie --stream --save

# Join multiple generated segments into one file
mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit --text $'Hello!\nHow are you?' --voice Chelsie --join_audio

By default, when generation yields multiple segments, mlx-audio saves numbered files such as audio_000.wav and audio_001.wav. Use --join_audio to save one combined file instead. When using --stream, add --save to write the streamed audio to disk.
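
The difference between numbered segment files and one joined file can be pictured with a small standard-library sketch. This is not mlx-audio's implementation: `segment_filename`, `join_segments`, and `write_wav` are hypothetical helpers, the naming pattern is inferred from the file names mentioned above, and the 24 kHz sample rate and float samples in [-1, 1] are assumptions.

```python
import struct
import wave


def segment_filename(index, prefix="audio"):
    """Return a numbered name such as audio_000.wav (pattern assumed from the text above)."""
    return f"{prefix}_{index:03d}.wav"


def join_segments(segments):
    """Concatenate per-segment sample lists into one list, preserving order."""
    joined = []
    for segment in segments:
        joined.extend(segment)
    return joined


def write_wav(path, samples, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM (sample rate is an assumption)."""
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


# Two stand-in segments; real ones would come from the model
segments = [[0.0, 0.5], [-0.5, 0.25]]
names = [segment_filename(i) for i in range(len(segments))]
combined = join_segments(segments)
```

Calling `write_wav(name, segment)` once per segment mirrors the default numbered output, while `write_wav("audio.wav", combined)` mirrors what `--join_audio` is described as doing.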

Python API

from mlx_audio.tts.utils import load_model

# Load model
model = load_model("mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit")

# Generate speech
for result in model.generate(
    "Hello from MLX-Audio!",
    voice="Chelsie",
    lang_code="English",
):
    print(f"Generated {result.audio.shape[0]} samples")
    # result.audio contains the waveform as mx.array

Supported Models

Text-to-Speech (TTS)

| Model | Description | Languages | Repo |
| --- | --- | --- | --- |
| Kokoro | Fast, high-quality multilingual TTS | EN, JA, ZH, FR, ES, IT, PT, HI | bf16, 8bit, 6bit, 4bit |
| KittenTTS | Compact KittenTTS 0.8 models for edge-friendly TTS | EN | nano, micro, mini, collection |
| Qwen3-TTS | Alibaba's multilingual TTS with voice design | ZH, EN, JA, KO, + more | mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16 |
| Higgs Audio v3 | 4B conversational TTS with voice cloning and inline control tokens | 100 languages | bosonai/higgs-audio-v3-tts-4b |
| OmniVoice | Zero-shot multilingual TTS with voice cloning, batch generation, and nonverbal tags | 646+ languages | mlx-community/OmniVoice-bf16 |
| CSM / MisoTTS | Sesame-style conversational speech models with voice cloning | EN | mlx-community/csm-1b, MisoTTS bf16, MisoTTS 8bit |
| Dia | Dialogue-focused TTS | EN | mlx-community/Dia-1.6B-fp16 |
| OuteTTS | Efficient TTS model | EN | mlx-community/OuteTTS-1.0-0.6B-fp16 |
| Spark | SparkTTS model | EN, ZH | mlx-community/Spark-TTS-0.5B-bf16 |
| Chatterbox | Expressive multilingual TTS | EN, ES, FR, DE, IT, PT, PL, TR, RU, NL, CS, AR, ZH, JA, HU, KO | mlx-community/chatterbox-fp16 |
| Soprano | High-quality TTS | EN | mlx-community/Soprano-1.1-80M-bf16 |
| Ming Omni TTS (BailingMM) | Multimodal generation with voice cloning, style control, and speech/music/event generation | EN, ZH | mlx-community/Ming-omni-tts-16.8B-A3B-bf16 |
| Ming Omni TTS (Dense) | Lightweight dense Ming Omni variant for voice cloning and style control | EN, ZH | mlx-community/Ming-omni-tts-0.5B-bf16 |
| KugelAudio | SOTA 7B AR+Diffusion TTS for European languages | EN, DE, FR, ES, IT, PT, NL, PL, RU, UK, + 14 more | kugelaudio/kugelaudio-0-open |
| Voxtral TTS | Mistral's 4B multilingual TTS (20 voices, 9 languages) | EN, FR, ES, DE, IT, PT, NL, AR, HI | mlx-community/Voxtral-4B-TTS-2603-mlx-bf16 |
| LongCat-AudioDiT | SOTA diffusion TTS in waveform latent space with voice cloning | ZH, EN | mlx-community/LongCat-AudioDiT-1B-bf16 |
| MeloTTS | Lightweight VITS2-based TTS with streaming | EN (more coming) | mlx-community/MeloTTS-English-MLX |
| MOSS-TTS | 8B delay-pattern and 1.7B local-transformer multilingual TTS with voice cloning | 31 languages | OpenMOSS-Team/MOSS-TTS-v1.5, OpenMOSS-Team/MOSS-TTS, OpenMOSS-Team/MOSS-TTS-Local-Transformer |
| MOSS-TTS-Nano | Tiny multilingual voice-cloning TTS | 20 languages | mlx-community/MOSS-TTS-Nano-100M |
| Higgs Audio v2 | 3B Llama-backed TTS with real-time voice cloning | EN, ZH, KO, DE, ES | bf16 (upstream), q8, q6 |

Speech-to-Text (STT)

| Model | Description | Languages | Repo |
| --- | --- | --- | --- |
| Whisper | OpenAI's robust STT model | 99+ languages | mlx-community/whisper-large-v3-turbo-asr-fp16 |
| Distil-Whisper | Distilled fast Whisper variants | EN | distil-whisper/distil-large-v3 |
| Qwen3-ASR | Alibaba's multilingual ASR | ZH, EN, JA, KO, + more | mlx-community/Qwen3-ASR-1.7B-8bit |
| Mega-ASR | Routed Qwen3-ASR with automatic clean/base vs degraded/LoRA switching | EN (fixtures), multilingual Qwen3-ASR backbone | README |
| Qwen3-ForcedAligner | Word-level audio alignment | ZH, EN, JA, KO, + more | mlx-community/Qwen3-ForcedAligner-0.6B-8bit |
| Parakeet | NVIDIA's accurate STT | EN (v2), 25 EU languages (v3) | mlx-community/parakeet-tdt-0.6b-v3 |
| Nemotron 3.5 ASR (streaming) | NVIDIA's cache-aware streaming FastConformer-RNNT with language-ID prompting | 40 language-locales | mlx-community/nemotron-3.5-asr-streaming-0.6b · README |
| Voxtral | Mistral's speech model | Multiple | mlx-community/Voxtral-Mini-3B-2507-bf16 |
| Voxtral Realtime | Mistral's 4B streaming STT | Multiple | 4bit, fp16 |
| VibeVoice-ASR | Microsoft's 9B ASR with diarization & timestamps | Multiple | mlx-community/VibeVoice-ASR-bf16 |
| Canary | NVIDIA's multilingual ASR with translation | 25 EU + RU, UK | README |
| Moonshine | Useful Sensors' lightweight ASR | EN | README |
| MMS | Meta's massively multilingual ASR with adapters | 1000+ | README |
| Granite Speech | IBM's ASR + speech translation | EN, FR, DE, ES, PT, JA | README |
| Qwen2-Audio | Alibaba's multimodal audio understanding (ASR, captioning, emotion, translation) | Multiple | mlx-community/Qwen2-Audio-7B-Instruct-4bit |

Voice Activity Detection / Speaker Diarization (VAD)

| Model | Description | Languages | Repo |
| --- | --- | --- | --- |
| Silero VAD | Lightweight speech/non-speech detection with streaming state | Language-agnostic | mlx-community/silero-vad |
| Sortformer v1 | NVIDIA's end-to-end speaker diarization (up to 4 speakers) | Language-agnostic | mlx-community/diar_sortformer_4spk-v1-fp32 |
| Sortformer v2.1 | NVIDIA's streaming speaker diarization with AOSC compression | Language-agnostic | mlx-community/diar_streaming_sortformer_4spk-v2.1-fp32 |

See the model READMEs for API details, streaming examples, and conversion steps.

Speech-to-Speech (STS)

| Model | Description | Use Case | Repo |
| --- | --- | --- | --- |
| SAM-Audio | Text-guided source separation | Extract specific sounds | mlx-community/sam-audio-large |
| Liquid2.5-Audio* | Speech-to-Speech, Text-to-Speech and Speech-to-Text | Speech interactions | mlx-community/LFM2.5-Audio-1.5B-8bit |
| MossFormer2 SE | Speech enhancement | Noise removal | starkdmi/MossFormer2_SE_48K_MLX |
| DeepFilterNet (1/2/3) | Speech enhancement | Noise suppression | mlx-community/DeepFilterNet-mlx |

Model Examples

Qwen3-TTS

Alibaba's state-of-the-art multilingual TTS with voice cloning, emotion control, and voice design capabilities.

from mlx_audio.tts.utils import load_model

model = load_model("mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16")
results = list(model.generate(
    text="Hello, welcome to MLX-Audio!",
    voice="Chelsie",
    language="English",
))

audio = results[0].audio  # mx.array

See the Qwen3-TTS README for voice cloning, CustomVoice, VoiceDesign, and all available models.

OmniVoice

OmniVoice is a zero-shot multilingual TTS model for 646+ languages with voice cloning, batch generation, pronunciation controls, and nonverbal tags such as [laughter] and [sigh]. It uses a bidirectional Qwen3 backbone with iterative masked generation and a HiggsAudioV2 acoustic tokenizer.

from mlx_audio.tts.utils import load_model

model = load_model("mlx-community/OmniVoice-bf16")

# Basic multilingual TTS
for result in model.generate(
    text="Hello from OmniVoice running on Apple Silicon.",
    language="english",
    duration_s=5.0,
    num_steps=32,
):
    audio = result.audio

# Zero-shot voice cloning
for result in model.generate(
    text="This sentence uses the reference speaker.",
    language="english",
    ref_audio="reference.wav",
    ref_text="Transcript of the reference audio.",
    duration_s=5.0,
):
    audio = result.audio

For stable voice cloning, provide ref_text that matches the reference clip. OmniVoice also supports generate_batch() for batched TTS and inline pronunciation controls.

Ming Omni TTS (BailingMM)

mlx_audio.tts.generate \
    --model mlx-community/Ming-omni-tts-16.8B-A3B-bf16 \
    --prompt "Please generate speech based on the following description.\n" \
    --text "This is a quick Ming Omni test." \
    --lang_code en \
    --output_path audio_io \
    --file_prefix ming_basic \
    --verbose

See the Ming Omni TTS README for CLI and Python cookbook examples, and the Ming Omni Dense README for the mlx-community/Ming-omni-tts-0.5B-bf16 workflow.

Kokoro TTS

Kokoro is a fast, multilingual TTS model with 54 voice presets.

from mlx_audio.tts.utils import load_model

model = load_model("mlx-community/Kokoro-82M-bf16")
# Or use a quantized variant for lower memory usage:
# model = load_model("mlx-community/Kokoro-82M-8bit")
# model = load_model("mlx-community/Kokoro-82M-4bit")

# Generate with different voices
for result in model.generate(
    text="Welcome to MLX-Audio!",
    voice="af_heart",  # American female
    speed=1.0,
    lang_code="a"  # American English
):
    audio = result.audio

Available Voices:

  • American English: af_heart, af_bella, af_nova, af_sky, am_adam, am_echo, etc.
  • British English: bf_alice, bf_emma, bm_daniel, bm_george, etc.
  • Japanese: jf_alpha, jm_kumo, etc.
  • Chinese: zf_xiaobei, zm_yunxi, etc.

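Kokoro requires pip install misaki for text processing. Japanese and Mandarin may additionally require pip install "misaki[ja]" or pip install "misaki[zh]". Quote the extras on macOS, because the default zsh shell otherwise treats the square brackets as a glob pattern.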

Language Codes:

| Code | Language | Note |
| --- | --- | --- |
| a | American English | Default; requires pip install misaki |
| b | British English | Requires pip install misaki |
| j | Japanese | Requires pip install misaki[ja] |
| z | Mandarin Chinese | Requires pip install misaki[zh] |
| e | Spanish | Requires pip install misaki |
| f | French | Requires pip install misaki |
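
The language-code table can be kept next to your own code as a small lookup. The sketch below is illustrative only: `KOKORO_LANG_CODES`, `lang_code_for_voice`, and `install_hint` are hypothetical names, and the rule that a preset's first letter matches its language code is an assumption inferred from the voice names listed above (af_heart, bf_alice, jf_alpha, zf_xiaobei), not a documented guarantee.

```python
# Language code -> (language, package to install), copied from the table above
KOKORO_LANG_CODES = {
    "a": ("American English", "misaki"),
    "b": ("British English", "misaki"),
    "j": ("Japanese", "misaki[ja]"),
    "z": ("Mandarin Chinese", "misaki[zh]"),
    "e": ("Spanish", "misaki"),
    "f": ("French", "misaki"),
}


def lang_code_for_voice(voice):
    """Guess lang_code from a preset's first letter (assumed naming pattern)."""
    code = voice[:1]
    if code not in KOKORO_LANG_CODES:
        raise ValueError(f"Cannot infer a language code from voice {voice!r}")
    return code


def install_hint(lang_code):
    """Return the language name and the pip command its row in the table asks for."""
    language, package = KOKORO_LANG_CODES[lang_code]
    return f'{language}: pip install "{package}"'


print(install_hint(lang_code_for_voice("jf_alpha")))
```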

CSM (Voice Cloning)

Clone any voice using a reference audio sample:

mlx_audio.tts.generate \
    --model mlx-community/csm-1b \
    --text "Hello from Sesame." \
    --ref_audio ./reference_voice.wav \
    --play

Whisper STT

from mlx_audio.stt.generate import generate_transcription

result = generate_transcription(
    model="mlx-community/whisper-large-v3-turbo-asr-fp16",
    audio="audio.wav",
)
print(result.text)

Qwen3-ASR & ForcedAligner

Alibaba's multilingual speech models for transcription and word-level alignment.

from mlx_audio.stt import load

# Speech recognition
model = load("mlx-community/Qwen3-ASR-0.6B-8bit")
result = model.generate("audio.wav", language="English")
print(result.text)

# Word-level forced alignment
aligner = load("mlx-community/Qwen3-ForcedAligner-0.6B-8bit")
result = aligner.generate("audio.wav", text="I have a dream", language="English")
for item in result:
    print(f"[{item.start_time:.2f}s - {item.end_time:.2f}s] {item.text}")
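
Word-level timings like these are often turned into subtitles. As a rough sketch, the helper below formats items that expose `start_time`, `end_time`, and `text` (the attribute names used in the loop above) as SRT text. `srt_timestamp` and `to_srt` are hypothetical helpers, and the `SimpleNamespace` objects are stand-ins for real aligner output.

```python
from types import SimpleNamespace


def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm, the timestamp style used by SRT."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(items):
    """Build SRT text from items exposing start_time, end_time, and text."""
    blocks = []
    for index, item in enumerate(items, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(item.start_time)} --> {srt_timestamp(item.end_time)}\n"
            f"{item.text}\n"
        )
    return "\n".join(blocks)


# Stand-in items; real ones would come from aligner.generate(...)
items = [
    SimpleNamespace(start_time=0.0, end_time=0.4, text="I"),
    SimpleNamespace(start_time=0.4, end_time=0.75, text="have"),
]
srt_text = to_srt(items)
print(srt_text)
```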

See the Qwen3-ASR README for CLI usage, all models, and more examples.

VibeVoice-ASR

Microsoft's 9B parameter speech-to-text model with speaker diarization and timestamps. Supports long-form audio (up to 60 minutes) and outputs structured JSON.

from mlx_audio.stt.utils import load

model = load("mlx-community/VibeVoice-ASR-bf16")

# Basic transcription
result = model.generate(audio="meeting.wav", max_tokens=8192, temperature=0.0)
print(result.text)
# [{"Start":0,"End":5.2,"Speaker":0,"Content":"Hello everyone, let's begin."},
#  {"Start":5.5,"End":9.8,"Speaker":1,"Content":"Thanks for joining today."}]

# Access parsed segments
for seg in result.segments:
    print(f"[{seg['start_time']:.1f}-{seg['end_time']:.1f}] Speaker {seg['speaker_id']}: {seg['text']}")
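
Because `result.text` is shown above as a JSON array, it can also be post-processed with the standard `json` module. The sketch below sums speaking time per speaker from the sample string in the comments above. `speaking_time_by_speaker` is a hypothetical helper, and it assumes the `Start`, `End`, and `Speaker` keys appear exactly as in that sample.

```python
import json

# Sample string copied from the commented output above
raw_text = (
    '[{"Start":0,"End":5.2,"Speaker":0,"Content":"Hello everyone, let\'s begin."},'
    ' {"Start":5.5,"End":9.8,"Speaker":1,"Content":"Thanks for joining today."}]'
)


def speaking_time_by_speaker(text):
    """Sum End - Start per Speaker from the JSON transcript string."""
    totals = {}
    for entry in json.loads(text):
        duration = entry["End"] - entry["Start"]
        totals[entry["Speaker"]] = totals.get(entry["Speaker"], 0.0) + duration
    return totals


totals = speaking_time_by_speaker(raw_text)
for speaker, seconds in sorted(totals.items()):
    print(f"Speaker {speaker}: {seconds:.1f}s")
```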

Streaming transcription:

# Stream tokens as they are generated
for text in model.stream_transcribe(audio="speech.wav", max_tokens=4096):
    print(text, end="", flush=True)

With context (hotwords/metadata):

result = model.generate(
    audio="technical_talk.wav",
    context="MLX, Apple Silicon, PyTorch, Transformer",
    max_tokens=8192,
    temperature=0.0,
)

CLI usage:

# Basic transcription
python -m mlx_audio.stt.generate \
    --model mlx-community/VibeVoice-ASR-bf16 \
    --audio meeting.wav \
    --output-path output \
    --format json \
    --max-tokens 8192 \
    --verbose

# With context/hotwords
python -m mlx_audio.stt.generate \
    --model mlx-community/VibeVoice-ASR-bf16 \
    --audio technical_talk.wav \
    --output-path output \
    --format json \
    --max-tokens 8192 \
    --context "MLX, Apple Silicon, PyTorch, Transformer" \
    --verbose

Parakeet (Multilingual STT)

NVIDIA's high-accuracy speech-to-text model. Parakeet v3 supports 25 European languages.

from mlx_audio.stt.utils import load

# Load the multilingual v3 model
model = load("mlx-community/parakeet-tdt-0.6b-v3")

# Transcribe audio
result = model.generate("audio.wav")
print(f"Text: {result.text}")

# Access sentence-level timestamps
for sentence in result.sentences:
    print(f"[{sentence.start:.2f}s - {sentence.end:.2f}s] {sentence.text}")

Streaming transcription:

for chunk in model.generate("long_audio.wav", stream=True):
    print(chunk.text, end="", flush=True)

Supported languages (v3): Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish, Russian, Ukrainian

CLI usage:

python -m mlx_audio.stt.generate \
    --model mlx-community/parakeet-tdt-0.6b-v3 \
    --audio speech.wav \
    --output-path output \
    --format json \
    --verbose

KugelAudio

SOTA open-source 7B TTS model for 24 European languages, based on Microsoft VibeVoice. Uses a hybrid AR + Diffusion architecture (Qwen2.5 LM + SDE-DPM-Solver++ diffusion head + VAE decoder).

from mlx_audio.tts.utils import load_model

model = load_model("kugelaudio/kugelaudio-0-open")

for result in model.generate(
    text="Hello, welcome to MLX-Audio!",
    cfg_scale=3.0,       # Classifier-free guidance (1.0=fast, 3.0=quality)
    ddpm_steps=10,       # Diffusion steps (5=fast, 10=balanced, 20=max quality)
):
    audio = result.audio  # mx.array, 24kHz

The model loads directly from HuggingFace (weights are remapped automatically via sanitize()). To quantize or save in a pre-converted format:

python -m mlx_audio.convert \
    --hf-path kugelaudio/kugelaudio-0-open \
    --mlx-path ./kugelaudio-0-open-bf16 \
    --dtype bfloat16

Supported languages (24): English, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Russian, Ukrainian, Czech, Romanian, Hungarian, Swedish, Danish, Finnish, Norwegian, Greek, Bulgarian, Slovak, Croatian, Serbian, Turkish

Note: Requires ~17GB memory (7B params in bfloat16). Pre-encoded voice presets (voice cloning) are not yet available in the upstream model, so the model generates speech with a default voice.
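
As a back-of-the-envelope check on the memory figure, the weights alone account for most of it: 7 billion parameters at 16 bits each come to 14 GB. The sketch below shows that arithmetic. `weight_bytes` is a hypothetical helper, and the 4-bit line is only a rough lower bound, since quantized formats also store scales and other metadata. The gap between 14 GB and ~17 GB presumably covers activations and runtime buffers; the source does not break it down.

```python
def weight_bytes(num_params, bits_per_weight):
    """Bytes needed to store the weights alone, ignoring any overhead."""
    return num_params * bits_per_weight / 8


params = 7e9  # "7B params" from the note above

bf16_gb = weight_bytes(params, 16) / 1e9     # bfloat16 uses 16 bits per weight
four_bit_gb = weight_bytes(params, 4) / 1e9  # excludes quantization scales and biases

print(f"bfloat16 weights: {bf16_gb:.1f} GB")
print(f"4-bit weights (rough lower bound): {four_bit_gb:.1f} GB")
```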

LongCat-AudioDiT

SOTA diffusion-based TTS operating in the waveform latent space. Uses Conditional Flow Matching with a DiT backbone and WAV-VAE codec at 24kHz. Supports zero-shot voice cloning.

from mlx_audio.tts.utils import load

model = load("mlx-community/LongCat-AudioDiT-1B-bf16")

# Zero-shot TTS
result = next(model.generate("Hello, this is a test of AudioDiT."))
audio = result.audio  # mx.array, 24kHz

# Voice cloning (use "apg" guidance for best similarity)
result = next(model.generate(
    text="Today is warm turning to rain.",
    ref_audio="reference.wav",
    ref_text="Transcript of the reference audio.",
    guidance_method="apg",
    cfg_strength=4.0,
    steps=16,
))

See the LongCat-AudioDiT README for all parameters and CLI usage.

Voxtral TTS

Mistral's 4B multilingual text-to-speech with 20 voice presets across 9 languages.

from mlx_audio.tts.utils import load

model = load("mlx-community/Voxtral-4B-TTS-2603-mlx-bf16")

for result in model.generate(text="Hello, how are you today?", voice="casual_male"):
    print(result.audio_duration)

Voices: casual_male, casual_female, cheerful_female, neutral_male, neutral_female, fr_male, fr_female, es_male, es_female, de_male, de_female, it_male, it_female, pt_male, pt_female, nl_male, nl_female, ar_male, hi_male, hi_female

Voxtral Realtime

Mistral's 4B parameter streaming speech-to-text model, optimized for low-latency transcription.

Available variants: 4bit (smaller/faster) | fp16 (full precision)

from mlx_audio.stt.utils import load

# Use 4bit for faster inference, fp16 for full precision
model = load("mlx-community/Voxtral-Mini-4B-Realtime-2602-4bit")

# Transcribe audio
result = model.generate("audio.wav")
print(result.text)

# Streaming transcription
for chunk in model.generate("audio.wav", stream=True):
    print(chunk, end="", flush=True)

# Adjust transcription delay (lower = faster but less accurate)
result = model.generate("audio.wav", transcription_delay_ms=240)

MedASR (Medical Transcription)

Specialized model for medical terms and dictation.

from mlx_audio.stt.utils import load, transcribe

model = load("mlx-community/medasr")
result = transcribe("medical_dictation.wav", model=model)
print(result["text"])

Live Transcription Example:

# Continuous live transcription with VAD
python examples/medasr_live.py

SAM-Audio (Source Separation)

Separate specific sounds from audio using text prompts:

from mlx_audio.sts import SAMAudio, SAMAudioProcessor, save_audio

model = SAMAudio.from_pretrained("mlx-community/sam-audio-large")
processor = SAMAudioProcessor.from_pretrained("mlx-community/sam-audio-large")

batch = processor(
    descriptions=["A person speaking"],
    audios=["mixed_audio.wav"],
)

result = model.separate_long(
    batch.audios,
    descriptions=batch.descriptions,
    anchors=batch.anchor_ids,
    chunk_seconds=10.0,
    overlap_seconds=3.0,
    ode_opt={"method": "midpoint", "step_size": 2/32},
)

save_audio(result.target[0], "voice.wav")
save_audio(result.residual[0], "background.wav")
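
To build intuition for `chunk_seconds` and `overlap_seconds`, the sketch below computes overlapping time windows over a clip. It illustrates overlapped chunking in general and is not taken from `separate_long`, whose internal windowing and blending may differ. `chunk_windows` is a hypothetical helper.

```python
def chunk_windows(total_seconds, chunk_seconds=10.0, overlap_seconds=3.0):
    """Return (start, end) windows in seconds that cover the whole clip."""
    if not 0 <= overlap_seconds < chunk_seconds:
        raise ValueError("overlap_seconds must be >= 0 and < chunk_seconds")
    if total_seconds <= 0:
        return []
    # Each window starts this far after the previous one
    hop = chunk_seconds - overlap_seconds
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_seconds, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break
        start += hop
    return windows


windows = chunk_windows(24.0)
for start, end in windows:
    print(f"{start:.1f}s - {end:.1f}s")
```

With the values from the example above, a 24-second clip yields three windows that each share 3 seconds with their neighbor. In this simple scheme, a larger overlap means more windows to process for the same clip.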

MossFormer2 (Speech Enhancement)

Remove noise from speech recordings:

from mlx_audio.sts import MossFormer2SEModel, save_audio

model = MossFormer2SEModel.from_pretrained("starkdmi/MossFormer2_SE_48K_MLX")
enhanced = model.enhance("noisy_speech.wav")
save_audio(enhanced, "clean.wav", 48000)

Web Interface & API Server

MLX-Audio includes a modern web interface and OpenAI-compatible API.

Starting the Server

# Start API server
mlx_audio.server --host 0.0.0.0 --port 8000

# Start web UI (in another terminal)
cd mlx_audio/ui
npm install && npm run dev

API Endpoints

Text-to-Speech (OpenAI-compatible):

curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "mlx-community/Kokoro-82M-bf16", "input": "Hello!", "voice": "af_heart"}' \
  --output speech.wav
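
The same request can be issued from Python with only the standard library. This sketch mirrors the curl call above (same path, JSON fields, and output file). `build_speech_request` and `save_speech` are hypothetical helpers, and the base URL assumes the server started with the command shown earlier.

```python
import json
import urllib.request


def build_speech_request(text, voice="af_heart",
                         model="mlx-community/Kokoro-82M-bf16",
                         base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, out_path="speech.wav", **kwargs):
    """Send the request and write the response body to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out_file:
        out_file.write(response.read())


# Building the request needs no running server; sending it does
request = build_speech_request("Hello!")
```

With the server running, `save_speech("Hello!")` should write the returned audio to speech.wav, matching the `--output` flag in the curl example.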

Speech-to-Text:

curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -F "model=mlx-community/whisper-large-v3-turbo-asr-fp16"

Quantization

Reduce model size and improve performance with quantization using the convert script:

# Convert and quantize to 4-bit
python -m mlx_audio.convert \
    --hf-path prince-canuma/Kokoro-82M \
    --mlx-path ./Kokoro-82M-4bit \
    --quantize \
    --q-bits 4 \
    --upload-repo username/Kokoro-82M-4bit  # optional: uploads the converted model to Hugging Face

# Convert with MXFP4 quantization
python -m mlx_audio.convert \
    --hf-path prince-canuma/Kokoro-82M \
    --mlx-path ./Kokoro-82M-mxfp4 \
    --quantize \
    --q-mode mxfp4

# Convert with specific dtype (bfloat16)
python -m mlx_audio.convert \
    --hf-path prince-canuma/Kokoro-82M \
    --mlx-path ./Kokoro-82M-bf16 \
    --dtype bfloat16 \
    --upload-repo username/Kokoro-82M-bf16  # optional: uploads the converted model to Hugging Face

Options:

| Flag | Description |
| --- | --- |
| --hf-path | Source Hugging Face model or local path |
| --mlx-path | Output directory for converted model |
| -q, --quantize | Enable quantization |
| --q-bits | Bits per weight (optional, defaults depend on --q-mode) |
| --q-group-size | Group size for quantization (optional, defaults depend on --q-mode) |
| --q-mode | Quantization mode: affine, mxfp4, mxfp8, nvfp4 |
| --dtype | Weight dtype: float16, bfloat16, float32 |
| --upload-repo | Upload converted model to HF Hub |
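
When converting several variants in a script, it can help to assemble the command from the flags above instead of hand-editing shell lines. The sketch below only builds the argument list and does not run it. `build_convert_command` is a hypothetical helper that uses no flags beyond those listed in the table.

```python
import shlex
import sys


def build_convert_command(hf_path, mlx_path, quantize=False, q_bits=None,
                          q_mode=None, dtype=None, upload_repo=None):
    """Assemble argv for `python -m mlx_audio.convert` from the flags listed above."""
    command = [sys.executable, "-m", "mlx_audio.convert",
               "--hf-path", hf_path, "--mlx-path", mlx_path]
    if quantize:
        command.append("--quantize")
        if q_bits is not None:
            command += ["--q-bits", str(q_bits)]
        if q_mode is not None:
            command += ["--q-mode", q_mode]
    if dtype is not None:
        command += ["--dtype", dtype]
    if upload_repo is not None:
        command += ["--upload-repo", upload_repo]
    return command


command = build_convert_command(
    "prince-canuma/Kokoro-82M", "./Kokoro-82M-4bit", quantize=True, q_bits=4
)
# Printed without the interpreter path, which varies by machine
print(shlex.join(command[1:]))
```

Passing the list to `subprocess.run(command, check=True)` would run the conversion, assuming mlx-audio is installed in the same interpreter.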

Swift

Looking for Swift/iOS support? Check out mlx-audio-swift for on-device TTS using MLX on macOS and iOS.

Requirements

  • Python 3.10+
  • Apple Silicon Mac (M1/M2/M3/M4)
  • MLX framework
  • ffmpeg (required for MP3/FLAC/OGG/Opus/Vorbis audio encoding)

Installing ffmpeg

ffmpeg is required for saving audio in MP3, FLAC, OGG, Opus, or Vorbis format. Install it using:

# macOS (using Homebrew)
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

WAV format works without ffmpeg.
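
A script that saves audio can check this up front. The sketch below encodes the rule stated above: WAV always works, and the listed compressed formats need ffmpeg on the PATH. `can_encode` and `FFMPEG_FORMATS` are hypothetical names, and treating any format not mentioned here as unsupported is an assumption of the sketch.

```python
import shutil

# Formats the text above says require ffmpeg
FFMPEG_FORMATS = {"mp3", "flac", "ogg", "opus", "vorbis"}


def can_encode(audio_format, ffmpeg_available=None):
    """Return True if audio_format can be saved under the ffmpeg rule above."""
    fmt = audio_format.lower()
    if fmt == "wav":
        return True  # WAV works without ffmpeg
    if fmt not in FFMPEG_FORMATS:
        return False  # formats not mentioned in the text are treated as unsupported here
    if ffmpeg_available is None:
        # Look for an ffmpeg executable on the PATH
        ffmpeg_available = shutil.which("ffmpeg") is not None
    return ffmpeg_available


for fmt in ("wav", "mp3", "flac"):
    print(fmt, can_encode(fmt))
```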

License

MIT License

Citation

@misc{mlx-audio,
  author = {Canuma, Prince},
  title = {MLX Audio},
  year = {2025},
  howpublished = {\url{https://github.com/Blaizzy/mlx-audio}},
  note = {Audio processing library for Apple Silicon with TTS, STT, and STS capabilities.}
}
