mlx-meralion

MLX-native inference for MERaLiON AudioLLM on Apple Silicon.

MERaLiON is A*STAR's multimodal audio-language model for speech transcription, translation, spoken question answering, and more.

Installation

pip install mlx-meralion

Requires macOS on Apple Silicon (M1/M2/M3/M4) and Python 3.10+.
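
The requirements above can be checked before installing. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper name and is not part of mlx-meralion.

```python
import platform
import sys


def check_environment(system=None, machine=None, version=None):
    """Return a list of problems; an empty list means the stated requirements are met.

    Hypothetical helper (not part of mlx-meralion). Arguments default to the
    values reported by the running interpreter and can be overridden for testing.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    version = sys.version_info[:2] if version is None else tuple(version)

    problems = []
    if system != "Darwin":
        problems.append(f"macOS required, found {system}")
    if machine != "arm64":
        problems.append(f"Apple Silicon (arm64) required, found {machine}")
    if version < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("warning:", problem)
```

Note that a Python interpreter running under Rosetta reports `x86_64` even on Apple Silicon hardware, so a warning here may mean the interpreter, not the machine, is the issue.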

Quick Start

Python API

from mlx_meralion import load_model, transcribe

# Load model (auto-downloads from HuggingFace on first use)
model = load_model("MERaLiON/MERaLiON-2-10B-MLX")  # 10B 8-bit, recommended
# model = load_model("MERaLiON/MERaLiON-2-3B-MLX")   # 3B fp16, smaller

# Transcribe speech
text = transcribe(model, "audio.wav")
print(text)

# Translate to Chinese
text = transcribe(model, "audio.wav", task="translate_zh")

# Spoken question answering
text = transcribe(model, "audio.wav", task="sqa", question="What is the speaker talking about?")

# Summarize dialogue
text = transcribe(model, "audio.wav", task="summarize")
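
To run the same task over several recordings, the calls above can be wrapped in a small loop. This is a sketch, not part of the package: `transcribe_folder` is a hypothetical helper, and it assumes the Python API accepts `task="asr"` in the same way the CLI does. The transcribe function is passed in as an argument so the helper can be tried without loading a model.

```python
from pathlib import Path


def transcribe_folder(model, folder, transcribe_fn, task="asr", pattern="*.wav"):
    """Transcribe every file in `folder` matching `pattern`, in sorted order.

    Hypothetical helper. `transcribe_fn` is expected to behave like
    mlx_meralion.transcribe: transcribe_fn(model, path, task=...) -> str.
    Returns a dict mapping file name to output text.
    """
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = transcribe_fn(model, str(path), task=task)
    return results


# Usage with the real package (requires Apple Silicon and a model download):
#   from mlx_meralion import load_model, transcribe
#   model = load_model("MERaLiON/MERaLiON-2-10B-MLX")
#   for name, text in transcribe_folder(model, "recordings", transcribe).items():
#       print(name, text)
```

Loading the model once and reusing it for every file avoids repeating the load for each recording.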

CLI

# ASR (default task)
mlx-meralion --model MERaLiON/MERaLiON-2-10B-MLX --audio audio.wav --task asr

# Translation
mlx-meralion --model MERaLiON/MERaLiON-2-10B-MLX --audio audio.wav --task translate_zh

# Custom instruction
mlx-meralion --model MERaLiON/MERaLiON-2-10B-MLX --audio audio.wav --instruction "Summarize this in one sentence."
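
The CLI can also be driven from a Python script. The sketch below only assembles the argument list from the flags shown above (`--model`, `--audio`, `--task`, `--instruction`); `build_command` is a hypothetical helper, and whether `--task` and `--instruction` may be combined is not stated here, so the helper passes through whatever it is given.

```python
import shlex


def build_command(audio, model="MERaLiON/MERaLiON-2-10B-MLX", task=None, instruction=None):
    """Build the mlx-meralion argv list (hypothetical helper, flags from the CLI examples)."""
    argv = ["mlx-meralion", "--model", model, "--audio", str(audio)]
    if task is not None:
        argv += ["--task", task]
    if instruction is not None:
        argv += ["--instruction", instruction]
    return argv


cmd = build_command("audio.wav", instruction="Summarize this in one sentence.")
print(shlex.join(cmd))  # shell-quoted form, handy for logging

# To actually run it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting problems with instructions that contain spaces or quotes.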

Supported Tasks

Task             Description
asr              Speech-to-text transcription
translate_zh     Translate to Chinese
translate_id     Translate to Indonesian
translate_ms     Translate to Malay
translate_ta     Translate to Tamil
sqa              Spoken question answering (requires question=)
summarize        Dialogue summarization
paralinguistics  Speaker characteristic analysis
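
Because `sqa` is the only task that needs a `question=`, it can help to validate the task name before calling `transcribe`. The sketch below is illustrative: `SUPPORTED_TASKS` and `task_kwargs` are hypothetical names built from the table above, and rejecting a stray `question` for other tasks is a choice made here, not documented package behaviour.

```python
# Task names and descriptions copied from the Supported Tasks table.
SUPPORTED_TASKS = {
    "asr": "Speech-to-text transcription",
    "translate_zh": "Translate to Chinese",
    "translate_id": "Translate to Indonesian",
    "translate_ms": "Translate to Malay",
    "translate_ta": "Translate to Tamil",
    "sqa": "Spoken question answering",
    "summarize": "Dialogue summarization",
    "paralinguistics": "Speaker characteristic analysis",
}


def task_kwargs(task, question=None):
    """Return keyword arguments for transcribe(), or raise ValueError (hypothetical helper)."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(SUPPORTED_TASKS)}")
    if task == "sqa":
        if not question:
            raise ValueError("task 'sqa' requires a question")
        return {"task": task, "question": question}
    if question is not None:
        # Design choice of this sketch: fail loudly instead of silently dropping it.
        raise ValueError(f"task {task!r} does not take a question")
    return {"task": task}


# Usage: text = transcribe(model, "audio.wav", **task_kwargs("sqa", question="Who is speaking?"))
```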

Available Models

Model               Size    RAM     Quality  HuggingFace
MERaLiON-2-10B-MLX  ~10 GB  16+ GB  Best     MERaLiON/MERaLiON-2-10B-MLX
MERaLiON-2-3B-MLX   ~6 GB   8+ GB   Good     MERaLiON/MERaLiON-2-3B-MLX
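
The RAM column can be turned into a simple selection rule. This is a sketch: `pick_model` is a hypothetical helper, the thresholds are the "16+ GB" and "8+ GB" values from the table, and the amount of installed memory is passed in by the caller (on macOS it can be read with `sysctl -n hw.memsize`, which prints bytes).

```python
# (HuggingFace repo id, minimum RAM in GB), best quality first.
MODELS = [
    ("MERaLiON/MERaLiON-2-10B-MLX", 16),
    ("MERaLiON/MERaLiON-2-3B-MLX", 8),
]


def pick_model(ram_gb):
    """Return the highest-quality model whose RAM requirement fits, or None."""
    for repo_id, min_ram_gb in MODELS:
        if ram_gb >= min_ram_gb:
            return repo_id
    return None


print(pick_model(16))
print(pick_model(8))
```

The result can be passed straight to `load_model`. A return value of `None` means neither model meets its listed RAM requirement.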

Features

  • Apple Silicon native: Runs entirely on MLX with GPU acceleration
  • N-gram blocking: Automatically prevents repetitive output (matching HuggingFace quality)
  • Smart chunking: Long audio is split at 30 s boundaries, and short tails are merged to prevent hallucination
  • Auto-download: HuggingFace models are downloaded and cached automatically
  • Multiple tasks: ASR, translation, QA, summarization, and more
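
The chunking and n-gram blocking features can be illustrated in a few lines. The code below is a sketch of the general techniques, not the package's implementation: the function names are hypothetical, and the 5-second tail threshold, the choice to merge a short tail into the preceding chunk, and the n-gram size of 3 are all assumptions made for the example.

```python
def plan_chunks(duration_s, chunk_s=30.0, min_tail_s=5.0):
    """Split [0, duration_s] into chunk_s-second spans.

    A final span shorter than min_tail_s (assumed threshold) is merged into
    the previous span instead of being decoded on its own.
    """
    spans = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        spans.append((start, end))
        start = end
    if len(spans) > 1 and spans[-1][1] - spans[-1][0] < min_tail_s:
        tail = spans.pop()
        spans[-1] = (spans[-1][0], tail[1])
    return spans


def banned_next_tokens(tokens, n=3):
    """Return the tokens that would complete an n-gram already present in `tokens`.

    A decoder applying n-gram blocking would exclude these from the next step.
    """
    if len(tokens) < n:
        return set()
    prefix = tuple(tokens[len(tokens) - (n - 1):])  # last n-1 generated tokens
    banned = set()
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


print(plan_chunks(62.0))                     # the 2 s tail joins the second chunk
print(banned_next_tokens([1, 2, 3, 1, 2]))   # 3 would repeat the trigram (1, 2, 3)
```

In a real decoder the banned tokens would typically have their logits set to negative infinity before sampling or taking the argmax.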

Architecture

Audio (WAV/MP3/FLAC)
  -> Whisper Encoder (1280-d)
    -> LayerNorm + MLP Adaptor
      -> Speech embeddings merged into text sequence
        -> Gemma2 Decoder -> text output
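
The two middle stages of the pipeline can be sketched in plain Python. This is a toy illustration under stated assumptions, not the model's code: all function names are hypothetical, the ReLU activation and the tiny dimensions are placeholders (the real encoder output is 1280-d), and the use of placeholder token ids to mark where speech embeddings go is an assumption about how the merge is done.

```python
import math


def layer_norm(vec, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def linear(vec, weight, bias):
    """weight is a list of output rows, each with len(vec) entries."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]


def adapt(frame, w1, b1, w2, b2):
    """LayerNorm followed by a two-layer MLP (ReLU is an assumed activation)."""
    hidden = [max(0.0, h) for h in linear(layer_norm(frame), w1, b1)]
    return linear(hidden, w2, b2)


def merge_speech(token_ids, text_embeds, speech_embeds, placeholder_id):
    """Replace the embedding at each placeholder position with the next speech embedding."""
    if token_ids.count(placeholder_id) != len(speech_embeds):
        raise ValueError("number of placeholders must match number of speech embeddings")
    speech = iter(speech_embeds)
    return [next(speech) if tok == placeholder_id else emb
            for tok, emb in zip(token_ids, text_embeds)]


# Toy example: 2-d encoder frames projected to 1-d decoder embeddings.
identity = [[1.0, 0.0], [0.0, 1.0]]
frames = [[1.0, 3.0], [4.0, 2.0]]
speech = [adapt(f, identity, [0.0, 0.0], [[1.0, 1.0]], [0.0]) for f in frames]
merged = merge_speech([5, 99, 99, 7], [[0.1], [0.0], [0.0], [0.2]], speech, placeholder_id=99)
print(len(merged))
```

The merged sequence has the same length as the token sequence, so the decoder sees speech and text embeddings in one stream.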
