Mega-ASR: fine-tuned Qwen3-ASR 1.7B for Chinese/English code-switching speech recognition on Apple MLX

Mega-ASR MLX

End-to-end speech recognition on Apple Silicon, powered by MLX.

Mega-ASR is a fine-tuned Qwen3-ASR 1.7B model with merged LoRA weights, optimized for Chinese/English code-switching speech. The model runs entirely on-device via Apple MLX.

Install

pip install mega-asr-mlx

Quick Start

# Download model weights from HuggingFace (~4.4 GB)
huggingface-cli download voiceink/mega-asr-mlx --local-dir ~/.cache/voiceink/mega-asr-mlx

# Transcribe audio
mega-asr --audio speech.wav --language English

Or use as a Python library:

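from pathlib import Path

from mega_asr_mlx import MegaASRMLX

# Expand "~" explicitly; Python does not expand it in plain string paths.
model_dir = Path("~/.cache/voiceink/mega-asr-mlx").expanduser()
model = MegaASRMLX(str(model_dir))
text = model.transcribe("speech.wav", language="English")
print(text)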
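
For more than one recording, the same `transcribe` call can be wrapped in a small loop. The sketch below is a hypothetical helper, not part of the package: `transcribe_folder` and its parameters are names introduced here for illustration. It takes the transcription function as an argument so that the model is loaded once and reused.

```python
from pathlib import Path
from typing import Callable, Dict


def transcribe_folder(
    transcribe: Callable[[str], str],
    folder: str,
    pattern: str = "*.wav",
) -> Dict[str, str]:
    """Run `transcribe` on every file in `folder` matching `pattern`.

    Returns a mapping from file name to transcription, in sorted file order.
    `transcribe_folder` is an illustrative helper, not a mega-asr-mlx API.
    """
    root = Path(folder).expanduser()
    results: Dict[str, str] = {}
    for audio_path in sorted(root.glob(pattern)):
        # The callable receives a plain string path, as in the Quick Start.
        results[audio_path.name] = transcribe(str(audio_path))
    return results
```

With the package installed, passing `lambda path: model.transcribe(path, language="English")` as the first argument, where `model` is the `MegaASRMLX` instance created above, should transcribe every WAV file in the folder.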

Model

| Component | Architecture | Size |
| --- | --- | --- |
| Audio Encoder | Conv2D stem + 24-layer Transformer (1024-dim) | 606 MB |
| Decoder | Qwen3 28-layer (2048-dim, GQA 16/8) | 3.8 GB |
| Router | 4-layer Transformer audio quality classifier | 2.2 MB |

  • Languages: Chinese, English (auto-detect)
  • Input: 16 kHz mono WAV
  • Output: Plain text transcription
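
Because the expected input is 16 kHz mono WAV, it can help to check a file's header before transcribing it. The following is a minimal sketch using only the standard-library `wave` module. The function name `wav_format_problems` is an assumption made for illustration, and whether the package resamples or downmixes other formats on its own is not stated here.

```python
import wave
from typing import List

EXPECTED_RATE_HZ = 16_000  # from "Input: 16 kHz mono WAV"
EXPECTED_CHANNELS = 1


def wav_format_problems(path: str) -> List[str]:
    """Return mismatches against 16 kHz mono; an empty list means the header looks fine.

    Illustrative helper, not part of mega-asr-mlx.
    """
    problems: List[str] = []
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected mono")
    return problems
```

The `wave` module reads PCM WAV files only, so a `wave.Error` on open is itself a hint that the file needs converting first.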

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.10+
  • Dependencies: mlx, mlx-lm, numpy, scipy, soundfile, safetensors, transformers
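
The platform requirements above can be checked from Python before installing. This is a rough sketch using only the standard library. The function names are hypothetical, and the check assumes a native Apple Silicon Python reports `arm64` from `platform.machine()` (an interpreter running under Rosetta reports `x86_64`).

```python
import platform
import sys
from typing import List, Tuple


def environment_problems(
    system: str,
    machine: str,
    python_version: Tuple[int, int],
) -> List[str]:
    """Compare the given environment against the listed requirements.

    Illustrative helper, not part of mega-asr-mlx.
    """
    problems: List[str] = []
    if system != "Darwin":
        problems.append(f"operating system is {system}, expected macOS (Darwin)")
    if machine != "arm64":
        problems.append(f"machine is {machine}, expected arm64 (Apple Silicon)")
    if python_version < (3, 10):
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python {found} found, expected 3.10 or newer")
    return problems


def current_environment_problems() -> List[str]:
    """Run the check against the interpreter executing this code."""
    return environment_problems(
        platform.system(), platform.machine(), sys.version_info[:2]
    )
```

An empty list from `current_environment_problems()` means the three listed conditions hold; it does not verify that the dependencies themselves are installed.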

License

MIT

Download files

Source Distribution

mega_asr_mlx-0.1.0.tar.gz (15.1 kB)

Built Distribution

mega_asr_mlx-0.1.0-py3-none-any.whl (17.1 kB)

File details

Details for the file mega_asr_mlx-0.1.0.tar.gz.

File metadata

  • Download URL: mega_asr_mlx-0.1.0.tar.gz
  • Upload date:
  • Size: 15.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.10

File hashes

Hashes for mega_asr_mlx-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 59a14e566d29d2e1450fe948c0e4c05d571ccf73d1cd17642e28736f2b58386d |
| MD5 | e3eb19ccc2a9a76e9745f61ac0d35504 |
| BLAKE2b-256 | f3a1e9680684c92ca583767b875a6b5a2f57fac4f6b1d0f6a6baaed18afd9346 |
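
A downloaded file can be compared against its listed SHA256 digest with the standard-library `hashlib` module. The sketch below is illustrative only; `sha256_of_file` and `matches_sha256` are names introduced here, not part of the package or of PyPI tooling.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_sha256` with the path to the downloaded archive and the SHA256 value listed for that file should return `True` for an intact download.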

File details

Details for the file mega_asr_mlx-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: mega_asr_mlx-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 17.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.10

File hashes

Hashes for mega_asr_mlx-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 8c0a29d7ae6c953dc3b5022104f1b2518c516dee106144222f0339e4d2c3b17e |
| MD5 | ede86d7fac259d5bc8d5fea6f9df6ae4 |
| BLAKE2b-256 | 59005ea05ca58401db99b399a57112972f974108109068d0979d83bd4b205cc1 |
