
Inference for Speech Models in MLX

This repo implements several speech models in MLX for better performance on Apple Silicon Macs.

Currently, it supports the following models:

  • Qwen2.5-Omni (both the original and mlx-quant versions); only the speech model is currently supported
  • Ultravox-0.5

Performance

Tested on a MacBook Pro (M4 Pro, 48 GB RAM):

| Model | Prompt TPS | Generation TPS |
| --- | --- | --- |
| Qwen/Qwen2.5-Omni-7B | 259.5 | 17.8 |
| Qwen/Qwen2.5-Omni-3B | 468.4 | 38.8 |
| giangndm/qwen2.5-omni-7b-mlx-8bit | 253.4 | 31.7 |
| giangndm/qwen2.5-omni-7b-mlx-4bit | 259.2 | 57.6 |
| giangndm/qwen2.5-omni-3b-mlx-8bit | 456.2 | 67.0 |
| fixie-ai/ultravox-v0_5-llama-3_1-8b + mlx-community/Llama-3.1-8B-Instruct-4bit | 188.5 | 40.4 |
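As a rough back-of-the-envelope illustration (not part of the library), these figures translate into wall-clock time as prompt tokens divided by prompt TPS plus generated tokens divided by generation TPS:

```python
def estimated_seconds(prompt_tokens: int, gen_tokens: int,
                      prompt_tps: float, gen_tps: float) -> float:
    """Rough wall-clock estimate: prompt prefill time plus generation time."""
    return prompt_tokens / prompt_tps + gen_tokens / gen_tps

# e.g. a 500-token prompt with 100 generated tokens on the 4-bit 7B model:
# 500 / 259.2 + 100 / 57.6 ≈ 3.7 seconds
t = estimated_seconds(500, 100, prompt_tps=259.2, gen_tps=57.6)
```

The token counts here are arbitrary examples; actual latency also depends on audio-encoding overhead, which these TPS numbers do not capture.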

How to use

```bash
uv add mlx-lm-omni
# or install directly from the repo
uv add https://github.com/giangndm/mlx-lm-omni.git
```
```python
from io import BytesIO
from urllib.request import urlopen

import librosa

from mlx_lm_omni import load, generate

# model, tokenizer = load("giangndm/qwen2.5-omni-7b-mlx-4bit")
model, tokenizer = load(
    "fixie-ai/ultravox-v0_5-llama-3_1-8b",
    model_config={"text_model_id": "mlx-community/Llama-3.1-8B-Instruct-4bit"},
)

# Fetch a sample clip and resample it to the 16 kHz the models expect
audio_path = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-Audio/audio/1272-128104-0000.flac"
audio = librosa.load(BytesIO(urlopen(audio_path).read()), sr=16000)[0]

messages = [
    {"role": "system", "content": "You are a speech recognition model."},
    {"role": "user", "content": "Transcribe the English audio into text without any punctuation marks.", "audio": audio},
]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
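For recordings longer than a short clip, it can help to split the waveform into fixed-length segments and transcribe them in turn. This is a hypothetical helper, not part of mlx-lm-omni; the 30-second default and the chunk-per-call pattern are assumptions:

```python
import numpy as np

SAMPLE_RATE = 16000  # the example above resamples audio to 16 kHz


def chunk_audio(audio: np.ndarray, chunk_seconds: int = 30,
                sample_rate: int = SAMPLE_RATE) -> list[np.ndarray]:
    """Split a mono waveform into fixed-length chunks (the last may be shorter)."""
    step = chunk_seconds * sample_rate
    return [audio[i:i + step] for i in range(0, len(audio), step)]
```

Each chunk can then be passed as the `audio` field of its own user message, and the per-chunk transcripts concatenated.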

Download files

Download the file for your platform.

Source Distribution

mlx_lm_omni-0.1.1.tar.gz (68.5 kB)


Built Distribution


mlx_lm_omni-0.1.1-py3-none-any.whl (17.2 kB)


File details

Details for the file mlx_lm_omni-0.1.1.tar.gz.

File metadata

  • Download URL: mlx_lm_omni-0.1.1.tar.gz
  • Size: 68.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.4

File hashes

Hashes for mlx_lm_omni-0.1.1.tar.gz:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 203758ce4d49e017f5442e7c4279750e2da0e3c8b96be50fe92ca45f0e29c3a6 |
| MD5 | ed0e4829091e53ffe00acea92060bd30 |
| BLAKE2b-256 | b940a10bee85b42c146cd2f549cda328cbee82088c8c01e451afa4fc7d87939e |


File details

Details for the file mlx_lm_omni-0.1.1-py3-none-any.whl.

File hashes

Hashes for mlx_lm_omni-0.1.1-py3-none-any.whl:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 50448062cbe5c75a6eb72a53aed65ce06bef246ed7ad05e21c71de820ea10947 |
| MD5 | 1191f22045ad401f47bbef745b0b9b62 |
| BLAKE2b-256 | 61865f77abc3c8440aec8f80aa5457bdd7efd421948280dd9a99d5c888a6f457 |

