Project description

Inference for Speech Models in MLX

This repo implements several speech models in MLX for better performance on Apple silicon Macs.

Currently, it supports the following models:

  • Qwen2.5-Omni (both the original and MLX-quantized versions); currently only the speech component is supported
  • Ultravox-0.5

Performance

Tested on a MacBook Pro (M4 Pro, 48 GB RAM):

| Model | Prompt TPS | Generation TPS |
| --- | --- | --- |
| Qwen/Qwen2.5-Omni-7B | 259.5 | 17.8 |
| Qwen/Qwen2.5-Omni-3B | 468.4 | 38.8 |
| giangndm/qwen2.5-omni-7b-mlx-8bit | 253.4 | 31.7 |
| giangndm/qwen2.5-omni-7b-mlx-4bit | 259.2 | 57.6 |
| giangndm/qwen2.5-omni-3b-mlx-8bit | 456.2 | 67.0 |
| fixie-ai/ultravox-v0_5-llama-3_1-8b with mlx-community/Llama-3.1-8B-Instruct-4bit | 188.5 | 40.4 |
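As a rough takeaway from the table above, 4-bit quantization roughly triples generation throughput on the 7B model while prompt processing stays about flat; a quick check of the ratios:

```python
# Generation TPS figures from the table above (MacBook Pro, M4 Pro, 48 GB RAM).
fp16_7b_gen = 17.8  # Qwen/Qwen2.5-Omni-7B
q8_7b_gen = 31.7    # giangndm/qwen2.5-omni-7b-mlx-8bit
q4_7b_gen = 57.6    # giangndm/qwen2.5-omni-7b-mlx-4bit

print(f"8-bit speedup: {q8_7b_gen / fp16_7b_gen:.2f}x")  # 1.78x
print(f"4-bit speedup: {q4_7b_gen / fp16_7b_gen:.2f}x")  # 3.24x
```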

How to use

```bash
uv add mlx-lm-omni
# or install from source
uv add git+https://github.com/giangndm/mlx-lm-omni.git
```
```python
from mlx_lm_omni import load, generate
import librosa
from io import BytesIO
from urllib.request import urlopen

# model, tokenizer = load("giangndm/qwen2.5-omni-7b-mlx-4bit")
model, tokenizer = load(
    "fixie-ai/ultravox-v0_5-llama-3_1-8b",
    model_config={"text_model_id": "mlx-community/Llama-3.1-8B-Instruct-4bit"},
)

# Fetch a sample clip and resample it to 16 kHz mono.
audio_path = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-Audio/audio/1272-128104-0000.flac"
audio = librosa.load(BytesIO(urlopen(audio_path).read()), sr=16000)[0]

messages = [
    {"role": "system", "content": "You are a speech recognition model."},
    {"role": "user", "content": "Transcribe the English audio into text without any punctuation marks.", "audio": audio},
]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
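The `audio` value in the message is assumed to be a mono float waveform sampled at 16 kHz, which is what `librosa.load(..., sr=16000)[0]` returns; a minimal synthetic stand-in (one second of a 440 Hz tone) shows the expected shape and dtype without downloading anything:

```python
import numpy as np

SR = 16000  # sample rate the example resamples to

# One second of a quiet 440 Hz sine tone as a mono float32 array,
# matching the shape/dtype of librosa.load(..., sr=16000)[0].
t = np.arange(SR, dtype=np.float32) / SR
audio = 0.1 * np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

print(audio.shape, audio.dtype)  # (16000,) float32
```

Such an array can be placed directly in the `"audio"` field of a user message in place of the downloaded clip.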

Project details


Download files

Download the file for your platform.

Source Distribution

mlx_lm_omni-0.1.3.tar.gz (821.2 kB)

Uploaded Source

Built Distribution

mlx_lm_omni-0.1.3-py3-none-any.whl (17.3 kB)

Uploaded Python 3

File details

Details for the file mlx_lm_omni-0.1.3.tar.gz.

File metadata

  • Download URL: mlx_lm_omni-0.1.3.tar.gz
  • Upload date:
  • Size: 821.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.4

File hashes

Hashes for mlx_lm_omni-0.1.3.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 972a0dfac50e0182855c5aee28cafe55fae5160343b91089451642f95a49fe13 |
| MD5 | b55f6270d12e940489f223ecc636a757 |
| BLAKE2b-256 | 610e7369115e07220514d813a02189a8099d1b9da43fc6109243c11527a8cba3 |


File details

Details for the file mlx_lm_omni-0.1.3-py3-none-any.whl.

File hashes

Hashes for mlx_lm_omni-0.1.3-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | c99f278830f3a144ff77cb50aa2d7f43e36a8391a8d228ecd68c5de17bf9cba0 |
| MD5 | 66afa059e86630abdeb808d8e7b26bb8 |
| BLAKE2b-256 | 22940b4ca9e9c2728ccce015a61eb4d29d23e80c7aaab5e0646a95ff11c2c74a |

