voxmlx

Realtime speech-to-text with Voxtral Mini Realtime in MLX.

Install

pip install voxmlx

Usage

voxmlx

Transcribe an audio file, or stream from the microphone and transcribe in real time.

Stream from microphone:

voxmlx

Transcribe a file:

voxmlx --audio audio.flac

Options:

| Flag | Description | Default |
| --- | --- | --- |
| --audio | Path to audio file (omit to stream from mic) | None |
| --model | Model path or HuggingFace model ID | mlx-community/Voxtral-Mini-4B-Realtime-6bit |
| --temp | Sampling temperature (0 = greedy) | 0.0 |
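
If you want to drive the voxmlx command from a script, the options above map onto an argument list fairly directly. The following is a minimal sketch: `build_voxmlx_command` and `transcribe_file_via_cli` are hypothetical helper names, not part of voxmlx, and the sketch assumes the CLI writes the transcript to stdout, which this page does not confirm.

```python
import subprocess
from typing import List, Optional


def build_voxmlx_command(
    audio: Optional[str] = None,
    model: Optional[str] = None,
    temp: Optional[float] = None,
) -> List[str]:
    """Build the argument list for the voxmlx CLI.

    Only the flags from the options table are used. Anything left as
    None is omitted so the CLI falls back to its own default.
    """
    cmd = ["voxmlx"]
    if audio is not None:
        cmd += ["--audio", audio]
    if model is not None:
        cmd += ["--model", model]
    if temp is not None:
        cmd += ["--temp", str(temp)]
    return cmd


def transcribe_file_via_cli(audio: str, **options) -> str:
    """Run voxmlx on a file and return what it writes to stdout.

    Assumption: the transcript is printed to stdout.
    """
    result = subprocess.run(
        build_voxmlx_command(audio=audio, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Calling `build_voxmlx_command()` with no arguments yields the bare `voxmlx` command, which corresponds to streaming from the microphone.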

voxmlx-convert

Convert Voxtral weights to voxmlx/MLX format with optional quantization.

Basic conversion:

voxmlx-convert --mlx-path voxtral-mlx

4-bit quantized conversion:

voxmlx-convert -q --mlx-path voxtral-mlx-4bit

Convert and upload to HuggingFace:

voxmlx-convert -q --mlx-path voxtral-mlx-4bit --upload-repo username/voxtral-mlx-4bit
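
The three invocations above differ only in which flags are present, so a conversion script can assemble them from keyword arguments. This is a sketch under assumptions: `build_convert_command` is a hypothetical helper, it uses only the flags documented for voxmlx-convert, and the dtype check simply mirrors the three values listed for --dtype.

```python
from typing import List, Optional

# Values listed for --dtype in the voxmlx-convert options.
ALLOWED_DTYPES = ("float16", "bfloat16", "float32")


def build_convert_command(
    mlx_path: str = "mlx_model",
    hf_path: Optional[str] = None,
    quantize: bool = False,
    group_size: Optional[int] = None,
    bits: Optional[int] = None,
    dtype: Optional[str] = None,
    upload_repo: Optional[str] = None,
) -> List[str]:
    """Build the argument list for voxmlx-convert.

    Options left as None are omitted so the CLI uses its own defaults.
    """
    if dtype is not None and dtype not in ALLOWED_DTYPES:
        raise ValueError(f"dtype must be one of {ALLOWED_DTYPES}, got {dtype!r}")

    cmd = ["voxmlx-convert", "--mlx-path", mlx_path]
    if hf_path is not None:
        cmd += ["--hf-path", hf_path]
    if quantize:
        cmd.append("-q")
    if group_size is not None:
        cmd += ["--group-size", str(group_size)]
    if bits is not None:
        cmd += ["--bits", str(bits)]
    if dtype is not None:
        cmd += ["--dtype", dtype]
    if upload_repo is not None:
        cmd += ["--upload-repo", upload_repo]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to perform the conversion.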

Options:

| Flag | Description | Default |
| --- | --- | --- |
| --hf-path | HuggingFace model ID or local path | mistralai/Voxtral-Mini-4B-Realtime-2602 |
| --mlx-path | Output directory | mlx_model |
| -q, --quantize | Quantize the model | Off |
| --group-size | Quantization group size | 64 |
| --bits | Bits per weight | 4 |
| --dtype | Cast weights (float16, bfloat16, float32) | None |
| --upload-repo | HuggingFace repo to upload converted model | None |
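
To get a feel for how --bits and --group-size interact, here is a rough back-of-the-envelope estimate. It rests on assumptions this page does not state: that each group of weights carries one scale and one bias stored as 16-bit floats (the usual layout for MLX affine quantization), and that every weight is quantized. Under those assumptions the defaults cost 4 + 32/64 = 4.5 bits per weight. The function names are hypothetical, and the parameter count in the usage line is just the nominal "4B" from the model name.

```python
def effective_bits_per_weight(
    bits: int = 4,
    group_size: int = 64,
    overhead_bits_per_group: int = 32,
) -> float:
    """Average storage cost of one quantized weight, in bits.

    Assumption: every group of `group_size` weights stores one 16-bit
    scale and one 16-bit bias, i.e. 32 extra bits per group.
    """
    return bits + overhead_bits_per_group / group_size


def estimate_size_gb(num_params: float, bits: int = 4, group_size: int = 64) -> float:
    """Approximate size in GB (10**9 bytes) if all weights were quantized."""
    total_bits = num_params * effective_bits_per_weight(bits, group_size)
    return total_bits / 8 / 1e9


# Nominal parameter count taken from "4B" in the model name (an assumption).
approx_gb = estimate_size_gb(4e9, bits=4, group_size=64)
```

Real checkpoints will differ from this figure, since some tensors may be left unquantized and the true parameter count is not exactly 4 billion.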

Python API

from voxmlx import transcribe

text = transcribe("audio.flac")
print(text)
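
The same call can be looped over a directory of recordings. The sketch below assumes only what the example above shows, namely that `transcribe` accepts a file path and returns the text. `transcribe_folder` is a hypothetical helper, and the `*.flac` pattern is chosen because FLAC is the only format shown on this page.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def transcribe_folder(
    folder: str,
    transcribe_fn: Optional[Callable[[str], str]] = None,
    pattern: str = "*.flac",
) -> Dict[str, str]:
    """Transcribe every matching file in `folder`.

    Returns a mapping from file name to transcript. `transcribe_fn`
    defaults to voxmlx.transcribe; passing another callable makes the
    helper easy to test without loading a model.
    """
    if transcribe_fn is None:
        # Imported lazily so the helper can be defined without voxmlx installed.
        from voxmlx import transcribe
        transcribe_fn = transcribe

    results: Dict[str, str] = {}
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = transcribe_fn(str(path))
    return results
```

For example, `transcribe_folder("recordings")` would return one transcript per FLAC file in a `recordings` directory.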
