Voxtral Mini Realtime speech-to-text in MLX

voxmlx

Realtime speech-to-text with Voxtral Mini Realtime in MLX.

Install

pip install voxmlx

Usage

voxmlx

Transcribe an audio file, or stream from the microphone and transcribe in real time.

Stream from microphone:

voxmlx

Transcribe a file:

voxmlx --audio audio.flac

Options:

Flag      Description                                    Default
--audio   Path to audio file (omit to stream from mic)   None
--model   Model path or HuggingFace model ID             mlx-community/Voxtral-Mini-4B-Realtime-6bit
--temp    Sampling temperature (0 = greedy)              0.0
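
For scripted use, the flags listed above can be assembled into an argument list and handed to `subprocess`. This is a minimal sketch, not part of voxmlx: the helper names `build_voxmlx_command` and `run_voxmlx` are hypothetical, only the documented flags are used, and it assumes the CLI writes the transcript to standard output.

```python
import subprocess
from typing import List, Optional


def build_voxmlx_command(
    audio: Optional[str] = None,
    model: Optional[str] = None,
    temp: Optional[float] = None,
) -> List[str]:
    """Build an argv list for the voxmlx CLI (hypothetical helper).

    Any option left as None is omitted, so the CLI falls back to its default.
    """
    cmd = ["voxmlx"]
    if audio is not None:
        cmd += ["--audio", audio]
    if model is not None:
        cmd += ["--model", model]
    if temp is not None:
        cmd += ["--temp", str(temp)]
    return cmd


def run_voxmlx(**options) -> str:
    """Run voxmlx and return its stdout.

    Assumption: the transcript is printed to stdout. Raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        build_voxmlx_command(**options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_voxmlx(audio="audio.flac")` would then be equivalent to running `voxmlx --audio audio.flac` in a shell. Omitting `audio` starts microphone streaming, which does not end on its own, so the wrapper is mainly useful for files.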

voxmlx-convert

Convert Voxtral weights to voxmlx/MLX format with optional quantization.

Basic conversion:

voxmlx-convert --mlx-path voxtral-mlx

4-bit quantized conversion:

voxmlx-convert -q --mlx-path voxtral-mlx-4bit

Convert and upload to HuggingFace:

voxmlx-convert -q --mlx-path voxtral-mlx-4bit --upload-repo username/voxtral-mlx-4bit

Options:

Flag             Description                                   Default
--hf-path        HuggingFace model ID or local path            mistralai/Voxtral-Mini-4B-Realtime-2602
--mlx-path       Output directory                              mlx_model
-q, --quantize   Quantize the model                            Off
--group-size     Quantization group size                       64
--bits           Bits per weight                               4
--dtype          Cast weights (float16, bfloat16, float32)     None
--upload-repo    HuggingFace repo to upload converted model    None
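
To get a feel for how `--bits` and `--group-size` interact, the sketch below estimates the storage cost per quantized weight. It rests on an assumption that is not stated in this README: that quantization is group-wise and affine, with each group storing one 16-bit scale and one 16-bit bias next to the packed values. The function name is hypothetical, and the figures ignore any layers the converter leaves unquantized.

```python
def effective_bits_per_weight(
    bits: int = 4,
    group_size: int = 64,
    overhead_bits: int = 32,
) -> float:
    """Estimate storage per weight under group-wise quantization.

    Assumption: every group of `group_size` weights carries `overhead_bits`
    of metadata (a 16-bit scale plus a 16-bit bias), spread over the group.
    """
    return bits + overhead_bits / group_size


# Smaller groups track the weights more closely but cost more metadata.
for group_size in (32, 64, 128):
    print(group_size, effective_bits_per_weight(4, group_size))
```

Under these assumptions the defaults (`--bits 4`, `--group-size 64`) come to 4.5 bits per quantized weight, compared with 16 bits for float16 or bfloat16 weights.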

Python API

from voxmlx import transcribe

text = transcribe("audio.flac")
print(text)
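
To transcribe several files, the same call can be wrapped in a small loop. This is a sketch under the assumption that `transcribe` takes a file path and returns the transcript text, as in the snippet above; `transcribe_files` and `main` are hypothetical names, not part of voxmlx.

```python
from typing import Callable, Dict, Iterable


def transcribe_files(
    paths: Iterable[str],
    transcribe_fn: Callable[[str], str],
) -> Dict[str, str]:
    """Map each audio path to the transcript returned by transcribe_fn."""
    return {path: transcribe_fn(path) for path in paths}


def main() -> None:
    # Imported here so the helper above can be reused and tested
    # on machines where voxmlx is not installed.
    from voxmlx import transcribe

    for path, text in transcribe_files(["audio.flac"], transcribe).items():
        print(f"{path}: {text}")
```

Call `main()` to run it. Passing the transcription function as a parameter keeps the loop independent of voxmlx, so it can be exercised with a stub.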

Download files

Source Distribution

voxmlx-0.0.1.tar.gz (19.8 kB view details)

Uploaded Source

Built Distribution

voxmlx-0.0.1-py3-none-any.whl (21.3 kB view details)

Uploaded Python 3

File details

Details for the file voxmlx-0.0.1.tar.gz.

File metadata

  • Download URL: voxmlx-0.0.1.tar.gz
  • Upload date:
  • Size: 19.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for voxmlx-0.0.1.tar.gz
Algorithm     Hash digest
SHA256        c7caa858d182c66390d0092dce8383bcfb493efb07322e36b1e47c72757c6bbb
MD5           560aafd7e35b6cb749625047fc4f475d
BLAKE2b-256   cd5bd31d974ead31e893c150455e18c6a6115ddb7451b28204a181e6997e523a


File details

Details for the file voxmlx-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: voxmlx-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 21.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for voxmlx-0.0.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        1b3e115418f673df3d2239ca36a0a6a035bb762ac07bc15431983131086f79a4
MD5           f35efe871fd0b5e9b8ee9e110eae4d4d
BLAKE2b-256   16325ce910ce98d620a5cb0947f60c3153429d2db5efdbe5b806d24ed12b89b5
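
A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch; the function names are hypothetical and the file path in the usage note is assumed to be wherever the download was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("voxmlx-0.0.1.tar.gz", "c7caa858d182c66390d0092dce8383bcfb493efb07322e36b1e47c72757c6bbb")` should return `True` for an intact copy of the source distribution.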
