OpenAI Whisper on Apple silicon with MLX and the Hugging Face Hub

Project description

Whisper

Speech recognition with Whisper in MLX. Whisper is a set of open source speech recognition models from OpenAI, ranging from 39 million to 1.5 billion parameters.[^1]

Setup

Install ffmpeg:

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

Install the mlx-whisper package with:

pip install mlx-whisper

Run

CLI

At its simplest:

mlx_whisper audio_file.mp3

This writes the transcription to a text file named audio_file.txt.

Use -f to specify the output format and --model to specify the model. There are many other supported command line options. To see them all, run mlx_whisper -h.
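For example, to write SubRip subtitles with a larger model (the repo name below is one of the pre-converted checkpoints in the MLX Community organization, given here as an illustration):

```shell
# Transcribe to SubRip subtitles (-f srt) using a larger model from the Hub.
# "mlx-community/whisper-large-v3-mlx" is an example pre-converted checkpoint.
mlx_whisper audio_file.mp3 -f srt --model mlx-community/whisper-large-v3-mlx
```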

You can also pipe the audio content of other programs via stdin:

some-process | mlx_whisper -

The default output file name will be content.*. You can specify the name with the --output-name flag.
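For instance, ffmpeg can extract the audio track from a video and stream it straight into the transcriber (the input file name here is hypothetical):

```shell
# Decode the audio from a video file and pipe it to mlx_whisper on stdin;
# --output-name sets the base name of the resulting transcript file.
ffmpeg -i meeting.mp4 -f mp3 - | mlx_whisper - --output-name meeting
```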

API

Transcribe audio with:

import mlx_whisper

text = mlx_whisper.transcribe(speech_file)["text"]

The default model is "mlx-community/whisper-tiny". Choose the model by setting path_or_hf_repo. For example:

result = mlx_whisper.transcribe(speech_file, path_or_hf_repo="models/large")

This will load the model contained in models/large. The path_or_hf_repo argument can also point to an MLX-style Whisper model on the Hugging Face Hub, in which case the model is downloaded automatically. A collection of pre-converted Whisper models is available in the Hugging Face MLX Community.

The transcribe function also supports word-level timestamps. You can generate these with:

output = mlx_whisper.transcribe(speech_file, word_timestamps=True)
print(output["segments"][0]["words"])
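Each entry in the words list is a dictionary with the word text and its start and end times in seconds. As a sketch of how this output might be post-processed (assuming that shape; the sample data below is illustrative, not real model output), the following formats word timings as SRT-style timestamps:

```python
def format_timestamp(seconds):
    """Render a time in seconds as an SRT-style HH:MM:SS,mmm timestamp."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

# Illustrative structure only -- real values come from
# mlx_whisper.transcribe(speech_file, word_timestamps=True)["segments"].
words = [
    {"word": " Hello", "start": 0.0, "end": 0.42},
    {"word": " world", "start": 0.42, "end": 0.9},
]
for w in words:
    print(f"{format_timestamp(w['start'])} --> {format_timestamp(w['end'])}{w['word']}")
```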

To see more transcription options use:

>>> help(mlx_whisper.transcribe)

Converting models

[!TIP] Skip the conversion step by using pre-converted checkpoints from the Hugging Face Hub. There are a few available in the MLX Community organization.

To convert a model, first clone the MLX Examples repo:

git clone https://github.com/ml-explore/mlx-examples.git

Then run convert.py from mlx-examples/whisper. For example, to convert the tiny model use:

python convert.py --torch-name-or-path tiny --mlx-path mlx_models/tiny

Note that you can also convert a local PyTorch checkpoint that is in the original OpenAI format.
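For instance, assuming a local checkpoint at a hypothetical path:

```shell
# Convert a local OpenAI-format checkpoint; the .pt path is hypothetical.
python convert.py --torch-name-or-path /path/to/tiny.pt --mlx-path mlx_models/tiny
```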

To generate a 4-bit quantized model, use -q. For a full list of options:

python convert.py --help

By default, the conversion script will make the directory mlx_models and save the converted weights.npz and config.json there.

Each time it is run, convert.py will overwrite any model in the provided path. To save different models, make sure to set --mlx-path to a unique directory for each converted model. For example:

model="tiny"
python convert.py --torch-name-or-path ${model} --mlx-path mlx_models/${model}_fp16
python convert.py --torch-name-or-path ${model} --dtype float32 --mlx-path mlx_models/${model}_fp32
python convert.py --torch-name-or-path ${model} -q --q_bits 4 --mlx-path mlx_models/${model}_quantized_4bits

[^1]: Refer to the arXiv paper, blog post, and code for more details.

Download files

Download the file for your platform.

Source Distribution

mlx_whisper-0.4.2.tar.gz (778.6 kB)

Uploaded Source

Built Distribution

mlx_whisper-0.4.2-py3-none-any.whl (782.7 kB)

Uploaded Python 3

File details

Details for the file mlx_whisper-0.4.2.tar.gz.

File metadata

  • Download URL: mlx_whisper-0.4.2.tar.gz
  • Upload date:
  • Size: 778.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for mlx_whisper-0.4.2.tar.gz
  • SHA256: 5487d967245291fd45d5f11c98a69da1130b9304d0767663412d56fffd71e088
  • MD5: 47377f3124d31b680b7e69cfab50027e
  • BLAKE2b-256: 4bbeda654e46741fb07aafbfe44b8e8227f890a6874f17dd068db310d75d491b


File details

Details for the file mlx_whisper-0.4.2-py3-none-any.whl.

File metadata

  • Download URL: mlx_whisper-0.4.2-py3-none-any.whl
  • Upload date:
  • Size: 782.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for mlx_whisper-0.4.2-py3-none-any.whl
  • SHA256: 19a343deaa85f461be4fdc9c14e6e8b61d65260374872e07f9f6101ad397a053
  • MD5: e39d20c17c979578c9dd4bd1ba2eb41b
  • BLAKE2b-256: 3fb59887e68f5488314a57f7baf2c706b06a97d9020433d1ce774bb6cefc1c22

