Local-first alt-text generator built on top of MLX-VLM

MLX Alt Text

A Python package for generating alt-text for images using local MLX models.

About

MLX Alt Text is a local-first alt-text generator built on top of MLX-VLM. It generates descriptive alt-text for images with vision-language models that run entirely on your device through Apple's MLX framework.

Features

  • Generate detailed accessibility descriptions for images
  • Run entirely on-device (no API calls except to download the model)
  • Customizable prompts, output lengths, and temperatures
  • Command-line interface for easy integration into workflows
  • Python API for integration into your applications

Requirements

  • macOS with Apple Silicon
  • Python 3.10–3.12
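A quick way to check a machine against these requirements before installing is sketched below. This is an illustrative helper, not part of the package; the function name `meets_requirements` is hypothetical, and it assumes Apple Silicon is reported as `Darwin` / `arm64` by the standard `platform` module (a Python interpreter running under Rosetta may report `x86_64` instead).

```python
import platform
import sys


def meets_requirements(system, machine, version):
    """Return True if the values match the requirements listed above.

    system  -- e.g. platform.system()
    machine -- e.g. platform.machine()
    version -- a (major, minor, ...) tuple such as sys.version_info
    """
    # macOS reports "Darwin"; Apple Silicon reports "arm64".
    on_apple_silicon = system == "Darwin" and machine == "arm64"
    # Python 3.10 through 3.12, inclusive.
    supported_python = (3, 10) <= tuple(version[:2]) <= (3, 12)
    return on_apple_silicon and supported_python


if __name__ == "__main__":
    ok = meets_requirements(platform.system(), platform.machine(), sys.version_info)
    print("supported" if ok else "unsupported")
```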

Installation

uv tool install mlx-alt-text # CLI installation with uv (recommended)
pipx install mlx-alt-text    # CLI installation with pipx
pip install mlx-alt-text     # Python package installation

Usage

Command-line Interface

Generate alt-text for an image:

mlx-alt-text path/to/image.jpg

With custom options:

mlx-alt-text path/to/image.jpg \
  --prompt "Describe this image in detail for accessibility purposes" \
  --model "mlx-community/SmolVLM-256M-Instruct-bf16" \
  --max-tokens 150 \
  --temperature 0.3
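To call the CLI from a script, the command can be assembled and run with `subprocess`. The sketch below uses only the flags shown above; the helper names `build_command` and `run_alt_text` are hypothetical, and it assumes the generated alt-text is printed to standard output, which this README does not state explicitly.

```python
import subprocess


def build_command(image, prompt=None, model=None, max_tokens=None, temperature=None):
    """Build the argument list for one mlx-alt-text invocation."""
    cmd = ["mlx-alt-text", str(image)]
    # Only pass the flags that were actually requested, so the CLI
    # falls back to its own defaults for everything else.
    if prompt is not None:
        cmd += ["--prompt", prompt]
    if model is not None:
        cmd += ["--model", model]
    if max_tokens is not None:
        cmd += ["--max-tokens", str(max_tokens)]
    if temperature is not None:
        cmd += ["--temperature", str(temperature)]
    return cmd


def run_alt_text(image, **options):
    """Run the CLI and return whatever it wrote to stdout, stripped.

    Assumption: the alt-text is written to stdout. Raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        build_command(image, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.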

Python API

from mlx_alt_text import AltTextGenerator

# Initialize with default options
generator = AltTextGenerator()

# Or with custom options
generator = AltTextGenerator(
    model_name="mlx-community/Qwen2-VL-2B-Instruct-4bit",
    max_tokens=100,
    temperature=0.2
)

# Generate alt-text
alt_text = generator.generate(
    image="path/to/image.jpg",
    prompt="Describe this image for accessibility purposes"
)

print(alt_text)
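For more than one image, the same `generate(image=..., prompt=...)` call can be looped over a folder. The helper below is a sketch rather than part of the package: `describe_folder` and `IMAGE_SUFFIXES` are hypothetical names, the suffix list is illustrative, and the function accepts any object with a compatible `generate` method.

```python
from pathlib import Path

# Illustrative set of extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def describe_folder(generator, folder, prompt):
    """Return {filename: alt_text} for each image directly inside folder.

    generator -- any object exposing generate(image=..., prompt=...),
                 such as the AltTextGenerator instance shown above.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        # Skip subdirectories and files that are not images.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = generator.generate(image=str(path), prompt=prompt)
    return results
```

With the package installed, this could be called as `describe_folder(AltTextGenerator(), "photos", "Describe this image for accessibility purposes")`. Reusing one generator for the whole folder is a deliberate choice in this sketch, on the assumption that it avoids repeating model setup for every image.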

Available Models

By default, MLX Alt Text uses mlx-community/Qwen2-VL-2B-Instruct-4bit, but you can specify other compatible models, for example:

  • mlx-community/Qwen2-VL-2B-Instruct-4bit (default, ~2GB)
  • mlx-community/SmolVLM-256M-Instruct-bf16 (smaller, 256M parameters)
  • mlx-community/SmolVLM-Instruct-bf16 (larger, better quality)

The first time you use a model, it will be automatically downloaded from the Hugging Face Hub.
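To see whether a model has already been fetched (for example, before going offline), you can look for it in the Hugging Face Hub cache. The sketch below assumes the package uses the default Hub cache location and layout (`models--<org>--<name>` under `HF_HUB_CACHE`, or `HF_HOME/hub`, or `~/.cache/huggingface/hub`), which this README does not confirm. It also ignores `XDG_CACHE_HOME` for brevity, and all function names are hypothetical.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None):
    """Resolve the Hugging Face Hub cache directory from environment settings."""
    env = os.environ if env is None else env
    # HF_HUB_CACHE, when set, points directly at the cache directory.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"])
    # Otherwise the cache lives in the "hub" folder under HF_HOME.
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    return Path(hf_home) / "hub"


def model_cache_path(model_name, env=None):
    """Return the directory where the Hub cache would keep this model."""
    return hub_cache_dir(env) / ("models--" + model_name.replace("/", "--"))


def is_downloaded(model_name, env=None):
    """True if a cache directory for the model exists.

    Note: an existing directory does not prove the download completed.
    """
    return model_cache_path(model_name, env).is_dir()
```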

Development Setup

  1. Clone the repository:
git clone https://github.com/yourusername/mlx-alt-text.git
cd mlx-alt-text
  2. Set up the development environment with uv:
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create a virtual environment and install dependencies
uv sync
source .venv/bin/activate
uv run mlx-alt-text
  3. Run tests:
pytest

Other Development Notes

uv lock --upgrade # to upgrade all packages
uv lock --upgrade <package> # to upgrade a single package
uv build # to build the package
uv run https://gist.githubusercontent.com/Jython1415/84f37a01fb9700d3eb72b67a52273222/raw/3d7ec10e3c6bb5f0191bd6681dd0016017a28a55/uv-publish-pypi.py # to publish the package

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

mlx_alt_text-0.1.1.tar.gz (251.1 kB)

Uploaded Source

Built Distribution

mlx_alt_text-0.1.1-py3-none-any.whl (5.4 kB)

Uploaded Python 3

File details

Details for the file mlx_alt_text-0.1.1.tar.gz.

File metadata

  • Download URL: mlx_alt_text-0.1.1.tar.gz
  • Upload date:
  • Size: 251.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.10

File hashes

Hashes for mlx_alt_text-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        a86cb2fc376bd329645c45f25c7b15639f7b0434df38f24210f07d73436df5f2
MD5           e3fb107ece41289f5eeca849ae1be986
BLAKE2b-256   e94ec5b4d820b93006838f2f248555bf973438973fd300c16f04494f5bf9805c
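A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch; the helper names `sha256_of` and `matches` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 equals expected_hex (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("mlx_alt_text-0.1.1.tar.gz", "a86cb2fc376bd329645c45f25c7b15639f7b0434df38f24210f07d73436df5f2")` should return True for an intact copy of the source distribution.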

File details

Details for the file mlx_alt_text-0.1.1-py3-none-any.whl.

File hashes

Hashes for mlx_alt_text-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        b754b09a0bcf92d9c576fa13ee88a2650bb2b0e1b55384af4eac18f0d113142d
MD5           8327c444bb9626734702303fd66ff104
BLAKE2b-256   8471a2182b3fd40731d47e1ae46f2a2951da6a707a6a4d5c9cbdf87348787f07
