Local-first alt-text generator built on top of MLX-VLM
MLX Alt Text
A Python package for generating alt-text for images using local MLX models.
About
MLX Alt Text is a local-first alt-text generator built on top of MLX-VLM. It allows you to generate descriptive alt-text for images using vision-language models that run entirely on your device using Apple's MLX framework.
Features
- Generate detailed accessibility descriptions for images
- Run entirely on-device (no API calls except to download the model)
- Customizable prompts, output lengths, and temperatures
- Command-line interface for easy integration into workflows
- Python API for integration into your applications
Requirements
- macOS with Apple Silicon
- Python 3.10–3.12
Installation
uv tool install mlx-alt-text # CLI installation with uv (recommended)
pipx install mlx-alt-text # CLI installation with pipx
pip install mlx-alt-text # Python package installation
Usage
Command-line Interface
Generate alt-text for an image:
mlx-alt-text path/to/image.jpg
With custom options:
mlx-alt-text path/to/image.jpg \
  --prompt "Describe this image in detail for accessibility purposes" \
  --model "mlx-community/SmolVLM-256M-Instruct-bf16" \
  --max-tokens 150 \
  --temperature 0.3
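For scripted workflows, the same command can be assembled and run from Python. The following is a minimal sketch, assuming only the flags shown above (`--prompt`, `--model`, `--max-tokens`, `--temperature`) and assuming that the CLI prints the alt-text to standard output. `build_command` and `run_alt_text` are hypothetical helper names, not part of the package.

```python
import subprocess
from typing import Optional


def build_command(
    image_path: str,
    prompt: Optional[str] = None,
    model: Optional[str] = None,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
) -> list[str]:
    """Assemble an argv list for the mlx-alt-text CLI (hypothetical helper)."""
    cmd = ["mlx-alt-text", image_path]
    # Only pass flags the caller set, so the CLI defaults apply otherwise.
    if prompt is not None:
        cmd += ["--prompt", prompt]
    if model is not None:
        cmd += ["--model", model]
    if max_tokens is not None:
        cmd += ["--max-tokens", str(max_tokens)]
    if temperature is not None:
        cmd += ["--temperature", str(temperature)]
    return cmd


def run_alt_text(image_path: str, **options) -> str:
    """Run the CLI and return its standard output (assumed to be the alt-text)."""
    result = subprocess.run(
        build_command(image_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a prompt contains spaces or quotes.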
Python API
from mlx_alt_text import AltTextGenerator
# Initialize with default options
generator = AltTextGenerator()
# Or with custom options
generator = AltTextGenerator(
    model_name="mlx-community/Qwen2-VL-2B-Instruct-4bit",
    max_tokens=100,
    temperature=0.2,
)
# Generate alt-text
alt_text = generator.generate(
    image="path/to/image.jpg",
    prompt="Describe this image for accessibility purposes",
)
print(alt_text)
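To describe several images in one run, the `generate` call shown above can be wrapped in a loop. This is a sketch rather than part of the package: `describe_folder` and `IMAGE_SUFFIXES` are hypothetical names, and the list of file extensions is an assumption to adjust for your own images. The function accepts any object with a `generate(image=..., prompt=...)` method, so it does not import `mlx_alt_text` itself.

```python
from pathlib import Path

# Assumption: the file extensions to treat as images; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def describe_folder(generator, folder, prompt):
    """Return a {file name: alt-text} dict for each image found in `folder`.

    `generator` is any object with a generate(image=..., prompt=...) method,
    such as an AltTextGenerator instance.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        # Skip anything that is not one of the listed image types.
        if path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = generator.generate(image=str(path), prompt=prompt)
    return results
```

With the package installed, a call might look like `describe_folder(AltTextGenerator(), "photos", "Describe this image for accessibility purposes")`, which reuses a single generator instance for every image in the folder.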
Available Models
By default, MLX Alt Text uses mlx-community/Qwen2-VL-2B-Instruct-4bit, but you can specify other compatible models. Some examples below:
- mlx-community/Qwen2-VL-2B-Instruct-4bit (default, ~2GB)
- mlx-community/SmolVLM-256M-Instruct-bf16 (smaller, ~256MB)
- mlx-community/SmolVLM-Instruct-bf16 (larger, better quality)
The first time you use a model, it will be automatically downloaded from the Hugging Face Hub.
Development Setup
- Clone the repository:
git clone https://github.com/yourusername/mlx-alt-text.git
cd mlx-alt-text
- Set up the development environment with uv:
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create a virtual environment and install dependencies
uv sync
source .venv/bin/activate
uv run mlx-alt-text
- Run tests:
pytest
Other Development Notes
uv lock --upgrade # to upgrade all packages
uv lock --upgrade <package> # to upgrade a single package
uv build # to build the package
uv run https://gist.githubusercontent.com/Jython1415/84f37a01fb9700d3eb72b67a52273222/raw/3d7ec10e3c6bb5f0191bd6681dd0016017a28a55/uv-publish-pypi.py # to publish the package
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file mlx_alt_text-0.1.1.tar.gz.
File metadata
- Download URL: mlx_alt_text-0.1.1.tar.gz
- Upload date:
- Size: 251.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a86cb2fc376bd329645c45f25c7b15639f7b0434df38f24210f07d73436df5f2 |
| MD5 | e3fb107ece41289f5eeca849ae1be986 |
| BLAKE2b-256 | e94ec5b4d820b93006838f2f248555bf973438973fd300c16f04494f5bf9805c |
File details
Details for the file mlx_alt_text-0.1.1-py3-none-any.whl.
File metadata
- Download URL: mlx_alt_text-0.1.1-py3-none-any.whl
- Upload date:
- Size: 5.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b754b09a0bcf92d9c576fa13ee88a2650bb2b0e1b55384af4eac18f0d113142d |
| MD5 | 8327c444bb9626734702303fd66ff104 |
| BLAKE2b-256 | 8471a2182b3fd40731d47e1ae46f2a2951da6a707a6a4d5c9cbdf87348787f07 |