alt-text-llm
AI-powered alt text generation and labeling tools for markdown content. Originally developed for my website (repo).
Features
- Intelligent scanning - Detects images/videos missing meaningful alt text (ignores empty alt="")
- AI-powered generation - Uses LLM of your choice to create context-aware alt text suggestions
- Interactive labeling - Manually review and edit LLM suggestions. Images display directly in your terminal
- Automatic application - Apply approved captions back to your markdown files
Installation
From PyPI
pip install alt-text-llm
Automated setup (includes system dependencies)
git clone https://github.com/alexander-turner/alt-text-llm.git
cd alt-text-llm
./setup.sh
Prerequisites
The following command-line tools must be installed:
- llm - LLM interface (install instructions)
- git - Version control
- magick (ImageMagick) - Image processing
- ffmpeg - Video processing
- imgcat - Terminal image display
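To see which of the tools listed above are already installed, a small check along these lines may help. This is a minimal sketch: `missing_tools` is a hypothetical helper, not something shipped with alt-text-llm, and it only tests whether each command name resolves on your PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of alt-text-llm):
# print each named command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the prerequisites named in this README.
missing=$(missing_tools llm git magick ffmpeg imgcat)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```

If `magick` is reported missing even after installing ImageMagick, your package may provide an older ImageMagick release that lacks the `magick` command.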
macOS:
brew install imagemagick ffmpeg imgcat
pip install llm
Linux:
sudo apt-get install imagemagick ffmpeg
pip install llm
# imgcat: curl -sL https://iterm2.com/utilities/imgcat -o ~/.local/bin/imgcat && chmod +x ~/.local/bin/imgcat
Usage
The tool provides three main commands: scan, generate, and label.
1. Scan for missing alt text
Scan your markdown files to find images without meaningful alt text:
alt-text-llm scan --root /path/to/markdown/files
This creates asset_queue.json with all assets needing alt text.
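To illustrate the difference between a missing alt attribute and an intentionally empty one, here is a deliberately simplified sketch. It is not the tool's actual scanner: `imgs_without_alt` is a hypothetical helper that only looks at HTML `<img>` tags written on a single line and says nothing about how the real scan judges whether alt text is meaningful.

```shell
#!/bin/sh
# Simplified illustration only; NOT how alt-text-llm scans files.
# Print every single-line HTML <img> tag in the file "$1" that has no alt attribute.
# Tags with an explicit empty alt="" still contain "alt=", so they are skipped.
imgs_without_alt() {
    grep -o '<img[^>]*>' "$1" | grep -v '[[:space:]]alt='
}

# Example (the path is a placeholder):
#   imgs_without_alt content/post.md
```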
2. Generate AI suggestions
Generate alt text suggestions using an LLM:
alt-text-llm generate \
--root /path/to/markdown/files \
--model gemini-2.5-flash \
--suggestions-file suggested_alts.json
Available options:
- --model (required) - LLM model to use (e.g., gemini-2.5-flash, gpt-4o-mini, claude-3-5-sonnet)
- --max-chars - Maximum characters for alt text (default: 300)
- --timeout - LLM timeout in seconds (default: 120)
- --estimate-only - Only show cost estimate without generating
- --process-existing - Also process assets that already have captions
Cost estimation:
alt-text-llm generate \
--root /path/to/markdown/files \
--model gemini-2.5-flash \
--estimate-only
3. Label and approve suggestions
Interactively review and approve the AI-generated suggestions:
alt-text-llm label \
--suggestions-file suggested_alts.json \
--output asset_captions.json
Interactive commands:
- Edit the suggested alt text (vim keybindings enabled)
- Press Enter to accept the suggestion as-is
- Submit undo or u to go back to the previous item
- Images display in your terminal (requires imgcat)
Example workflow
# 1. Scan markdown files for missing alt text
alt-text-llm scan --root ./content
# 2. Estimate the cost
alt-text-llm generate \
--root ./content \
--model gemini-2.5-flash \
--estimate-only
# 3. Generate suggestions (if cost is acceptable)
alt-text-llm generate \
--root ./content \
--model gemini-2.5-flash
# 4. Review and approve suggestions
alt-text-llm label
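The four steps above could be wrapped in a single script that pauses after the cost estimate. This is a minimal sketch using only the commands and flags shown in this README. `run_workflow` and the `ALT_TEXT_LLM` variable are hypothetical names introduced here, not features of the tool; the variable exists only so the script can be dry-run with a stand-in command.

```shell
#!/bin/sh
# Command to invoke. Overridable for dry runs (an assumption of this sketch,
# not an option provided by alt-text-llm itself).
ALT_TEXT_LLM="${ALT_TEXT_LLM:-alt-text-llm}"

# Usage: run_workflow ROOT MODEL
run_workflow() {
    root=$1
    model=$2

    # 1. Scan, then 2. show the cost estimate.
    "$ALT_TEXT_LLM" scan --root "$root" || return 1
    "$ALT_TEXT_LLM" generate --root "$root" --model "$model" --estimate-only || return 1

    # Ask before spending anything; any answer other than y/Y stops here.
    printf 'Proceed with generation? [y/N] '
    read -r answer
    case $answer in
        y|Y) ;;
        *) echo "Stopped before generation."; return 0 ;;
    esac

    # 3. Generate suggestions, then 4. review them interactively.
    "$ALT_TEXT_LLM" generate --root "$root" --model "$model" || return 1
    "$ALT_TEXT_LLM" label
}

# Example:
#   run_workflow ./content gemini-2.5-flash
```

Setting `ALT_TEXT_LLM=echo` before calling `run_workflow` prints the commands that would run instead of executing them.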
Configuration
LLM Integration
This tool uses the llm CLI to generate alt text, which gives access to many AI models, including:
- Gemini (Google) via the llm-gemini plugin
- Claude (Anthropic) via the llm-claude-3 plugin
- And many more via plugins
Setting up your model
For Gemini models (default):
llm install llm-gemini
llm keys set gemini # enter API key
llm -m gemini-2.5-flash "Hello, world!"
For other models:
- Install the appropriate llm plugin (e.g., llm install llm-openai)
- Configure your API key (e.g., llm keys set openai)
- Use the model name with --model flag (e.g., --model gpt-4o-mini)
See the llm documentation for setup instructions and the plugin directory for available models.
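Before running generate, it can be worth confirming that llm recognizes the model name you plan to pass to --model. A minimal sketch: `has_model` is a hypothetical helper (not part of llm or alt-text-llm) that searches a model list piped in on stdin, such as the output of `llm models`.

```shell
#!/bin/sh
# Hypothetical helper (not part of llm or alt-text-llm):
# succeed if the model name given as $1 appears anywhere in the list read from stdin.
has_model() {
    grep -qF -- "$1"
}

# Intended use, once llm and the relevant plugin are installed:
#   llm models | has_model gemini-2.5-flash && echo "model available"
```

This is a plain substring match, so a name that is a prefix of a longer model name will also match.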
Output files
- asset_queue.json - Queue of assets needing alt text (from scan)
- suggested_alts.json - AI-generated suggestions (from generate)
- asset_captions.json - Approved final captions (from label)
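Because each command produces one of these files, checking which ones exist gives a rough idea of which step to run next. A minimal sketch, assuming the default file names above and that the files sit in the directory you pass in (the README does not say where they are written, and --suggestions-file and --output can change the names). `pipeline_status` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper (not part of alt-text-llm):
# report which of the three default output files exist in directory $1 (default: current directory).
pipeline_status() {
    dir=${1:-.}
    for f in asset_queue.json suggested_alts.json asset_captions.json; do
        if [ -f "$dir/$f" ]; then
            echo "$f: present"
        else
            echo "$f: missing"
        fi
    done
}

pipeline_status
```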