alt-text-llm

AI-powered alt text generation and labeling tools for markdown content. Originally developed for my website (repo).

Features

  • Intelligent scanning - Detects images/videos missing meaningful alt text (ignores empty alt="")
  • AI-powered generation - Uses LLM of your choice to create context-aware alt text suggestions
  • Interactive labeling - Manually review and edit LLM suggestions. Images display directly in your terminal
  • Automatic application - Apply approved captions back to your markdown files

A labeled example of the labeling pipeline: 1) view the context for an image, 2) view the image itself, while 3) editing the AI-generated label suggestion.

Installation

From PyPI

pip install alt-text-llm

Automated setup (includes system dependencies)

git clone https://github.com/alexander-turner/alt-text-llm.git
cd alt-text-llm
./setup.sh

Prerequisites

macOS:

brew install imagemagick ffmpeg imgcat
pip install llm

Linux:

sudo apt-get install imagemagick ffmpeg
pip install llm
# imgcat: curl -sL https://iterm2.com/utilities/imgcat -o ~/.local/bin/imgcat && chmod +x ~/.local/bin/imgcat

Usage

The tool provides four main commands: scan, generate, label, and apply.

1. Scan for missing alt text

Scan your markdown files to find images without meaningful alt text:

alt-text-llm scan --root /path/to/markdown/files

This creates asset_queue.json with all assets needing alt text.
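
To illustrate the kind of check a scan performs, here is a minimal sketch, not the tool's actual implementation. It assumes that a Markdown image with an empty alt and an HTML img tag with no alt attribute both count as "missing", while an explicit alt="" is skipped. The function name `find_missing_alt` and the regular expressions are hypothetical.

```python
import re

# Markdown image: ![alt](path), optionally followed by a title inside the parentheses
MD_IMAGE = re.compile(r'!\[(?P<alt>[^\]]*)\]\((?P<src>[^)\s]+)[^)]*\)')
# HTML <img ...> tag; its attributes are inspected separately
IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)
SRC_ATTR = re.compile(r'(?<![\w-])src\s*=\s*"([^"]*)"', re.IGNORECASE)
ALT_ATTR = re.compile(r'(?<![\w-])alt\s*=\s*"([^"]*)"', re.IGNORECASE)


def find_missing_alt(markdown: str) -> list[str]:
    """Return the paths of images that have no alt text (assumed rules, see above)."""
    missing = []
    for match in MD_IMAGE.finditer(markdown):
        # Assumption: an empty or whitespace-only Markdown alt counts as missing
        if not match.group("alt").strip():
            missing.append(match.group("src"))
    for tag in IMG_TAG.findall(markdown):
        src = SRC_ATTR.search(tag)
        # Only flag tags with no alt attribute at all; an explicit alt="" is skipped
        if src and ALT_ATTR.search(tag) is None:
            missing.append(src.group(1))
    return missing
```

The real scan also covers videos and wikilink images, which this sketch leaves out.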

2. Generate AI suggestions

Generate alt text suggestions using an LLM:

alt-text-llm generate \
  --root /path/to/markdown/files \
  --model gemini-2.5-flash \
  --suggestions-file suggested_alts.json

Available options:

  • --model (required) - LLM model to use (e.g., gemini-2.5-flash, gpt-4o-mini, claude-3-5-sonnet)
  • --max-chars - Maximum characters for alt text (default: 300)
  • --timeout - LLM timeout in seconds (default: 120)
  • --estimate-only - Only show cost estimate without generating
  • --process-existing - Also process assets that already have captions
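
As a rough picture of how these options could map onto a call to the llm CLI, the sketch below builds an `llm` command with `-m` for the model and `-a` for an image attachment, then runs it with a timeout. The prompt wording and the function names `build_llm_command` and `suggest_alt` are assumptions for illustration; the tool's actual prompt and invocation may differ.

```python
import subprocess


def build_llm_command(model: str, image_path: str, max_chars: int = 300) -> list[str]:
    """Build an llm CLI call that attaches one image (hypothetical prompt text)."""
    prompt = f"Write alt text for this image in at most {max_chars} characters."
    return ["llm", "-m", model, "-a", image_path, prompt]


def suggest_alt(model: str, image_path: str, max_chars: int = 300, timeout: int = 120) -> str:
    """Run the command; raises subprocess.TimeoutExpired after `timeout` seconds."""
    result = subprocess.run(
        build_llm_command(model, image_path, max_chars),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    return result.stdout.strip()
```

Calling `suggest_alt("gemini-2.5-flash", "photo.png")` requires the llm CLI, the matching plugin, and an API key to be set up as described under Configuration.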

Cost estimation:

alt-text-llm generate \
  --root /path/to/markdown/files \
  --model gemini-2.5-flash \
  --estimate-only

3. Label and approve suggestions

Interactively review and approve the AI-generated suggestions:

alt-text-llm label \
  --suggestions-file suggested_alts.json \
  --output asset_captions.json

Interactive commands:

  • Edit the suggested alt text (vim keybindings enabled)
  • Press Enter to accept the suggestion as-is
  • Type undo or u and submit it to go back to the previous item
  • Images display in your terminal (requires imgcat)
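
The accept, edit, and undo behavior can be sketched as a small loop. This is an illustrative model only: the function `review` is hypothetical, it takes replies from an iterable instead of an interactive prompt, and it treats an empty reply as "accept the suggestion as-is".

```python
def review(suggestions: list[str], responses) -> list[str]:
    """Walk through suggestions, consuming one reply per prompt.

    An empty reply accepts the suggestion, "undo" or "u" steps back one item,
    and any other reply is stored as the edited alt text.
    """
    answers = iter(responses)
    approved: list[str] = []
    index = 0
    while index < len(suggestions):
        reply = next(answers).strip()
        if reply in ("undo", "u"):
            # Step back only if something has been approved already
            if approved:
                approved.pop()
                index -= 1
            continue
        approved.append(reply or suggestions[index])
        index += 1
    return approved
```

For example, accepting the first suggestion, undoing it, typing a replacement, and then accepting the second suggestion yields the replacement followed by the second suggestion.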

4. Apply approved captions

Apply the approved captions back to your markdown files:

alt-text-llm apply \
  --captions-file asset_captions.json

Available options:

  • --captions-file - Path to the captions JSON file with final_alt populated (default: asset_captions.json)
  • --dry-run - Preview changes without modifying files

What it does:

  • Reads approved captions from the captions file
  • Locates corresponding images/videos in markdown files
  • Updates alt text for all supported formats:
    • Markdown images: ![alt](path)
    • HTML img tags: <img src="path" alt="alt">
    • Wikilink images: ![[path|alt]]
  • Preserves file formatting and handles special characters
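
A minimal sketch of how one caption might be written into each of the three formats is shown below. It is not the tool's implementation: `set_alt` and its regular expressions are hypothetical, Markdown images with a title are not handled, and the wikilink branch inserts the alt text without escaping.

```python
import html
import re

IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)
ALT_ATTR = re.compile(r'(?<![\w-])alt\s*=\s*"[^"]*"', re.IGNORECASE)


def set_alt(markdown: str, path: str, alt: str) -> str:
    """Return `markdown` with the alt text of every image at `path` set to `alt`."""
    p = re.escape(path)

    # Markdown image: ![old](path) -> ![alt](path); escape brackets in the alt
    md_alt = alt.replace("[", "\\[").replace("]", "\\]")
    markdown = re.sub(rf'!\[[^\]]*\]\(({p})\)',
                      lambda m: f"![{md_alt}]({m.group(1)})", markdown)

    # Wikilink image: ![[path]] or ![[path|old]] -> ![[path|alt]]
    markdown = re.sub(rf'!\[\[({p})(?:\|[^\]]*)?\]\]',
                      lambda m: f"![[{m.group(1)}|{alt}]]", markdown)

    # HTML img tag: replace the alt attribute, or add one if it is absent
    new_attr = f'alt="{html.escape(alt, quote=True)}"'

    def fix_tag(match: re.Match) -> str:
        tag = match.group(0)
        if f'src="{path}"' not in tag:
            return tag
        if ALT_ATTR.search(tag):
            return ALT_ATTR.sub(lambda _: new_attr, tag, count=1)
        return tag[:4] + " " + new_attr + tag[4:]  # insert right after "<img"

    return IMG_TAG.sub(fix_tag, markdown)
```

Because the function returns the new text instead of writing a file, a caller could compare the result with the original to preview changes, similar in spirit to --dry-run.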

Example workflow

# 1. Scan markdown files for missing alt text
alt-text-llm scan --root ./content

# 2. Estimate the cost
alt-text-llm generate \
  --root ./content \
  --model gemini-2.5-flash \
  --estimate-only

# 3. Generate suggestions (if cost is acceptable)
alt-text-llm generate \
  --root ./content \
  --model gemini-2.5-flash

# 4. Review and approve suggestions
alt-text-llm label

# 5. Apply approved captions to markdown files
alt-text-llm apply

Configuration

LLM Integration

This tool uses the llm CLI tool to generate alt text. Through llm's plugin system, this provides access to many different AI models.

Setting up your model

For Gemini models (used in the examples above):

llm install llm-gemini
llm keys set gemini # enter API key
llm -m gemini-2.5-flash "Hello, world!"

For other models:

  1. Install the appropriate llm plugin if the model needs one (e.g., llm install llm-anthropic for Claude models; OpenAI models work with llm out of the box)
  2. Configure your API key (e.g., llm keys set openai)
  3. Use the model name with --model flag (e.g., --model gpt-4o-mini)

See the llm documentation for setup instructions and the plugin directory for available models.

Output files

  • asset_queue.json - Queue of assets needing alt text (from scan)
  • suggested_alts.json - AI-generated suggestions (from generate)
  • asset_captions.json - Approved final captions (from label)
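
Before running apply, it can be useful to check which entries in the captions file have final_alt populated. The sketch below assumes the file is a JSON list of objects; only the final_alt field is named in this README, so the list layout, the `asset_path` key in the usage note, and the function name `approved_entries` are hypothetical.

```python
import json


def approved_entries(captions_json: str) -> list[dict]:
    """Return the entries whose final_alt is a non-empty string.

    Assumes the captions file holds a JSON list of objects (an assumption,
    not a documented schema).
    """
    entries = json.loads(captions_json)
    return [entry for entry in entries if (entry.get("final_alt") or "").strip()]
```

For instance, given two entries with hypothetical `asset_path` keys where only one has a non-empty final_alt, the function returns just that one.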
