Bulk convert images to WebP and automatically update URLs in markdown files
bulk-webp-url-replacer
Bulk convert images to WebP and automatically update URLs in markdown files with a custom CDN prefix.
Features
- 🔍 Extract image URLs from markdown files (frontmatter, galleries, inline images)
- 📥 Download images from remote URLs (parallel downloads)
- 🖼️ Convert to optimized WebP format
- 🔄 Replace original URLs with new CDN-prefixed paths
- ⏭️ Skip already-processed images and excluded extensions
- 👀 Dry-run mode to preview changes
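The skip rule in the feature list can be pictured with a small, standard-library sketch. This is not the package's own code: `url_extension` and `should_skip` are hypothetical helpers, and the only detail taken from this README is the default exclusion list (gif, svg, webp, ico).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Default list taken from the --exclude-ext option documented in this README.
DEFAULT_EXCLUDED = ("gif", "svg", "webp", "ico")


def url_extension(url: str) -> str:
    """Return the lowercase file extension of a URL path, without the dot."""
    path = urlparse(url).path  # ignores any query string or fragment
    return PurePosixPath(path).suffix.lstrip(".").lower()


def should_skip(url, processed, excluded=DEFAULT_EXCLUDED):
    """Hypothetical helper: True if the URL was already processed or is excluded."""
    return url in processed or url_extension(url) in excluded
```

Here `processed` is assumed to be a set of original URLs that were converted on an earlier run.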
Installation
pip install bulk-webp-url-replacer
Or install from source:
git clone https://github.com/HoangYell/bulk-webp-url-replacer.git
cd bulk-webp-url-replacer
pip install -e .
Usage
CLI
# Dry run - preview what would be processed
bulk-webp-url-replacer \
--scan-dir ./content \
--output-dir ./webp_images \
--dry-run
# Full run with custom URL prefix
bulk-webp-url-replacer \
--scan-dir ./content \
--output-dir ./webp_images \
--new-url-prefix "https://cdn.example.com/images"
# Faster with more threads
bulk-webp-url-replacer \
--scan-dir ./content \
--output-dir ./webp_images \
--new-url-prefix "https://cdn.example.com/images" \
--threads 8
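To illustrate what a thread count like `--threads 8` buys, here is a minimal sketch of parallel downloads with the standard library. It is an assumption about the general technique, not the tool's actual implementation; `fetch` and `download_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download one URL and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, threads=4, fetcher=fetch):
    """Fetch URLs concurrently; return (results, errors) dicts keyed by URL."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Map each future back to its URL so results can be attributed.
        futures = {pool.submit(fetcher, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # one failed download should not abort the batch
                errors[url] = exc
    return results, errors
```

Downloads are I/O-bound, so threads help even under the GIL; the `fetcher` parameter exists only so the sketch can be exercised without network access.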
As Python Module
python -m bulk_webp_url_replacer \
--scan-dir ./content \
--output-dir ./webp_images \
--new-url-prefix "https://cdn.example.com/images"
Programmatic Usage
from bulk_webp_url_replacer import ImageETL, ImageURLExtractor

# Full ETL pipeline
etl = ImageETL(
    content_dir="./content",
    webp_dir="./webp_images",
    webp_base_url="https://cdn.example.com/images",
    quality=80,
    max_width=1200,
    exclude_extensions=["gif", "svg", "webp", "ico"],
    threads=4,
)

# Dry run to preview changes
result = etl.run(dry_run=True)
print(f"Found {result.total_urls} URLs, {result.skipped} already processed")

# Full run
result = etl.run(dry_run=False)
print(f"Converted {result.converted} images, {result.failed} failed")

# Or just extract URLs without processing
extractor = ImageURLExtractor()
urls = extractor.extract_from_directory("./content")
for file_path, line_num, url in urls:
    print(f"{file_path}:{line_num} -> {url}")
Options
| Option | Required | Default | Description |
|---|---|---|---|
| --scan-dir | Yes | - | Directory to scan for files containing image URLs |
| --output-dir | Yes | - | Directory to save converted WebP images |
| --new-url-prefix | No | - | URL prefix to replace old image URLs |
| --quality | No | 80 | WebP quality 1-100 |
| --max-width | No | 1200 | Max image width in pixels |
| --exclude-ext | No | gif svg webp ico | File extensions to skip |
| --threads | No | 4 | Number of parallel download threads |
| --dry-run | No | - | Preview changes without downloading or modifying files |
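The README does not say how `--max-width` is applied. A common approach, assumed here, is to shrink only images wider than the limit and keep the aspect ratio. The hypothetical `scaled_size` helper below shows that arithmetic:

```python
def scaled_size(width: int, height: int, max_width: int = 1200) -> tuple[int, int]:
    """Return (width, height) after capping the width, preserving aspect ratio.

    Assumption: images at or below max_width are left untouched (no upscaling).
    """
    if width <= max_width:
        return width, height
    new_height = max(1, round(height * max_width / width))
    return max_width, new_height
```

For example, a 2400x1600 source with the default limit would come out at 1200x800, while an 800x600 source would be unchanged.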
Supported Patterns
The tool detects image URLs in:
# YAML frontmatter
---
image: "https://example.com/image.jpg"
---
# TOML frontmatter
+++
image = "https://example.com/image.jpg"
+++
# Gallery shortcodes
{{< gallery >}}
- https://example.com/photo1.jpg
- https://example.com/photo2.png
{{< /gallery >}}
# HTML img tags in shortcodes
{{< embed >}}
<img src="https://example.com/image.jpg" width="250" height="250"/>
{{< /embed >}}
# Standard markdown
![Alt text](https://example.com/image.jpg)
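One way to cover all of these patterns at once is a single regular expression that looks for http(s) URLs ending in an image extension. The sketch below is an illustration only: the pattern, the extension list, and `extract_image_urls` are assumptions, and the package's real extractor may work differently (for instance, this version ignores URLs with query strings).

```python
import re

# Matches an http(s) URL that ends in a known image extension and is followed
# by a delimiter (whitespace, quote, bracket) or the end of the line.
IMAGE_URL = re.compile(
    r"""https?://[^\s"'()<>]+?\.(?:jpe?g|png|gif|webp|svg|bmp|tiff?)(?=[\s"'()<>]|$)""",
    re.IGNORECASE,
)


def extract_image_urls(text: str):
    """Yield (line_number, url) pairs, with 1-based line numbers."""
    for line_num, line in enumerate(text.splitlines(), start=1):
        for match in IMAGE_URL.finditer(line):
            yield line_num, match.group(0)
```

Scanning line by line keeps the line number available, which mirrors the `(file_path, line_num, url)` tuples shown under Programmatic Usage.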
Output
After running, you'll have:
- WebP images in your --output-dir
- mapping.json tracking original → WebP conversions
- Updated files with new URLs
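The exact layout of mapping.json is not documented here. Assuming it is a flat JSON object from original URL to new URL, the rewrite step could be sketched as follows; `webp_url`, `apply_mapping`, and `dump_mapping` are hypothetical names, not part of the package's API.

```python
import json
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse


def webp_url(original_url: str, prefix: str) -> str:
    """Build '<prefix>/<stem>.webp' from the original URL's file name."""
    stem = PurePosixPath(urlparse(original_url).path).stem
    return f"{prefix.rstrip('/')}/{stem}.webp"


def apply_mapping(text: str, mapping: dict[str, str]) -> str:
    """Replace every original URL found in text with its mapped URL."""
    if not mapping:
        return text
    # Longest keys first so a URL that is a prefix of another is matched whole.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # A single pass avoids re-replacing text that was just inserted.
    return pattern.sub(lambda m: mapping[m.group(0)], text)


def dump_mapping(mapping: dict[str, str]) -> str:
    """Serialize the mapping in the assumed flat original-to-new format."""
    return json.dumps(mapping, indent=2, ensure_ascii=False)
```

Note that naming outputs by file stem alone would make two source URLs with the same file name collide, so a real implementation would need some way to disambiguate them.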
License
MIT