Validate images before AI pipelines
imageguard
imageguard is a lightweight Python library that checks whether an image is good enough to feed into an AI pipeline — OCR, background removal, upscaling, dataset training, or any model that is sensitive to image quality.
No image-processing knowledge required. Pass in a path. Get back a decision.
from imageguard import validate

result = validate("photo.jpg")
if not result.ok:
    print("Reject:", result.reason)  # "blurry"
else:
    print("Score:", result.score)  # 0.87
Used in production at changeimageto.com — the image quality checker tool that processes thousands of images per day.
Install
pip install imageguard
For URL support (optional):
pip install "imageguard[url]"
The result object
result.ok # bool – True if the image passes all checks
result.reason # str – primary issue (empty when ok=True)
result.issues # list – every detected issue
result.score # float – 0.0 (worst) → 1.0 (best)
Possible issues
| Value | Meaning |
|---|---|
| blurry | Image is out of focus |
| low_resolution | Too few pixels for reliable processing |
| noisy | Excessive visual noise |
| pixelated | Block artefacts or aggressive upscaling |
| compressed | Severe JPEG compression artefacts |
| underexposed | Image is too dark |
| overexposed | Image is too bright / washed out |
| clipped | Large portions are pure black or white |
| bad_exposure | General lighting issue |
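The issue names above lend themselves to a simple lookup when you want to tell a user how to fix a rejected upload. The following is a minimal sketch, assuming `result.issues` is a list of the strings in the table; the `ISSUE_HINTS` mapping, its advice text, and the `hints_for` helper are illustrative and not part of imageguard.

```python
# Hypothetical helper: maps imageguard issue names to user-facing advice.
# The issue names come from the table above; the advice text is illustrative.
ISSUE_HINTS = {
    "blurry": "Hold the camera steady or refocus, then retake the photo.",
    "low_resolution": "Upload a larger version of the image.",
    "noisy": "Use better lighting or a lower ISO setting.",
    "pixelated": "Upload the original file rather than an upscaled copy.",
    "compressed": "Export with a higher JPEG quality setting.",
    "underexposed": "Add light or increase the exposure.",
    "overexposed": "Reduce the exposure or avoid direct light.",
    "clipped": "Avoid large pure-black or pure-white areas.",
    "bad_exposure": "Check the overall lighting of the scene.",
}


def hints_for(issues):
    """Return one hint per issue, in order; unknown names get a generic message."""
    return [ISSUE_HINTS.get(name, f"Unrecognised issue: {name}") for name in issues]
```

In practice you would call `hints_for(result.issues)` after a failed `validate()` and show the returned strings to the user.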
Input formats
validate() accepts a file path, a URL, or a numpy array:
from pathlib import Path

import numpy as np
from imageguard import validate

# File path (str or pathlib.Path)
result = validate("photo.jpg")
result = validate(Path("photo.png"))

# URL (requires: pip install "imageguard[url]")
result = validate("https://example.com/product.jpg")

# Numpy array (RGB, uint8)
arr = np.zeros((480, 640, 3), dtype=np.uint8)
result = validate(arr)
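Because URL input depends on the optional `requests` extra, it can help to fail early with a clear message when a URL is passed without it. The sketch below uses only the standard library; `is_http_url` and `check_url_support` are hypothetical helpers, and how imageguard itself tells URLs apart from file paths is not stated here.

```python
import importlib.util
from urllib.parse import urlparse


def is_http_url(source) -> bool:
    """True only for strings with an http(s) scheme and a host part."""
    if not isinstance(source, str):
        return False
    parts = urlparse(source)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_url_support(source) -> None:
    """Raise a clear error if a URL is passed but `requests` is not installed."""
    if is_http_url(source) and importlib.util.find_spec("requests") is None:
        raise RuntimeError('URL input needs the extra: pip install "imageguard[url]"')
```

Calling `check_url_support(source)` just before `validate(source)` keeps the missing-dependency error separate from genuine image-quality failures.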
Use cases
OCR pre-check
from imageguard import validate

scan_path = "scan.png"
result = validate(scan_path, thresholds={"blur_score": 60.0, "resolution_score": 70.0})
if not result.ok:
    raise ValueError(f"Document quality too low for OCR: {result.reason}")
text = ocr_engine.read(scan_path)  # ocr_engine is a placeholder for your own OCR client
Dataset cleaning
from pathlib import Path
from imageguard import validate

accepted, rejected = [], []
for img_path in Path("raw_dataset/").glob("**/*.jpg"):
    result = validate(img_path)
    (accepted if result.ok else rejected).append(img_path)

print(f"Accepted {len(accepted)}, rejected {len(rejected)}")
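When cleaning a large dataset it is often useful to keep a record of why each file was rejected. Here is a minimal sketch that writes rejections to a CSV file. `write_rejection_report` is a hypothetical helper, and the `validate` callable is passed in as an argument so the function can be exercised with a stub; in real use you would pass `imageguard.validate`.

```python
import csv


def write_rejection_report(image_paths, validate, report_path):
    """Validate each image and write the rejected ones to a CSV file.

    `validate` must return an object with .ok, .reason, .score and .issues,
    as described in "The result object". Returns the number of rejections.
    """
    rejected = 0
    with open(report_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "reason", "score", "issues"])
        for path in image_paths:
            result = validate(path)
            if not result.ok:
                writer.writerow(
                    [str(path), result.reason, f"{result.score:.2f}", ";".join(result.issues)]
                )
                rejected += 1
    return rejected
```

For example, `write_rejection_report(Path("raw_dataset/").glob("**/*.jpg"), validate, "rejected.csv")` would produce one row per rejected image.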
E-commerce product photo gate
from imageguard import validate

def before_background_removal(image_path: str) -> None:
    result = validate(image_path)
    if result.score < 0.6:
        raise ValueError(
            f"Image quality too low (score {result.score:.2f}): {result.issues}"
        )
    # proceed to AI service…
CI/CD image quality gate (GitHub Actions)
- name: Validate image assets
  run: |
    pip install imageguard
    python -c "
    from pathlib import Path
    from imageguard import validate
    import sys
    # Validate each asset once, then keep only the failures
    results = [(p, validate(p)) for p in Path('assets/').rglob('*.jpg')]
    failures = [(p, r) for p, r in results if not r.ok]
    if failures:
        for p, r in failures:
            print(f'FAIL {p.name}: {r.reason} ({r.score:.2f})')
        sys.exit(1)
    "
Custom thresholds
from imageguard import validate, DEFAULT_THRESHOLDS
print(DEFAULT_THRESHOLDS)
# {'blur_score': 40.0, 'noise_score': 30.0, 'compression_score': 50.0,
# 'pixelation_score': 50.0, 'exposure_score': 40.0,
# 'resolution_score': 60.0, 'overall_score': 40.0}
# Stricter blur check for OCR
result = validate("scan.jpg", thresholds={"blur_score": 60.0})
# More lenient resolution for thumbnails
result = validate("thumb.jpg", thresholds={"resolution_score": 30.0})
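If several pipelines need different thresholds, named profiles keep the overrides in one place. The sketch below is an assumption-laden illustration: `PROFILES` and `thresholds_for` are hypothetical, the default values are copied from the printout above, and the override values are the ones used in this README's examples rather than tuned recommendations.

```python
# Defaults copied from the DEFAULT_THRESHOLDS printout above; in real code,
# use `from imageguard import DEFAULT_THRESHOLDS` instead of this copy.
DEFAULTS = {
    "blur_score": 40.0,
    "noise_score": 30.0,
    "compression_score": 50.0,
    "pixelation_score": 50.0,
    "exposure_score": 40.0,
    "resolution_score": 60.0,
    "overall_score": 40.0,
}

# Hypothetical per-pipeline overrides (values taken from the examples above).
PROFILES = {
    "ocr": {"blur_score": 60.0, "resolution_score": 70.0},
    "thumbnail": {"resolution_score": 30.0},
}


def thresholds_for(profile: str) -> dict:
    """Merge a profile's overrides over the defaults, rejecting unknown keys."""
    overrides = PROFILES[profile]
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown threshold keys: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}
```

The merged dictionary can then be passed as `validate("scan.jpg", thresholds=thresholds_for("ocr"))`. Checking for unknown keys catches typos such as `blur_scroe` before they silently fall back to a default.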
How it works
imageguard computes six quality signals using OpenCV and scikit-image:
| Signal | Method |
|---|---|
| Blur | Laplacian variance + Tenengrad gradient energy, normalised by texture |
| Noise | SNR + high-pass filter residual on flat regions |
| Resolution | Pixel count with aspect ratio penalty |
| Exposure | Histogram balance + entropy + RMS contrast |
| Compression | 8×8 border energy ratio (blockiness) |
| Pixelation | FFT grid harmonic energy + SSIM round-trip |
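To give a feel for the Blur row, here is a pure-Python sketch of Laplacian variance on a small grid of grayscale values. It illustrates the general technique only: imageguard computes its signals with OpenCV, and the Tenengrad term and texture normalisation from the table are not shown. The `laplacian_variance` function is a hypothetical name.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels of a
    2-D grid of grayscale values (a list of equal-length rows)."""
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("need at least a 3x3 image")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Kernel [[0, 1, 0], [1, -4, 1], [0, 1, 0]] applied at (x, y)
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)
```

A flat image has no edges, so its Laplacian response is zero everywhere and the variance is zero; sharp detail produces large positive and negative responses and a high variance. Blurring an image pushes the value toward the flat case.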
All signals are normalised to 0–100. The final score (0–1) is a weighted combination with per-issue penalties.
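The combination step can be pictured as follows. This is a sketch under stated assumptions, not imageguard's actual formula: the `WEIGHTS` values, the flat `PENALTY_PER_ISSUE`, and the `overall_score` function are all hypothetical, chosen only to show how 0–100 signals could become a 0–1 score.

```python
# Hypothetical weights (summing to 1.0) and penalty; imageguard's real values
# are not documented in this README.
WEIGHTS = {
    "blur_score": 0.30,
    "noise_score": 0.15,
    "resolution_score": 0.20,
    "exposure_score": 0.15,
    "compression_score": 0.10,
    "pixelation_score": 0.10,
}
PENALTY_PER_ISSUE = 0.05


def overall_score(signals, issues):
    """Combine 0-100 signals into a 0-1 score, subtract a flat penalty per
    detected issue, and clamp the result to the range [0, 1]."""
    weighted = sum(WEIGHTS[name] * signals[name] for name in WEIGHTS) / 100.0
    penalised = weighted - PENALTY_PER_ISSUE * len(issues)
    return max(0.0, min(1.0, penalised))
```

With these illustrative numbers, an image scoring 50 on every signal with one detected issue would come out at about 0.45.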
Dependencies
| Package | Purpose |
|---|---|
| opencv-python | Image loading and signal computation |
| numpy | Array maths |
| scikit-image | SSIM for pixelation check |
| requests (optional) | URL image loading |
Learn more
- Try the online image quality checker — no code needed
- Why AI pipelines fail on bad images
- How to filter blurry images before OCR
- Image validation for machine learning datasets
- Python image quality check guide
- Automate image quality checks in CI/CD
License
MIT