imageguard

Validate images before AI pipelines.

imageguard is a lightweight Python library that checks whether an image is good enough to feed into an AI pipeline — OCR, background removal, upscaling, dataset training, or any model that is sensitive to image quality.

No image-processing knowledge required. Pass in a path. Get back a decision.

from imageguard import validate

result = validate("photo.jpg")

if not result.ok:
    print("Reject:", result.reason)  # e.g. "blurry"
else:
    print("Score:", result.score)    # e.g. 0.87

Used in production at changeimageto.com — the image quality checker tool that processes thousands of images per day.


Install

pip install imageguard

For URL support (optional):

pip install "imageguard[url]"

The result object

result.ok        # bool   – True if the image passes all checks
result.reason    # str    – primary issue (empty when ok=True)
result.issues    # list   – every detected issue
result.score     # float  – 0.0 (worst) → 1.0 (best)
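
As a rough sketch of how these four fields might be consumed together, the hypothetical helper below turns a result into a one-line log message. `summarize` is not part of imageguard; it only assumes an object exposing `ok`, `reason`, `issues`, and `score` as described above. A `SimpleNamespace` stands in for a real result so the snippet runs without the library installed.

```python
from types import SimpleNamespace


def summarize(result) -> str:
    """Format a result-like object as a single log line (hypothetical helper)."""
    if result.ok:
        return f"PASS score={result.score:.2f}"
    # reason is the primary issue; issues lists everything that was detected.
    others = [issue for issue in result.issues if issue != result.reason]
    extra = f" (also: {', '.join(others)})" if others else ""
    return f"FAIL {result.reason}{extra} score={result.score:.2f}"


# Stand-in objects shaped like the documented result.
good = SimpleNamespace(ok=True, reason="", issues=[], score=0.87)
bad = SimpleNamespace(ok=False, reason="blurry", issues=["blurry", "noisy"], score=0.31)

print(summarize(good))
print(summarize(bad))
```

With a real result, `summarize(validate("photo.jpg"))` should work the same way, provided the fields behave as documented.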

Possible issues

Value            Meaning
blurry           Image is out of focus
low_resolution   Too few pixels for reliable processing
noisy            Excessive visual noise
pixelated        Block artefacts or aggressive upscaling
compressed       Severe JPEG compression artefacts
underexposed     Image is too dark
overexposed      Image is too bright / washed out
clipped          Large portions are pure black or white
bad_exposure     General lighting issue
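
One way to use these values is to map each issue to a message for whoever supplied the image. The sketch below is illustrative only: `HINTS` and `hints_for` are hypothetical names, and the wording of each hint is an assumption rather than something imageguard provides.

```python
# Hypothetical mapping from the documented issue values to user-facing advice.
HINTS = {
    "blurry": "Hold the camera steady or refocus.",
    "low_resolution": "Upload a larger image.",
    "noisy": "Use better lighting or a lower ISO.",
    "pixelated": "Upload the original rather than an upscaled copy.",
    "compressed": "Export with a higher JPEG quality setting.",
    "underexposed": "Add light or increase exposure.",
    "overexposed": "Reduce exposure or avoid direct light.",
    "clipped": "Avoid large pure-black or pure-white areas.",
    "bad_exposure": "Check the lighting and retake the photo.",
}


def hints_for(issues):
    """Return one hint per issue, with a fallback for unrecognised values."""
    return [HINTS.get(issue, f"Unrecognised issue: {issue}") for issue in issues]


for line in hints_for(["blurry", "compressed"]):
    print(line)
```

The fallback keeps the helper from raising if a later release adds an issue value that is not listed here.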

Input formats

validate() accepts a file path (str or pathlib.Path), a URL, or a NumPy array:

from pathlib import Path

import numpy as np
from imageguard import validate

# File path
result = validate("photo.jpg")
result = validate(Path("photo.png"))

# URL (requires: pip install "imageguard[url]")
result = validate("https://example.com/product.jpg")

# Numpy array (RGB, uint8)
arr = np.zeros((480, 640, 3), dtype=np.uint8)
result = validate(arr)

Use cases

OCR pre-check

from imageguard import validate

scan_path = "scan.png"
result = validate(scan_path, thresholds={"blur_score": 60.0, "resolution_score": 70.0})

if not result.ok:
    raise ValueError(f"Document quality too low for OCR: {result.reason}")

# ocr_engine is a placeholder for whichever OCR library you use
text = ocr_engine.read(scan_path)

Dataset cleaning

from pathlib import Path
from imageguard import validate

accepted, rejected = [], []

for img_path in Path("raw_dataset/").glob("**/*.jpg"):
    result = validate(img_path)
    (accepted if result.ok else rejected).append(img_path)

print(f"Accepted {len(accepted)}, rejected {len(rejected)}")
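
When cleaning a large dataset it can help to know why images were rejected, not just how many. The sketch below extends the loop above by grouping rejected paths by `result.reason`. `partition` and `fake_validate` are hypothetical names; the stand-in validator exists only so the snippet runs without imageguard, and in real use you would pass `imageguard.validate` instead.

```python
from collections import defaultdict
from types import SimpleNamespace


def partition(paths, validate_fn):
    """Split paths into accepted ones and rejected ones grouped by reason."""
    accepted = []
    rejected = defaultdict(list)
    for path in paths:
        result = validate_fn(path)
        if result.ok:
            accepted.append(path)
        else:
            rejected[result.reason].append(path)
    return accepted, dict(rejected)


# Stand-in validator for demonstration; swap in imageguard.validate for real use.
def fake_validate(path):
    if "blur" in str(path):
        return SimpleNamespace(ok=False, reason="blurry")
    return SimpleNamespace(ok=True, reason="")


accepted, rejected = partition(["a.jpg", "blur1.jpg", "blur2.jpg"], fake_validate)
print(accepted)
print(rejected)
```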

E-commerce product photo gate

from imageguard import validate

def before_background_removal(image_path: str) -> None:
    result = validate(image_path)
    if result.score < 0.6:
        raise ValueError(
            f"Image quality too low (score {result.score:.2f}): {result.issues}"
        )
    # proceed to AI service…

CI/CD image quality gate (GitHub Actions)

- name: Validate image assets
  run: |
    pip install imageguard
    python -c "
    from pathlib import Path
    from imageguard import validate
    import sys
    results = [(p, validate(p)) for p in Path('assets/').rglob('*.jpg')]
    failures = [(p, r) for p, r in results if not r.ok]
    if failures:
        for p, r in failures: print(f'FAIL {p.name}: {r.reason} ({r.score:.2f})')
        sys.exit(1)
    "

Custom thresholds

from imageguard import validate, DEFAULT_THRESHOLDS

print(DEFAULT_THRESHOLDS)
# {'blur_score': 40.0, 'noise_score': 30.0, 'compression_score': 50.0,
#  'pixelation_score': 50.0, 'exposure_score': 40.0,
#  'resolution_score': 60.0, 'overall_score': 40.0}

# Stricter blur check for OCR
result = validate("scan.jpg", thresholds={"blur_score": 60.0})

# More lenient resolution for thumbnails
result = validate("thumb.jpg", thresholds={"resolution_score": 30.0})
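
The calls above pass only the keys they want to change, which suggests that overrides are layered on top of the defaults. The sketch below shows that pattern in isolation. It is an assumption about the behaviour, not imageguard's actual code: `merge_thresholds` is a hypothetical helper, and `DEFAULTS` is a local copy of the values printed above.

```python
# Local copy of the default thresholds shown above.
DEFAULTS = {
    "blur_score": 40.0, "noise_score": 30.0, "compression_score": 50.0,
    "pixelation_score": 50.0, "exposure_score": 40.0,
    "resolution_score": 60.0, "overall_score": 40.0,
}


def merge_thresholds(overrides=None):
    """Layer partial overrides on the defaults, rejecting unknown keys."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown threshold(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


ocr = merge_thresholds({"blur_score": 60.0})
print(ocr["blur_score"], ocr["noise_score"])
```

Checking keys before merging catches typos such as `"blur"` in place of `"blur_score"`, which would otherwise be silently ignored in a plain dictionary merge.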

How it works

imageguard computes six quality signals using OpenCV and scikit-image:

Signal        Method
Blur          Laplacian variance + Tenengrad gradient energy, normalised by texture
Noise         SNR + high-pass filter residual on flat regions
Resolution    Pixel count with aspect ratio penalty
Exposure      Histogram balance + entropy + RMS contrast
Compression   8×8 border energy ratio (blockiness)
Pixelation    FFT grid harmonic energy + SSIM round-trip
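
To make the first row concrete, here is a minimal sketch of plain Laplacian variance, the classic sharpness measure that the blur signal builds on. It is written in pure Python on a small grayscale grid for readability, and it leaves out the Tenengrad term and the texture normalisation, so it is not imageguard's implementation. `laplacian_variance` is a hypothetical name.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    gray is a list of rows of numeric intensities. Higher values mean more
    edge energy, which usually indicates a sharper image.
    """
    responses = []
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


# A uniform grid has no edges; a checkerboard has an edge at every pixel.
flat = [[128] * 6 for _ in range(6)]
checker = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]

print(laplacian_variance(flat))
print(laplacian_variance(checker) > laplacian_variance(flat))
```

On real images the same measure is commonly computed with OpenCV as `cv2.Laplacian(gray, cv2.CV_64F).var()`, which is far faster than a Python loop.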

All signals are normalised to 0–100. The final score (0–1) is a weighted combination of these signals, reduced by a penalty for each detected issue.
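
The combination step might look roughly like the sketch below. The weights, the penalty size, and the function name `combine` are all invented for illustration, since the source does not state the real values. Only the overall shape follows the description above: six 0–100 signals go in, and one 0–1 score comes out, lowered for each detected issue.

```python
# Hypothetical weights; they sum to 1.0 so a perfect image scores 1.0.
WEIGHTS = {
    "blur": 0.25, "noise": 0.15, "resolution": 0.20,
    "exposure": 0.15, "compression": 0.15, "pixelation": 0.10,
}
PENALTY_PER_ISSUE = 0.05  # hypothetical flat deduction per detected issue


def combine(signals, issues):
    """Weighted mean of 0-100 signals scaled to 0-1, minus issue penalties."""
    base = sum(WEIGHTS[name] * signals[name] for name in WEIGHTS) / 100.0
    score = base - PENALTY_PER_ISSUE * len(issues)
    return max(0.0, min(1.0, score))  # clamp to the documented 0-1 range


signals = {"blur": 35, "noise": 80, "resolution": 90,
           "exposure": 70, "compression": 85, "pixelation": 95}
print(round(combine(signals, ["blurry"]), 3))
```

Clamping matters because several penalties on an already weak image could otherwise push the score below zero.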


Dependencies

Package               Purpose
opencv-python         Image loading and signal computation
numpy                 Array maths
scikit-image          SSIM for pixelation check
requests (optional)   URL image loading

License

MIT
