Reduce vision API costs by 60-80% with algorithmic image optimization

imgslim

Reduce your vision API costs by 60–80% before the image ever leaves your machine.

No AI. No external calls. Pure algorithmic compression tuned for how LLMs actually tokenize images.

pip install imgslim

from imgslim import VisionLite

result = VisionLite(model="claude").optimize("screenshot.png")
# {'output_path': '/tmp/imgslim_output/screenshot_optimized.png',
#  'savings_pct': 74.3,
#  'strategy_used': ['strip_exif', 'resize', 'grayscale']}

Why this exists

Vision APIs charge by image tokens, and the token count grows with image size. GPT-4o and Gemini slice images into tiles before processing, and Claude counts tokens roughly in proportion to pixel area, so in every case the larger the image, the higher the bill.

Most images sent to LLMs are far larger than the model needs to understand them. For most tasks, a 3000×2000 photo gives the model little that a 1092×728 version does not.

imgslim uses each model's documented size thresholds to resize images to the optimal resolution. It also strips EXIF metadata, removes whitespace margins, and converts to grayscale when color adds nothing.
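To make the cost argument concrete, here is a rough sketch of how image tokens scale with resolution. It assumes the area / 750 rule of thumb that Anthropic documents for Claude; the function name is hypothetical and not part of imgslim, and other providers use different formulas.

```python
def estimate_image_tokens(width: int, height: int) -> int:
    """Rough token estimate: pixel area divided by 750 (assumed rule of thumb)."""
    return round(width * height / 750)


# The figure for the large image is an upper bound: the API may downscale
# oversized images on its side before counting tokens.
original = estimate_image_tokens(3000, 2000)
resized = estimate_image_tokens(1092, 728)
print(original, resized)  # 8000 1060
```

The exact numbers matter less than the shape of the relationship: tokens follow pixel area, so halving both edges cuts the estimate by roughly four.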


Benchmark

Tested on 5 synthetic content types × 3 models. All transformations are algorithmic — no AI used.

| Model  | Sample     | Input (KB) | Output (KB) | Saving (%) | Strategies |
|--------|------------|------------|-------------|------------|------------|
| claude | diagram    | 95         | 38          | 59.4       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| claude | mixed      | 678        | 225         | 66.8       | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| claude | photo      | 6877       | 226         | 96.7       | strip_exif, resize_to_tile_limit |
| claude | screenshot | 98         | 21          | 78.4       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| claude | text_doc   | 934        | 161         | 82.8       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | diagram    | 95         | 81          | 14.9       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | mixed      | 678        | 300         | 55.8       | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| gemini | photo      | 6877       | 563         | 91.8       | strip_exif, resize_to_tile_limit |
| gemini | screenshot | 98         | 35          | 64.1       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | text_doc   | 934        | 306         | 67.3       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | diagram    | 95         | 36          | 61.9       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | mixed      | 678        | 194         | 71.4       | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| gpt-4o | photo      | 6877       | 190         | 97.2       | strip_exif, resize_to_tile_limit |
| gpt-4o | screenshot | 98         | 18          | 81.8       | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | text_doc   | 934        | 138         | 85.2       | strip_exif, resize_to_tile_limit, convert_to_grayscale |

Average saving: 71.7% across all content types and models.
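The average can be reproduced from the per-row savings in the benchmark table. The sketch below is only a check of the arithmetic; `saving_pct` is a hypothetical helper, and recomputing a row from the rounded KB columns can differ from the published figure by a few tenths, presumably because the originals were computed from exact byte counts.

```python
from statistics import mean


def saving_pct(input_kb: float, output_kb: float) -> float:
    """Percentage of the original size removed by optimization."""
    return (1 - output_kb / input_kb) * 100


# Saving (%) column of the benchmark, in row order (claude, gemini, gpt-4o).
savings = [
    59.4, 66.8, 96.7, 78.4, 82.8,
    14.9, 55.8, 91.8, 64.1, 67.3,
    61.9, 71.4, 97.2, 81.8, 85.2,
]

average = round(mean(savings), 1)
print(average)                            # 71.7
print(round(saving_pct(6877, 226), 1))    # 96.7 (claude / photo row)
```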

Benchmark script included: python benchmark/run_benchmark.py


How it works

1. Content detection (no AI)

OpenCV heuristics classify the image as photo, text, or diagram based on edge density, contour count, and line detection.
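As an illustration only, the decision step might look like the sketch below once the three features have been measured. The function name, the thresholds, and the order of the checks are all assumptions made for this example; they are not taken from imgslim's source, and the OpenCV feature extraction itself is left out.

```python
def classify_content(edge_density: float, contour_count: int, line_count: int) -> str:
    """Classify an image as 'photo', 'text', or 'diagram' from simple features.

    All thresholds are hypothetical and chosen for illustration only.
    """
    # Diagrams: many long straight lines, relatively few separate shapes.
    if line_count >= 20 and contour_count < 200:
        return "diagram"
    # Text: lots of small contours (glyphs) but a low overall edge density.
    if contour_count >= 200 and edge_density < 0.15:
        return "text"
    # Everything else is treated as a photo.
    return "photo"


print(classify_content(edge_density=0.08, contour_count=900, line_count=5))  # text
```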

2. Strategy selection

All images   → strip_exif + resize_to_tile_limit
text/diagram → + convert_to_grayscale
text         → + crop_whitespace
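The selection rules above can be written as a small function. This is a minimal sketch with a hypothetical function name; it mirrors the three rules as listed and does not reflect how imgslim implements them internally.

```python
def select_strategies(content_type: str) -> list[str]:
    """Return the ordered strategy names for a detected content type."""
    # Every image gets metadata stripping and a model-aware resize.
    strategies = ["strip_exif", "resize_to_tile_limit"]
    # Text and diagrams rarely need color.
    if content_type in ("text", "diagram"):
        strategies.append("convert_to_grayscale")
    # Only text documents get their whitespace margins cropped.
    if content_type == "text":
        strategies.append("crop_whitespace")
    return strategies


print(select_strategies("diagram"))
# ['strip_exif', 'resize_to_tile_limit', 'convert_to_grayscale']
```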

3. Model-aware resize

Each model has documented size thresholds. imgslim resizes to just below the optimal boundary. It never upscales and always preserves aspect ratio.

MODEL_TILE_LIMITS = {
    "claude": {"max_size": 1568, "optimal_long_edge": 1092},
    "gpt-4o": {"max_size": 2048, "optimal_long_edge": 1024},
    "gemini": {"max_size": 3072, "optimal_long_edge": 1536},
}
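A minimal sketch of how a target size could be derived from this table, assuming the long edge is capped at `optimal_long_edge`. The function name `target_size` is hypothetical, and the sketch lands at or below the boundary rather than reproducing imgslim's exact rounding.

```python
MODEL_TILE_LIMITS = {
    "claude": {"max_size": 1568, "optimal_long_edge": 1092},
    "gpt-4o": {"max_size": 2048, "optimal_long_edge": 1024},
    "gemini": {"max_size": 3072, "optimal_long_edge": 1536},
}


def target_size(width: int, height: int, model: str = "claude") -> tuple[int, int]:
    """Scale (width, height) so the long edge fits the model's optimal limit."""
    limit = MODEL_TILE_LIMITS[model]["optimal_long_edge"]
    long_edge = max(width, height)
    # Never upscale: images already within the limit are returned unchanged.
    if long_edge <= limit:
        return width, height
    # One scale factor for both edges preserves the aspect ratio.
    scale = limit / long_edge
    return max(1, round(width * scale)), max(1, round(height * scale))


print(target_size(3000, 2000, "claude"))  # (1092, 728)
print(target_size(800, 600, "claude"))    # (800, 600)
```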

Usage

from imgslim import VisionLite

# Default: optimized for Claude
v = VisionLite(model="claude")
result = v.optimize("invoice.jpg")

print(result["savings_pct"])      # 81.2
print(result["strategy_used"])    # ['strip_exif', 'resize', 'grayscale', 'crop']
print(result["output_path"])      # /tmp/imgslim_output/invoice_optimized.jpg

# Then pass output_path to your API call as usual
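For example, many vision APIs accept images as base64 inside the request body. The sketch below builds an image content block in the shape used by Anthropic's Messages API; the helper name `image_block` is hypothetical, no request is sent, and other providers expect a different payload.

```python
import base64
import mimetypes
from pathlib import Path


def image_block(path: str) -> dict:
    """Build a base64 image content block from a file on disk."""
    # Guess the media type from the file extension, e.g. image/png.
    media_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    data = base64.standard_b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```

Calling `image_block(result["output_path"])` would then give a block that can be placed in the `content` list of a user message.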

Supported models: claude, gpt-4o, gemini


Install

pip install imgslim

Dependencies: Pillow, opencv-python-headless, piexif

Python 3.10+


What imgslim does NOT do

  • No quality assessment (does not verify the model still "understands" the image)
  • No batch processing yet
  • No async support yet
  • Does not tune JPEG quality dynamically

These are on the roadmap. PRs welcome.


License

MIT

