Reduce vision API costs by 60-80% with algorithmic image optimization
imgslim
Reduce your vision API costs by 60–80% before the image ever leaves your machine.
No AI. No external calls. Pure algorithmic compression tuned for how LLMs actually tokenize images.
pip install imgslim
from imgslim import VisionLite
result = VisionLite(model="claude").optimize("screenshot.png")
# {'output_path': '/tmp/imgslim_output/screenshot_optimized.png',
# 'savings_pct': 74.3,
# 'strategy_used': ['strip_exif', 'resize', 'grayscale']}
Why this exists
Vision APIs charge by image tokens, and the token count grows with pixel dimensions. Claude, GPT-4o, and Gemini each derive it from image size (fixed-size tiles or pixel area, depending on the provider): the larger the image, the higher the bill.
Most images sent to LLMs are far larger than the model needs in order to understand them. From the model's perspective, a 3000×2000 photo carries essentially the same semantic information as the same photo at 1092×728.
imgslim uses each model's documented size thresholds to resize images to the optimal resolution. It also strips EXIF metadata, removes whitespace margins, and converts to grayscale when color adds nothing.
Benchmark
Tested on 5 synthetic content types × 3 models. All transformations are algorithmic — no AI used.
| Model | Sample | Input KB | Output KB | Saving % | Strategies |
|---|---|---|---|---|---|
| claude | diagram | 95 KB | 38 KB | 59.4% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| claude | mixed | 678 KB | 225 KB | 66.8% | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| claude | photo | 6877 KB | 226 KB | 96.7% | strip_exif, resize_to_tile_limit |
| claude | screenshot | 98 KB | 21 KB | 78.4% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| claude | text_doc | 934 KB | 161 KB | 82.8% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | diagram | 95 KB | 81 KB | 14.9% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | mixed | 678 KB | 300 KB | 55.8% | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| gemini | photo | 6877 KB | 563 KB | 91.8% | strip_exif, resize_to_tile_limit |
| gemini | screenshot | 98 KB | 35 KB | 64.1% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gemini | text_doc | 934 KB | 306 KB | 67.3% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | diagram | 95 KB | 36 KB | 61.9% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | mixed | 678 KB | 194 KB | 71.4% | strip_exif, resize_to_tile_limit, convert_to_grayscale, crop_whitespace |
| gpt-4o | photo | 6877 KB | 190 KB | 97.2% | strip_exif, resize_to_tile_limit |
| gpt-4o | screenshot | 98 KB | 18 KB | 81.8% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
| gpt-4o | text_doc | 934 KB | 138 KB | 85.2% | strip_exif, resize_to_tile_limit, convert_to_grayscale |
Average file-size saving: 71.7% across all content types and models.
Benchmark script included:
python benchmark/run_benchmark.py
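The average can be reproduced from the Saving % column. The sketch below assumes each row's figure is the relative file-size reduction computed from exact byte counts (the KB values in the table are rounded, so recomputing from them gives slightly different numbers). `savings_pct` is an illustrative helper, not necessarily the function imgslim uses internally.

```python
from statistics import mean


def savings_pct(input_bytes: int, output_bytes: int) -> float:
    """Relative file-size reduction in percent (assumed definition)."""
    if input_bytes <= 0:
        raise ValueError("input_bytes must be positive")
    return round((1 - output_bytes / input_bytes) * 100, 1)


# Saving % column copied from the benchmark table, in row order.
TABLE_SAVINGS = [
    59.4, 66.8, 96.7, 78.4, 82.8,  # claude
    14.9, 55.8, 91.8, 64.1, 67.3,  # gemini
    61.9, 71.4, 97.2, 81.8, 85.2,  # gpt-4o
]

average = round(mean(TABLE_SAVINGS), 1)
print(average)  # 71.7
```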
How it works
1. Content detection (no AI)
OpenCV heuristics classify the image as photo, text, or diagram
based on edge density, contour count, and line detection.
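To make the first signal concrete, here is a minimal pure-Python sketch of an edge-density measure. It is not imgslim's code: a simple neighbour-difference test stands in for OpenCV's edge detector, and the function name and `threshold` value are assumptions chosen for illustration.

```python
def edge_density(gray: list[list[int]], threshold: int = 40) -> float:
    """Fraction of pixels with a strong intensity jump to the right or lower neighbour.

    `gray` is a row-major grid of 0-255 intensities. The threshold is a
    hypothetical value, not one taken from imgslim.
    """
    height, width = len(gray), len(gray[0])
    if height < 2 or width < 2:
        return 0.0
    strong = 0
    for y in range(height - 1):
        for x in range(width - 1):
            dx = abs(gray[y][x + 1] - gray[y][x])
            dy = abs(gray[y + 1][x] - gray[y][x])
            if max(dx, dy) > threshold:
                strong += 1
    return strong / ((height - 1) * (width - 1))


flat = [[128] * 8 for _ in range(8)]  # uniform grey, no edges
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]  # all edges
print(edge_density(flat), edge_density(checker))  # 0.0 1.0
```

In a real pipeline this number would be combined with the contour count and detected line segments. The cut-offs that separate photo, text, and diagram are not given here.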
2. Strategy selection
All images → strip_exif + resize_to_tile_limit
text/diagram → + convert_to_grayscale
text → + crop_whitespace
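The selection rules above can be written as a small function. This is a sketch of the rules as listed, using the strategy names from the benchmark table; `select_strategies` is a hypothetical name, not part of imgslim's public API.

```python
BASE_STRATEGIES = ["strip_exif", "resize_to_tile_limit"]
CONTENT_TYPES = {"photo", "text", "diagram"}


def select_strategies(content_type: str) -> list[str]:
    """Return the ordered strategy list for a detected content type."""
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"unknown content type: {content_type!r}")
    strategies = list(BASE_STRATEGIES)  # applied to every image
    if content_type in {"text", "diagram"}:
        strategies.append("convert_to_grayscale")
    if content_type == "text":
        strategies.append("crop_whitespace")
    return strategies


for kind in ("photo", "diagram", "text"):
    print(kind, select_strategies(kind))
```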
3. Model-aware resize
Each model has documented tile thresholds. imgslim resizes to just below the optimal boundary — never upscales, always preserves aspect ratio.
MODEL_TILE_LIMITS = {
    "claude": {"max_size": 1568, "optimal_long_edge": 1092},
    "gpt-4o": {"max_size": 2048, "optimal_long_edge": 1024},
    "gemini": {"max_size": 3072, "optimal_long_edge": 1536},
}
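As a rough illustration of the resize rule, the sketch below computes target dimensions from the table above. It assumes `optimal_long_edge` is the cap on the longer side and scales to exactly that value (imgslim itself may land slightly below it); `target_size` is a hypothetical helper name.

```python
MODEL_TILE_LIMITS = {
    "claude": {"max_size": 1568, "optimal_long_edge": 1092},
    "gpt-4o": {"max_size": 2048, "optimal_long_edge": 1024},
    "gemini": {"max_size": 3072, "optimal_long_edge": 1536},
}


def target_size(width: int, height: int, model: str) -> tuple[int, int]:
    """Scale (width, height) so the long edge fits the model's optimal limit."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    limit = MODEL_TILE_LIMITS[model]["optimal_long_edge"]
    long_edge = max(width, height)
    if long_edge <= limit:
        return width, height  # never upscale
    # Same scale factor on both axes preserves the aspect ratio.
    new_width = max(1, round(width * limit / long_edge))
    new_height = max(1, round(height * limit / long_edge))
    return new_width, new_height


print(target_size(3000, 2000, "claude"))  # (1092, 728)
print(target_size(800, 600, "claude"))    # (800, 600)
```

The first call reproduces the 3000×2000 to 1092×728 example from the introduction.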
Usage
from imgslim import VisionLite
# Default: optimized for Claude
v = VisionLite(model="claude")
result = v.optimize("invoice.jpg")
print(result["savings_pct"]) # 81.2
print(result["strategy_used"]) # ['strip_exif', 'resize', 'grayscale', 'crop']
print(result["output_path"]) # /tmp/imgslim_output/invoice_optimized.jpg
# Then pass output_path to your API call as usual
Supported models: claude, gpt-4o, gemini
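For the last step, passing `output_path` to an API call, one possible shape is sketched below using only the standard library. It assumes the base64 image content block used by Anthropic's Messages API; other providers expect different payloads. `to_image_block` is a hypothetical helper, not part of imgslim.

```python
import base64
import mimetypes
from pathlib import Path


def to_image_block(path: str) -> dict:
    """Read an image file and wrap it as a base64 image content block."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None or not media_type.startswith("image/"):
        raise ValueError(f"cannot infer an image media type from {path!r}")
    data = base64.standard_b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


# Typical use after optimizing (requires imgslim and a real image):
#   block = to_image_block(result["output_path"])
#   content = [block, {"type": "text", "text": "Describe this image."}]
```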
Install
pip install imgslim
Dependencies: Pillow, opencv-python-headless, piexif
Python 3.10+
What imgslim does NOT do
- No quality assessment (does not verify the model still "understands" the image)
- No batch processing yet
- No async support yet
- Does not tune JPEG quality dynamically
These are on the roadmap. PRs welcome.
License
MIT