Smart image downsampling for image classification datasets
smartdownsample
Intelligent image downsampling with enhanced visual similarity detection
SmartDownsample uses multi-dimensional visual features to sample maximally diverse images from large collections. Optimized for camera trap data with center-focused animal detection, color/brightness separation, and natural folder ordering.
Installation
pip install smartdownsample
Features
- 🎯 Multi-dimensional visual similarity - Separates by structure, color, and brightness
- 🐾 Camera trap optimized - Center-focused detection for animals
- 🎨 Color-aware bucketing - Separates grayscale from color images
- 💡 Brightness distinction - Groups dark vs bright scenes
- 📊 Manageable grouping - Creates 16-32 meaningful buckets (not hundreds)
- ⚡ Still fast - Enhanced features with minimal speed impact
- 🎲 Reproducible - Set seed for consistent results
- 📈 Built-in visualization - Thumbnail grids and distribution charts
Usage
from smartdownsample import sample_diverse
# Basic usage - intelligent visual diversity
selected = sample_diverse(
image_paths=my_image_list,
target_count=1000
)
# Full feature usage with visualization
selected = sample_diverse(
image_paths=my_camera_trap_images,
target_count=1000,
hash_size=8, # Perceptual hash size (8 recommended)
n_workers=4, # Parallel workers
show_progress=True, # Progress bars
random_seed=42, # Reproducible results
show_summary=True, # Text statistics
show_distribution=True, # Bucket distribution chart
show_thumbnails=True # 10x10 thumbnail grids per bucket
)
print(f"Selected {len(selected)} images")
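The usage examples assume a ready-made list of image paths. A minimal sketch of one way to build that list with the standard library is shown here; the helper name `collect_image_paths` and the extension set are illustrative assumptions, not part of smartdownsample.

```python
from pathlib import Path

# Assumed set of extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def collect_image_paths(root):
    """Recursively list image files under `root`, sorted for a stable order.

    Hypothetical helper for illustration, not a smartdownsample function.
    """
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

The resulting list can then be passed as `image_paths` to `sample_diverse`.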
Visualization Options
The algorithm includes three built-in visualization modes to understand bucket quality:
# 1. Text summary (show_summary=True) - Default
selected = sample_diverse(paths, target_count=1000, show_summary=True)
# Prints: bucket sizes, distribution stats, diversity metrics
# 2. Distribution chart (show_distribution=True)
selected = sample_diverse(paths, target_count=1000, show_distribution=True)
# Shows: vertical bar chart of kept vs excluded per bucket
# 3. Thumbnail grids (show_thumbnails=True)
selected = sample_diverse(paths, target_count=1000, show_thumbnails=True)
# Shows: 10x10 grids of first 100 images from each bucket in square layout
# All visualizations together
selected = sample_diverse(paths, target_count=1000,
show_summary=True,
show_distribution=True,
show_thumbnails=True)
How It Works
Multi-dimensional visual similarity algorithm optimized for camera trap data:
1. Multi-Feature Extraction
Each image is analyzed using 4 complementary visual features:
# For each image, compute:
1. DHash (8x8) → Structural patterns, edges, shapes
2. AHash (4x4) → Brightness distribution, contrast
3. Color Variance → Separates grayscale from colorful images
4. Overall Brightness → Separates dark from bright scenes
2. Center-Focused Animal Detection
For camera trap data where animals are typically centered:
# From 8x8 DHash (64 bits), strategically sample center positions:
center_indices = [27, 36] # Center-left and center-right positions
# Bit 27: Detects vertical edges (animal body/legs)
# Bit 36: Detects horizontal edges (animal head/back)
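To see where bits 27 and 36 sit in the 8x8 grid, the sketch below maps a bit index to a (row, column) cell and pulls the two bits out of a 64-bit hash. It assumes the hash bits are stored in row-major order, which the text above does not state; the function names are illustrative.

```python
HASH_SIZE = 8
CENTER_INDICES = (27, 36)  # indices quoted in the step above


def bit_position(index, hash_size=HASH_SIZE):
    """Return (row, col) of a bit, assuming row-major bit order (an assumption)."""
    return divmod(index, hash_size)


def center_bits(hash_bits, indices=CENTER_INDICES):
    """Extract the chosen bits from a flat sequence of 0/1 hash bits."""
    return tuple(int(hash_bits[i]) for i in indices)
```

Under the row-major assumption, `bit_position(27)` is `(3, 3)` and `bit_position(36)` is `(4, 4)`, the two cells on the diagonal of the central 2x2 block.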
3. Smart Bucket Key Creation
Combine features into meaningful visual groups (max 32 buckets):
bucket_key = (
structure_bit_27, # Center-left animal features (0 or 1)
structure_bit_36, # Center-right animal features (0 or 1)
brightness_pattern, # AHash brightness pattern (0 or 1)
color_type, # Grayscale=0, Color=1
brightness_level # Dark=0, Bright=1
)
# Results in 2×2×2×2×2 = 32 maximum buckets
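A rough sketch of how such a key could be assembled and used to group images follows. It takes the 1-bit AHash brightness pattern as given, because the text does not say how the 16-bit AHash is reduced to a single bit; the thresholds are the ones listed under Technical Details, and the function names are illustrative rather than the library's own.

```python
from collections import defaultdict

COLOR_VARIANCE_THRESHOLD = 100  # variance >= 100 counts as color
BRIGHTNESS_THRESHOLD = 128      # mean >= 128 counts as bright


def make_bucket_key(dhash_bits, brightness_pattern, color_variance, brightness):
    """Build a 5-element key; each element is 0 or 1, so at most 32 keys exist."""
    return (
        int(dhash_bits[27]),
        int(dhash_bits[36]),
        int(brightness_pattern),  # assumed to be precomputed as 0 or 1
        int(color_variance >= COLOR_VARIANCE_THRESHOLD),
        int(brightness >= BRIGHTNESS_THRESHOLD),
    )


def group_by_bucket(features):
    """Group paths by key. `features` maps path -> argument tuple for make_bucket_key."""
    buckets = defaultdict(list)
    for path, feature_tuple in features.items():
        buckets[make_bucket_key(*feature_tuple)].append(path)
    return dict(buckets)
```

Because grouping is a single pass with a dictionary lookup per image, it needs no pairwise comparisons between images.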
4. Diversity-Preserving Selection
# Phase 1: Ensure diversity - sample from every bucket
# Phase 2: Fill remaining quota proportionally from largest buckets
# Within buckets: Natural sort preserves camera/folder structure
Example output buckets for camera trap data:
• Bucket 1: Dark grayscale deer (vertical edges)
• Bucket 2: Bright color birds (horizontal patterns)
• Bucket 3: Grayscale empty frames (low structure)
• Bucket 4: Color daytime mammals (mixed patterns)
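The two-phase selection in this step can be sketched as a quota allocation followed by a pick inside each bucket. This is a simplified stand-in, not the library's code: phase 2 greedily gives each extra slot to the bucket with the most unselected images (an approximation of "proportional"), picks inside a bucket are evenly spaced over the sorted list (an assumption), and plain `sorted` stands in for natural sorting.

```python
def allocate_quotas(bucket_sizes, target_count):
    """Decide how many images to take from each bucket."""
    quotas = {key: 0 for key in bucket_sizes}
    remaining = max(0, min(target_count, sum(bucket_sizes.values())))
    # Phase 1: one image from every non-empty bucket, largest buckets first.
    for key in sorted(bucket_sizes, key=bucket_sizes.get, reverse=True):
        if remaining <= 0:
            break
        if bucket_sizes[key] > 0:
            quotas[key] = 1
            remaining -= 1
    # Phase 2: give each extra slot to the bucket with the most images left.
    while remaining > 0:
        key = max(bucket_sizes, key=lambda k: bucket_sizes[k] - quotas[k])
        quotas[key] += 1
        remaining -= 1
    return quotas


def pick_evenly(sorted_items, count):
    """Take `count` evenly spaced items from an already sorted list."""
    if count <= 0:
        return []
    step = len(sorted_items) / count
    return [sorted_items[int(i * step)] for i in range(count)]


def select_diverse(buckets, target_count):
    """Select up to `target_count` items while covering every bucket first."""
    sizes = {key: len(items) for key, items in buckets.items()}
    quotas = allocate_quotas(sizes, target_count)
    selected = []
    for key, items in buckets.items():
        selected.extend(pick_evenly(sorted(items), quotas[key]))
    return selected
```

With buckets of sizes 6, 1, and 2 and a target of 4, every bucket contributes one image and the largest bucket receives the one remaining slot.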
Algorithm Benefits
Visual Similarity Improvements
- Better separation: Color vs grayscale images grouped separately
- Animal-focused: Center-positioned features detect different animal poses/species
- Brightness aware: Day vs night scenes properly distinguished
- Structure sensitive: Different animal orientations and camera angles detected
- Manageable buckets: 16-32 meaningful groups instead of random mixing
Performance Characteristics
- Still fast: Multi-feature extraction adds minimal overhead (~20% slower)
- Linear scaling: O(n) complexity maintained across all features
- Memory efficient: Features computed on-the-fly, not stored
- Parallel processing: Hash computation parallelized across workers
- Smart bucket counts: Never creates excessive micro-buckets
Camera Trap Optimizations
- Natural sorting: Preserves camera/folder structure (CAM01_IMG_001.jpg → CAM01_IMG_010.jpg)
- Center detection: Focus on image center where animals appear
- Scene variety: Separates empty frames, single animals, multiple animals
- Lighting diversity: Day/night scenes properly represented
- Color preservation: IR grayscale vs color daylight images distinguished
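Natural sorting means that numeric parts of a file name compare as numbers, so IMG_2 comes before IMG_10. A common way to get this in Python is a sort key that splits out digit runs; the sketch below is a generic version of that idiom, not necessarily how smartdownsample implements it.

```python
import re


def natural_key(path):
    """Sort key that compares digit runs numerically and other text case-insensitively."""
    parts = re.split(r"(\d+)", str(path))
    # With a capturing group, odd positions always hold the digit runs.
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]


names = ["CAM01_IMG_10.jpg", "CAM01_IMG_2.jpg", "CAM01_IMG_1.jpg"]
ordered = sorted(names, key=natural_key)
```

A plain string sort would place `CAM01_IMG_10.jpg` before `CAM01_IMG_2.jpg`; with `natural_key` the order follows the image numbers.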
Comparison with Other Methods
| Method | Bucket Quality | Speed | Animal Detection | Color Separation |
|---|---|---|---|---|
| Random sampling | None | Fastest | No | No |
| Single DHash | Poor mixing | Fast | No | No |
| smartdownsample v1.6+ | Excellent | Fast+ | Yes | Yes |
| Complex ML clustering | Perfect | Very Slow | Depends | Yes |
Real Results: Camera Trap Dataset
Before (v1.5): 495 buckets, color/grayscale mixed randomly in each bucket
After (v1.6+): 32 buckets, clear separation:
- Bucket 1: Grayscale deer images (IR night camera)
- Bucket 2: Color bird images (daylight camera)
- Bucket 3: Dark empty frames (nighttime)
- Bucket 4: Bright color mammals (sunny daytime)
Performance: about 20% slower than the single-hash method, with much clearer visual grouping.
Performance
| Task | Time |
|---|---|
| 100 from 1,000 | <5 sec |
| 900 from 1,000 | <5 sec |
| 1,000 from 24,000 | ~30 sec |
| 23,000 from 24,000 | ~30 sec |
| Any ratio | Fast ✓ |
Parameters
| Parameter | Default | Description |
|---|---|---|
| image_paths | Required | List of image file paths (str or Path objects) |
| target_count | Required | Exact number of images to select |
| hash_size | 8 | Perceptual hash size - 8 recommended for good speed/quality balance |
| n_workers | 4 | Number of parallel workers for hash computation |
| show_progress | True | Display progress bars during processing |
| random_seed | 42 | Random seed for reproducible bucket selection |
| show_summary | True | Print bucket statistics and distribution summary |
| show_distribution | False | Show bucket distribution bar chart (requires matplotlib) |
| show_thumbnails | False | Show 10x10 thumbnail grids for each bucket (requires matplotlib) |
Parameter Recommendations
For camera trap data:
- hash_size=8: Optimal balance of speed and animal detection quality
- show_thumbnails=True: Essential for validating bucket quality
- show_summary=True: Understand bucket size distribution
For other image types:
- hash_size=6: Faster processing, may reduce center-detection accuracy
- hash_size=10: Slower but more detailed structural analysis
Technical Details
Hash Features Explained
DHash (Difference Hash)
- Detects structural patterns, edges, object boundaries
- 8×8 hash = 64 bits representing horizontal gradients
- Center bits (positions 27, 36) focus on animal detection
- Fast computation: resize → grayscale → compare adjacent pixels
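The "compare adjacent pixels" step can be illustrated without any imaging library. The sketch below assumes the image has already been converted to grayscale and resized to 8 rows by 9 columns (one extra column so that each row yields 8 comparisons), and it sets a bit when the right-hand pixel is brighter; the comparison direction is a convention that varies between implementations.

```python
def dhash_bits(gray, hash_size=8):
    """Compute difference-hash bits from a grayscale grid.

    `gray` is a list of `hash_size` rows, each with `hash_size + 1` pixel values.
    Returns a flat, row-major list of hash_size * hash_size bits.
    """
    if len(gray) != hash_size or any(len(row) != hash_size + 1 for row in gray):
        raise ValueError("expected a grid of hash_size rows by hash_size + 1 columns")
    return [
        int(row[col + 1] > row[col])  # 1 if brightness increases left to right
        for row in gray
        for col in range(hash_size)
    ]
```

In practice the resize and grayscale conversion would come from an imaging library such as Pillow; only the comparison logic is shown here.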
AHash (Average Hash)
- Detects brightness patterns and contrast distribution
- 4×4 hash = 16 bits representing above/below average brightness
- Used for distinguishing lighting conditions
- Complements DHash with tonal information
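As a companion illustration, an average hash over a small grayscale grid can be sketched in a few lines. The input is assumed to be already resized to 4x4, and the strict "greater than the mean" comparison is an assumption; some implementations use "greater than or equal".

```python
def ahash_bits(gray):
    """Compute average-hash bits: 1 where a pixel is brighter than the grid mean."""
    pixels = [value for row in gray for value in row]
    mean = sum(pixels) / len(pixels)
    return [int(value > mean) for value in pixels]
```

For a 4x4 grid whose top half is black and bottom half is white, the result is eight 0 bits followed by eight 1 bits.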
Color Variance
- Separates grayscale (IR cameras) from color (daylight cameras)
- Computed as variance of RGB channel means
- Threshold: variance < 100 = grayscale, ≥ 100 = color
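One reading of "variance of RGB channel means" is sketched below: average each channel over the image, then take the variance of those three averages. Using the population variance (rather than the sample variance) is an assumption, and the function names are illustrative.

```python
from statistics import pvariance


def color_variance(pixels):
    """Variance of the three per-channel means of a sequence of (r, g, b) pixels."""
    pixels = list(pixels)
    count = len(pixels)
    channel_means = [sum(p[c] for p in pixels) / count for c in range(3)]
    return pvariance(channel_means)


def is_color(pixels, threshold=100):
    """Apply the threshold from the list above: variance >= 100 counts as color."""
    return color_variance(pixels) >= threshold
```

An infrared frame has nearly identical channel means, so its variance is close to 0. A pure red-tinted pixel such as (200, 50, 50) gives channel means of 200, 50, and 50 and a variance of 5000.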
Overall Brightness
- Separates dark (nighttime) from bright (daytime) scenes
- Computed as mean pixel value across all channels
- Threshold: brightness < 128 = dark, ≥ 128 = bright
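The brightness rule amounts to a single mean and a comparison. A minimal sketch, assuming 8-bit (r, g, b) pixel tuples and illustrative function names:

```python
def mean_brightness(pixels):
    """Mean of all channel values across a sequence of (r, g, b) pixels."""
    values = [channel for pixel in pixels for channel in pixel]
    return sum(values) / len(values)


def brightness_level(pixels, threshold=128):
    """Return 0 for dark (mean < threshold) and 1 for bright (mean >= threshold)."""
    return int(mean_brightness(pixels) >= threshold)
```

A fixed cut-off of 128 splits the 0 to 255 range in half, so a frame with equal parts pure black and pure white (mean 127.5) falls just on the dark side.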
Performance Notes
- Multi-feature extraction adds ~20% processing time vs single hash
- Parallel hash computation scales linearly with worker count (up to CPU cores)
- Memory overhead per image is O(1) - features are computed on demand rather than stored
- Bucket creation is O(n) - no expensive similarity comparisons
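Per-image feature extraction is independent work, which is why it parallelizes well. The sketch below shows the general pattern with a thread pool from the standard library; whether smartdownsample uses threads or processes internally is not stated here, and `compute_features_parallel` and `feature_fn` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_features_parallel(paths, feature_fn, n_workers=4):
    """Apply `feature_fn` to every path using a pool of worker threads.

    Results come back in the same order as `paths`, so they can be zipped
    with the inputs afterwards.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(feature_fn, paths))
```

For CPU-bound feature functions written in pure Python, a process pool would usually be the better fit; the thread pool keeps this example simple.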
License
MIT License – see LICENSE file.