Scan macOS Photos library, detect and identify birds, write species captions
preen
Groom your photo library — automatically find and name every bird.
Preen scans your macOS Photos library, detects birds with YOLO, identifies species with SuperPicky's OSEA classifier (10,964 species), and writes bilingual keywords and captions.
Features
- Scans entire Photos library including iCloud photos
- YOLO multi-bird detection — finds all birds in a photo
- OSEA species identification with GPS-based eBird regional filtering
- Keywords: 白鹭 (Little Egret) + pinyin bailu per species
- Captions: 白鹭, 苍鹭 (Little Egret, Grey Heron)
- Parallel iCloud downloads via PhotoKit (no Photos.app dependency for reads)
- SQLite checkpoint — pause/resume, incremental or full rescan
- Auto-retries failed iCloud exports on next run
- Supports JPEG, HEIC, JXL, AVIF, and RAW formats (ARW, CR2, CR3, NEF, DNG, RAF)
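The checkpoint behaviour listed above (pause/resume, incremental versus full rescan, retrying failed exports) can be pictured with a small sketch. The table name, column names, and status values are hypothetical, chosen only for illustration; preen's actual schema is not described in this README.

```python
import sqlite3

def open_checkpoint(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a checkpoint database. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed ("
        "photo_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn

def record(conn: sqlite3.Connection, photo_id: str, status: str) -> None:
    """Store the latest status for a photo, overwriting any earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO processed (photo_id, status) VALUES (?, ?)",
        (photo_id, status),
    )
    conn.commit()

def pending(conn: sqlite3.Connection, library_ids: list[str], full: bool = False) -> list[str]:
    """Return the photos to process on this run.

    Incremental mode skips photos already marked 'done', so photos that were
    never seen or whose export failed are picked up again. Full mode returns
    everything.
    """
    if full:
        return list(library_ids)
    done = {
        row[0]
        for row in conn.execute("SELECT photo_id FROM processed WHERE status = 'done'")
    }
    return [photo_id for photo_id in library_ids if photo_id not in done]

conn = open_checkpoint()
record(conn, "A", "done")
record(conn, "B", "export_failed")
print(pending(conn, ["A", "B", "C"]))             # incremental run
print(pending(conn, ["A", "B", "C"], full=True))  # full rescan
```

Because every result is committed as it is recorded, an interrupted run in this sketch can resume by simply calling `pending` again.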
Requirements
- macOS with Photos.app
- Python 3.11+
Installation
```sh
pipx install birdpreen
```
Or with pip:
```sh
pip install birdpreen
```
On first scan, model files (~260 MB) are automatically downloaded from HuggingFace.
Usage
```sh
# Scan new photos (incremental)
preen scan
# Full library rescan
preen scan --full
# Dry run: detect and identify without writing
preen scan --dry-run
# Custom confidence threshold (default: 70%)
preen scan --threshold 65
# Process in batches
preen scan --batch-size 500
# Adjust parallel iCloud downloads (default: 16)
preen scan --workers 32
# Regional filter for photos without GPS (uses AVONET distribution data)
preen scan --country CN
# Regional filter using eBird species lists (more curated, supports subnational)
preen scan --region US-CA
# Scan a Shanghai photo library with English-only captions
preen scan --region CN-31 --caption-format "{en}" --keyword-format "{en} {latin}"
# Scan UK photos with Latin names, slash-separated
preen scan --country GB --caption-format "{en} ({latin})" --caption-separator " / "
# Stricter threshold when no regional filter matches (default: 90%)
preen scan --global-threshold 95
# Rescan unidentified birds with a lower global threshold
# Only re-processes photos where birds were detected but not identified
preen scan --threshold 60 --global-threshold 85
# Check progress
preen status
# Reset checkpoint (auto-creates backup)
preen reset
# Restore checkpoint from latest backup
preen restore
```
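How the two thresholds interact can be shown with a minimal sketch. It assumes `--threshold` applies when a regional species list matched and `--global-threshold` applies otherwise, which is one reading of the option descriptions above. The function name and the inclusive comparison are illustrative and not taken from preen's code.

```python
def accept_identification(
    confidence: float,
    regional_match: bool,
    threshold: float = 70.0,
    global_threshold: float = 90.0,
) -> bool:
    """Decide whether a species prediction is confident enough to keep.

    confidence is a percentage (0-100). Defaults mirror the documented
    defaults of --threshold and --global-threshold.
    """
    # Assumption: without a regional filter the candidate pool is far larger,
    # so the stricter global threshold is required.
    required = threshold if regional_match else global_threshold
    return confidence >= required

print(accept_identification(75.0, regional_match=True))   # True
print(accept_identification(75.0, regional_match=False))  # False
print(accept_identification(87.0, regional_match=False, global_threshold=85.0))  # True
```

Under this reading, lowering `--global-threshold` on a later run lets previously rejected predictions pass, which matches the rescan example above.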
Regional species filtering
When photos have GPS, preen automatically filters species using a two-level fallback: AVONET 1×1° grid → country-level eBird list → global. For photos without GPS, use --country or --region:
- --country XX: queries AVONET distribution data using a bounding box. Covers 49 countries, including CH, HK, TW, and NP, where eBird lists aren't bundled.
- --region XX or --region XX-YY: uses curated offline eBird species lists. Supports 51 countries plus 92 subnational regions (all US states, CN provinces, AU states). Falls back to the country-level list if the subnational file is not found.
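The fallback order described above can be sketched as two small functions. Everything here is a hypothetical stand-in: the function names, the toy lookups, and their species sets are invented for illustration, and the real AVONET and eBird data access in preen is not shown in this README.

```python
from typing import Callable, Optional, Sequence

# A lookup takes (lat, lon) and returns a species set, or None if it has no data.
Lookup = Callable[[float, float], Optional[set[str]]]

def species_for_gps(lat: float, lon: float, lookups: Sequence[Lookup]) -> Optional[set[str]]:
    """Try each source in order; None means 'global' (no regional filter)."""
    for lookup in lookups:
        species = lookup(lat, lon)
        if species:
            return species
    return None

def resolve_region(region: str, available: set[str]) -> Optional[str]:
    """Pick the list for XX-YY, falling back to XX if YY is not available."""
    if region in available:
        return region
    country = region.split("-", 1)[0]
    return country if country in available else None

# Toy stand-ins for the AVONET 1x1 degree grid and a country-level eBird list.
def grid_lookup(lat: float, lon: float) -> Optional[set[str]]:
    if (int(lat), int(lon)) == (31, 121):
        return {"Little Egret", "Grey Heron"}
    return None

def country_lookup(lat: float, lon: float) -> Optional[set[str]]:
    if 18 <= lat <= 54 and 73 <= lon <= 135:  # rough illustrative bounding box
        return {"Little Egret", "Grey Heron", "Eurasian Tree Sparrow"}
    return None

sources = [grid_lookup, country_lookup]
print(species_for_gps(31.2, 121.5, sources))   # grid cell has data
print(species_for_gps(39.9, 116.4, sources))   # falls back to country list
print(species_for_gps(0.0, 0.0, sources))      # None: global, no filter
print(resolve_region("CN-31", {"CN", "US", "US-CA"}))  # CN
```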
Caption and keyword format
The --caption-format flag controls how each species appears in the photo description. The --keyword-format flag controls which fields become individual keywords (space-delimited). Available placeholders:
| Placeholder | Example |
|---|---|
| {cn} | 白鹭 |
| {cn_trad} | 白鷺 |
| {en} | Little Egret |
| {latin} | Egretta garzetta |
| {pinyin} | bailu |
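To make the placeholders concrete, here is a rough sketch of how a format string could expand into a caption and a keyword list. It uses Python's `str.format` as a stand-in, and the function names are hypothetical; whether preen substitutes placeholders this way internally is an assumption. The keyword sketch splits the format string on spaces before substituting, so a multi-word name stays a single keyword.

```python
# Placeholder values for two species, using the fields from the table above.
SPECIES = [
    {"cn": "白鹭", "cn_trad": "白鷺", "en": "Little Egret",
     "latin": "Egretta garzetta", "pinyin": "bailu"},
    {"cn": "苍鹭", "cn_trad": "蒼鷺", "en": "Grey Heron",
     "latin": "Ardea cinerea", "pinyin": "canglu"},
]

def build_caption(species: list[dict], caption_format: str, separator: str) -> str:
    """Format each species, then join the results with the separator."""
    return separator.join(caption_format.format(**s) for s in species)

def build_keywords(species: list[dict], keyword_format: str) -> list[str]:
    """Expand each space-delimited field into one keyword per species."""
    keywords: list[str] = []
    for s in species:
        for field in keyword_format.split():
            keyword = field.format(**s)
            if keyword not in keywords:  # avoid duplicate keywords
                keywords.append(keyword)
    return keywords

print(build_caption(SPECIES, "{en} ({latin})", " / "))
# Little Egret (Egretta garzetta) / Grey Heron (Ardea cinerea)
print(build_keywords(SPECIES, "{en} {latin}"))
# ['Little Egret', 'Egretta garzetta', 'Grey Heron', 'Ardea cinerea']
```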
Tuning --workers
The --workers flag controls how many iCloud photos are downloaded in parallel (default: 16). The scan output shows a queue indicator like q:12/16 — ready/total. "Ready" means downloaded and waiting for the GPU; "total" is the queue size.
- If ready often drops to 0, downloads can't keep up: increase workers
- If the queue is often full (e.g. q:16/16), the GPU is the bottleneck: check for other GPU-intensive processes
- If you have plenty of RAM (each queued image uses ~50-100 MB), 32 workers is safe
- For sequential processing (most reliable), use --workers 1
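The tuning rules above can be turned into a small helper that reads a few `q:ready/total` samples and suggests a direction. This is an illustrative sketch, not part of preen: the function names are hypothetical, and treating "often" as "at least half of the samples" is an arbitrary assumption.

```python
import re

def parse_queue(indicator: str) -> tuple[int, int]:
    """Parse an indicator such as 'q:12/16' into (ready, total)."""
    match = re.fullmatch(r"q:(\d+)/(\d+)", indicator.strip())
    if match is None:
        raise ValueError(f"unrecognised queue indicator: {indicator!r}")
    return int(match.group(1)), int(match.group(2))

def suggest(indicators: list[str], often: float = 0.5) -> str:
    """Suggest a tuning direction from sampled queue indicators."""
    samples = [parse_queue(i) for i in indicators]
    if not samples:
        return "no data"
    empty = sum(1 for ready, _ in samples if ready == 0) / len(samples)
    full = sum(1 for ready, total in samples if ready == total) / len(samples)
    if empty >= often:
        return "downloads can't keep up: increase --workers"
    if full >= often:
        return "GPU is the bottleneck: check other GPU-intensive processes"
    return "balanced"

print(suggest(["q:0/16", "q:0/16", "q:3/16"]))
print(suggest(["q:16/16", "q:16/16", "q:12/16"]))
print(suggest(["q:5/16", "q:9/16"]))
```

If the queue size tracks `--workers`, as the q:12/16 example with the default of 16 suggests, 32 queued images at 50-100 MB each would occupy roughly 1.6-3.2 GB of RAM.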
Credits
- OSEA bird classification model (10,964 species) by Sun Jiao
- Bird identification logic (OSEA classifier, AVONET geographic filtering, eBird species data) extracted from SuperPicky
- YOLO11 segmentation model by Ultralytics
- Photos library access via PhotoKit through PyObjC