Organize Google Takeout photos into YYYY/MM/ folders with dedup and reporting
Degoogle-Photos
Unfuck the mess that Google Takeout makes of your photo library. Takes the dozens of chaotic zip archives, deduplicates, extracts dates, and organizes everything into clean YYYY/MM/ folders with album symlinks and a browsable HTML report.
Why this exists
If you're not paying for the product, you are the product.
Google Photos is free because Google's business model is advertising and data. Their terms of service grant them a worldwide, royalty-free license to use, reproduce, modify, and distribute anything you upload -- including training AI models and creating derivative works. Your "private" album is private from other users, not from Google.
I decided to leave Google Photos for good, but getting out is harder than getting in. Google Takeout -- the only official export tool -- dumps your collection into dozens of numbered zip files with a chaotic structure: albums split across chunks, JSON metadata sidecars with truncated filenames, duplicates scattered everywhere, and no usable date-based organization. For my ~20,000 photos across 46 archives, it was a mess.
The go-to recommendation is the Google Photos Takeout Helper. I tried it. It crashed on a missing geoDataExif field in a JSON sidecar. Moved the problem file out, restarted. Crashed again on the next file. Moved the whole album folder. Crashed on a different folder with the same error. Each crash meant starting from scratch -- no resume support. After several rounds of this whack-a-mole I gave up. With 20,000 files and wildly inconsistent metadata across 46 archives, a tool that dies on the first unexpected field is effectively unusable.
So I built this from scratch with Claude.
Sharing it because leaving Google shouldn't require a computer science degree. If the only thing keeping you on Google Photos is "I don't know how to get my photos out," this is how.
What it does
- Scans multiple Takeout*/Google Photos/ directories and builds a global index
- Extracts the best date for each file (EXIF > JSON photoTakenTime > filename > JSON creationTime > file mtime)
- Deduplicates by MD5 hash + date (rounded to the minute)
- Copies media files into YYYY/MM/ folders, preserving JSON sidecars alongside
- Creates Albums/ folder with relative symlinks for named albums
- Generates a multi-page HTML report with thumbnails, metadata tooltips, and Finder links
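To make the date priority order concrete, here is a minimal sketch of such a cascade. It is not the package's actual dates.py: the function names, the source labels, and the filename pattern are assumptions made for illustration. The only things taken from the list above are the priority order and the two JSON fields, which in Takeout sidecars hold a timestamp in epoch seconds stored as a string.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Priority order as listed above; the label strings themselves are hypothetical.
PRIORITY = ["exif", "json_photo_taken", "filename", "json_creation", "mtime"]

# Assumed pattern: matches names such as IMG_20190704_153012.jpg
# or 2019-07-04 15.30.12.png. Real-world filenames vary more than this.
FILENAME_DATE = re.compile(
    r"(19\d{2}|20\d{2})[-_]?(\d{2})[-_]?(\d{2})[-_ T]?(\d{2})[-_.:]?(\d{2})[-_.:]?(\d{2})"
)


def date_from_filename(name: str) -> Optional[datetime]:
    """Return a datetime parsed from the filename, or None if nothing plausible is found."""
    match = FILENAME_DATE.search(name)
    if not match:
        return None
    try:
        return datetime(*(int(group) for group in match.groups()))
    except ValueError:  # e.g. month 13 or hour 25
        return None


def date_from_sidecar(sidecar: dict, field: str) -> Optional[datetime]:
    """Read sidecar[field]["timestamp"] and return it as a naive UTC datetime."""
    try:
        seconds = int(sidecar[field]["timestamp"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or malformed field: fall through to the next source
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(tzinfo=None)


def best_date(candidates: dict):
    """Return (date, source) for the highest-priority source that produced a date."""
    for source in PRIORITY:
        value = candidates.get(source)
        if value is not None:
            return value, source
    return None, None
```

A caller would fill the candidates dict with whatever each source yields (None when a source has nothing) and let best_date pick the winner. Missing or malformed fields return None instead of raising, so one bad sidecar does not stop the run.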
Getting your photos out of Google
- Go to takeout.google.com
- Click Deselect all, then scroll down and select only Google Photos
- Click Next step
- Choose Export once, file type .zip, and size 2 GB (or 50 GB if you have fast internet and lots of storage)
- Click Create export
- Wait -- Google prepares the archive in the background and emails you when it's ready (can take hours or even days for large collections)
- Download all the zip files and extract them into a single folder
You'll end up with something like Takeout/, Takeout-2/, Takeout-3/, ... each containing a Google Photos/ subfolder. That's your --source directory.
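If you want to check that the extracted layout looks right before running the tool, a short script like the following lists the folders that match the pattern described above. The function name find_photo_roots is made up for this sketch and is not part of the package.

```python
from pathlib import Path


def find_photo_roots(source):
    """Return every Takeout*/Google Photos directory directly under source, sorted."""
    return sorted(
        path for path in Path(source).glob("Takeout*/Google Photos") if path.is_dir()
    )


if __name__ == "__main__":
    for root in find_photo_roots("."):
        print(root)
```

Run it from the folder you plan to pass as --source. If it prints nothing, the archives were probably extracted one level too deep or too shallow.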
Prerequisites
- Python 3.9+
- A Google Takeout export (see above)
Installation
pip install degoogle-photos
That's it. Pillow (for EXIF extraction) is installed automatically.
Alternative: run from source
git clone https://github.com/couzteau/Degoogle-Photos.git
cd Degoogle-Photos
pip install -e .
Usage
# Simplest: cd into the folder with your Takeout dirs and run
cd /path/to/takeouts
degoogle-photos
# Or specify paths explicitly
degoogle-photos --source /path/to/takeouts --output /path/to/organized
# Preview what would happen (no files copied)
degoogle-photos --dry-run
Options
| Flag | Description |
|---|---|
| --source PATH | Root directory containing Takeout*/ folders (default: current directory) |
| --output PATH | Destination for organized photos (default: ./Google Photos Organized) |
| --dry-run | Report what would be done without copying any files |
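For readers who want to script around the tool, the three flags map naturally onto argparse. The sketch below is an assumption about how such a parser could be declared, not a copy of the package's cli.py. The function name build_parser is hypothetical, and only the flag names and defaults come from the table above.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser with the three documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="degoogle-photos",
        description="Organize Google Takeout photos into YYYY/MM/ folders.",
    )
    parser.add_argument(
        "--source", type=Path, default=Path("."), metavar="PATH",
        help="Root directory containing Takeout*/ folders",
    )
    parser.add_argument(
        "--output", type=Path, default=Path("Google Photos Organized"), metavar="PATH",
        help="Destination for organized photos",
    )
    parser.add_argument(
        "--dry-run", action="store_true",
        help="Report what would be done without copying any files",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.source, args.output, args.dry_run)
```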
How it works
- Index -- Scan all Takeout directories, index media files and JSON sidecars by album
- Match -- Link each media file to its JSON sidecar via title field or filename stripping
- Date extraction -- Extract the best date using a priority cascade (EXIF > JSON > filename > mtime)
- Deduplication -- Skip files with identical MD5 + date (within the same minute)
- Copy -- Copy to YYYY/MM/filename with collision resolution (_2, _3, etc.)
- Albums -- Create Albums/<name>/ with relative symlinks to the copied files
- Report -- Generate a browsable HTML report with per-folder and per-album pages
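The deduplication and copy steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in dedup.py or copy.py: the helper names are invented, and "same minute" is implemented here by truncating seconds, which may differ from how the package rounds.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedup_key(path, taken: datetime):
    """Key = (content hash, date truncated to the minute). Equal keys mean duplicate."""
    return md5_of(path), taken.replace(second=0, microsecond=0)


def free_destination(dest: Path) -> Path:
    """Return dest if unused, otherwise dest with _2, _3, ... appended to the stem."""
    if not dest.exists():
        return dest
    counter = 2
    while True:
        candidate = dest.with_name(f"{dest.stem}_{counter}{dest.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1
```

A copy loop would keep a set of keys already seen, skip any file whose dedup_key is in that set, and otherwise copy it to free_destination(output / "YYYY" / "MM" / name). Two different photos that share a filename then end up side by side as name.jpg and name_2.jpg, while true duplicates are dropped.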
HTML Report
The report is written to <output>/report/index.html and includes:
- Dashboard with copy/duplicate/error counts and date-source breakdown
- Per-folder pages with image thumbnails in a responsive grid
- Per-album pages for named albums (generic "Photos from YYYY" albums are excluded)
- Hover tooltips showing EXIF data (camera, ISO, focal length, GPS) and JSON metadata (people, geo, description)
- "Finder" buttons to open the containing folder in macOS Finder
Project structure
degoogle_photos/
    __init__.py        # Package version
    indexing.py        # Takeout directory scanning and JSON sidecar indexing
    dates.py           # Date extraction (EXIF, JSON, filename, mtime)
    metadata.py        # Rich metadata extraction for report tooltips
    dedup.py           # MD5 hashing and deduplication keys
    copy.py            # File copying with collision resolution
    report.py          # Multi-page HTML report generation
    logging_util.py    # Migration logging and progress reporting
    albums.py          # Album symlink creation
    cli.py             # CLI entry point and orchestration
tests/
    conftest.py        # Shared test fixtures
    test_indexing.py
    test_dates.py
    test_metadata.py
    test_dedup.py
    test_copy.py
    test_report.py
    test_albums.py
migrate_photos.py      # Thin wrapper for backward compatibility
pyproject.toml         # Project metadata and dependencies
Running tests
pip install -e ".[dev]"
pytest -v
Where to put your photos after
Once your photos are organized, you have options with better privacy terms:
| Service | Terms summary | Cross-platform |
|---|---|---|
| Apple iCloud | Minimal rights -- just enough to sync and store. No ad business model (you pay for storage via iCloud+). | Apple devices + web (non-Apple users can upload via browser to shared albums) |
| Adobe Lightroom | Rights limited to operating services. No generative AI training on customer content. | Full cross-platform |
| Dropbox / OneDrive | Rights limited to providing the service. No promotional or AI training use. | Full cross-platform |
| Self-hosted (Immich, PhotoPrism) | You retain all rights. Requires technical setup. | Web-based, any device |
| Local storage + backup | Your files, your rights. Use the generated report/index.html to browse and review. Back up to an external drive or NAS. | Any device with file access |
License
MIT