Organize Google Takeout photos into YYYY/MM/ folders with dedup and reporting

Degoogle-Photos

Unfuck the mess that Google Takeout makes of your photo library. Takes the dozens of chaotic zip archives, deduplicates, extracts dates, and organizes everything into clean YYYY/MM/ folders with album symlinks and a browsable HTML report.

Why this exists

If you're not paying for the product, you are the product.

Google Photos is free because their business model is advertising and data. Their terms grant them a worldwide, royalty-free license to use, reproduce, modify, and distribute your uploads -- including AI training. Your "private" album is private from other users, not from Google.

I decided to leave. Google Takeout -- the only official export -- dumps your library into dozens of numbered zips: albums split across chunks, JSON metadata with truncated filenames, duplicates everywhere, no usable organization. For my ~20,000 photos across 46 archives, it was unusable.

The popular Google Photos Takeout Helper crashed repeatedly on missing metadata fields with no resume support. After several rounds of whack-a-mole I gave up.

So I built this with Claude. Sharing it because leaving Google shouldn't require a computer science degree.

Getting your photos out of Google

1. Go to takeout.google.com
2. Click Deselect all, then scroll down and select only Google Photos
3. Click Next step
4. Choose Export once, file type .zip, and size 2 GB (or 50 GB if you have fast internet and lots of storage)
5. Click Create export
6. Wait -- Google prepares the archive in the background and emails you when it's ready (can take hours or even days for large collections)
7. Download all the zip files and extract them into a single folder

You'll end up with something like Takeout/, Takeout-2/, Takeout-3/, ... each containing a Google Photos/ subfolder. That's your --source directory.
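If you want to check that your extracted folder has the layout described above before running the tool, a small sketch like this lists the Google Photos/ subfolders it can see. The function name find_photo_roots is made up for illustration and is not part of the package.

```python
from pathlib import Path


def find_photo_roots(source):
    """Return every 'Google Photos' folder inside a Takeout* dir under source."""
    source = Path(source)
    roots = [
        candidate
        for candidate in source.glob("Takeout*/Google Photos")
        if candidate.is_dir()
    ]
    # Sorted so Takeout, Takeout-2, Takeout-3, ... come out in a stable order
    return sorted(roots)


if __name__ == "__main__":
    for root in find_photo_roots("."):
        print(root)
```

If this prints nothing, the zips were probably extracted somewhere else, or the folder you are in is not the one holding the Takeout dirs.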

What it does

Takeout migration mode (default):

Scans multiple Takeout*/Google Photos/ directories and builds a global index
Extracts the best date for each file (EXIF > JSON photoTakenTime > filename > JSON creationTime > file mtime)
Deduplicates by MD5 hash + date (rounded to the minute)
Copies media files into YYYY/MM/ folders, preserving JSON sidecars alongside
Creates Albums/ folder with relative symlinks for named albums
Generates a multi-page HTML report with thumbnails, metadata tooltips, and Finder links
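The date priority listed above can be pictured as a simple "first source that answers wins" loop. This is a minimal sketch, not the package's actual code: best_date, the helper names, and the filename pattern are assumptions, and the EXIF value is passed in already parsed (the real tool reads it with Pillow). The sidecar layout it reads, a "timestamp" string of epoch seconds under photoTakenTime and creationTime, is how Takeout JSON files store those fields.

```python
import re
from datetime import datetime, timezone

# Assumed filename pattern, e.g. IMG_20190704_153000.jpg or 2019-07-04.jpg
_FILENAME_DATE = re.compile(r"(19\d{2}|20\d{2})[-_]?(\d{2})[-_]?(\d{2})")


def _sidecar_time(sidecar, key):
    """Read sidecar[key]['timestamp'] (epoch seconds as a string), if present."""
    try:
        return datetime.fromtimestamp(int(sidecar[key]["timestamp"]), tz=timezone.utc)
    except (KeyError, TypeError, ValueError):
        return None  # missing sidecar, missing field, or junk value


def _filename_time(filename):
    match = _FILENAME_DATE.search(filename)
    if not match:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return datetime(year, month, day, tzinfo=timezone.utc)
    except ValueError:  # digits that are not a real date, e.g. month 13
        return None


def best_date(exif_time, sidecar, filename, mtime):
    """Return (datetime, source_label) from the first source that yields a date."""
    candidates = (
        ("exif", lambda: exif_time),
        ("json_photoTakenTime", lambda: _sidecar_time(sidecar, "photoTakenTime")),
        ("filename", lambda: _filename_time(filename)),
        ("json_creationTime", lambda: _sidecar_time(sidecar, "creationTime")),
        ("mtime", lambda: datetime.fromtimestamp(mtime, tz=timezone.utc)),
    )
    for label, getter in candidates:
        value = getter()
        if value is not None:
            return value, label
```

Returning the label alongside the date is what would let a report show a date-source breakdown.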

Dedup mode (--dedup-scan):

Scans any folder and its subdirectories for duplicate media files
Copies one unique file per duplicate group into a date-organised YYYY/MM/ structure
Recreates the original folder tree under by-folder/ as symlinks pointing at the date-organised files
The source folder is never modified
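The by-folder/ symlinks are relative, so the output folder can be moved or copied to another disk without breaking them. A rough sketch of how such a link can be made with the standard library (link_into_tree is a hypothetical helper, not the package's API):

```python
import os


def link_into_tree(target, link_path):
    """Create link_path as a relative symlink pointing at target."""
    link_dir = os.path.dirname(link_path)
    os.makedirs(link_dir, exist_ok=True)
    # Express the target relative to the folder that holds the link
    relative_target = os.path.relpath(target, start=link_dir)
    os.symlink(relative_target, link_path)
    return relative_target
```

For a target of output/2019/07/IMG_001.jpg and a link at output/by-folder/vacation 2019/IMG_001.jpg, the stored link text is ../../2019/07/IMG_001.jpg. Note that creating symlinks on Windows may need extra privileges.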

Prerequisites

Python 3.9+
A Google Takeout export (see above)

Installation

Windows:

pip install degoogle-photos

macOS / Linux:

pip3 install degoogle-photos

That's it. Pillow (for EXIF extraction) is installed automatically.

Why pip3? On many macOS and Linux systems the plain pip command still belongs to Python 2.7. If you see "No matching distribution found" or warnings about Python 2.7, that's why. pip3 ensures you're installing for Python 3.

Troubleshooting — macOS: command not found: degoogle-photos? The package installed correctly, but pip placed the executable in a user-local directory that isn't on your PATH by default. Fix it by running:
export PATH="$HOME/Library/Python/3.9/bin:$PATH"
degoogle-photos
Replace 3.9 with your actual Python version (check with python3 --version). To make this permanent so it survives Terminal restarts, add the export line to your ~/.zshrc.
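If you'd rather not guess the version number in that path, Python can tell you where user-level scripts usually land. This sketch assumes macOS or Linux and a pip install that went to the user directory; user_scripts_dir is a made-up name.

```python
import os
import site


def user_scripts_dir():
    """Likely location of user-installed console scripts on macOS / Linux."""
    # site.getuserbase() is e.g. ~/Library/Python/3.9 or ~/.local
    return os.path.join(site.getuserbase(), "bin")


if __name__ == "__main__":
    print(user_scripts_dir())
```

Save it as a file and run it with python3; the printed directory is the one to put on your PATH.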

Usage

Takeout migration

# Simplest: cd into the folder with your extracted Takeout dirs and run
cd /path/to/takeouts
degoogle-photos

# Or specify paths explicitly
degoogle-photos --source /path/to/takeouts --output /path/to/organized

# Preview what would happen (no files copied)
degoogle-photos --dry-run

The script is safe to stop and restart at any time. It detects files that have already been copied and skips them, so you'll never end up with duplicates — even if you run it multiple times or interrupt it halfway through.
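One way a copy step can be made safe to re-run is to check the destination before copying. The sketch below is only an illustration of that idea under my own assumptions (same size and same MD5 means "already copied"); how degoogle-photos actually detects earlier copies is not described here, and copy_if_missing and md5_of are hypothetical names.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so large videos don't fill memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_if_missing(src, dst):
    """Copy src to dst unless an identical file is already there.

    Returns True if a copy was made, False if it was skipped.
    """
    if (
        os.path.exists(dst)
        and os.path.getsize(dst) == os.path.getsize(src)
        and md5_of(dst) == md5_of(src)
    ):
        return False
    # Note: a different file at dst would be overwritten in this sketch;
    # the real tool resolves such name collisions with _2, _3 suffixes.
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    shutil.copy2(src, dst)  # copy2 also preserves the modification time
    return True
```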

Dedup mode

Deduplicate any folder without needing a Takeout structure. The source is never modified — a clean copy is written to the output folder.

# Dry run first — see what would be copied and which groups are duplicates
degoogle-photos --dedup-scan --dry-run \
  --source "/path/to/photo backup" \
  --output /path/to/output

# Full run — copies unique files, source untouched
degoogle-photos --dedup-scan \
  --source "/path/to/photo backup" \
  --output /path/to/output

Output structure:

output/
  2019/07/IMG_001.jpg        ← unique file, date-organised
  2020/03/VID_001.mp4
  needs_review/IMG_nodate.jpg
  by-folder/                 ← original folder tree as symlinks
    vacation 2019/
      IMG_001.jpg  →  ../../2019/07/IMG_001.jpg
    birthday/
      VID_001.mp4  →  ../../2020/03/VID_001.mp4
  report/index.html

Within each duplicate group the file with the shortest path is kept; all others are skipped.
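As a sketch of that rule, picking the keeper comes down to a sort by path length. The alphabetical tie-break is my assumption; the text above does not say what happens when two paths are equally short, and pick_keeper is a hypothetical name.

```python
def pick_keeper(paths):
    """Return (keeper, skipped) for one duplicate group.

    The shortest path wins; equal-length paths fall back to
    alphabetical order (an assumed tie-break).
    """
    ordered = sorted(paths, key=lambda path: (len(path), path))
    return ordered[0], ordered[1:]


keeper, skipped = pick_keeper([
    "backup/2019/old copy/IMG_001.jpg",
    "backup/IMG_001.jpg",
])
# keeper is "backup/IMG_001.jpg"; the longer path is skipped
```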

If degoogle-photos is not found, run it as a module instead:

python3 -m degoogle_photos.cli --dedup-scan --dry-run \
  --source "/path/to/photo backup" \
  --output /path/to/output

All options

Flag	Default	Description
`--source PATH`	current directory	Source root (Takeout dirs for migration; any folder for `--dedup-scan`)
`--output PATH`	`./DeGoogled Photos`	Destination for organised photos or dedup output
`--dry-run`	off	Report what would be done without copying any files
`--dedup-scan`	off	Dedup mode: scan any folder instead of running a Takeout migration

How it works

Takeout migration

Index — Scan all Takeout directories, index media files and JSON sidecars by album
Match — Link each media file to its JSON sidecar via title field or filename stripping
Date extraction — Extract the best date using a priority cascade (EXIF > JSON > filename > mtime)
Deduplication — Skip files with identical MD5 + date (within the same minute)
Copy — Copy to YYYY/MM/filename with collision resolution (_2, _3, etc.)
Albums — Create Albums/<name>/ with relative symlinks to the copied files
Report — Generate a browsable HTML report with per-folder and per-album pages
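The deduplication step above can be sketched as a key made of the content hash plus the date cut down to the minute. This is a simplified illustration: it hashes bytes already in memory instead of reading files, it truncates seconds rather than rounding to the nearest minute (an assumption about what "same minute" means), and dedup_key and unique_files are made-up names.

```python
import hashlib
from datetime import datetime


def dedup_key(content, taken_at):
    """Key that is equal for identical bytes dated within the same minute."""
    md5 = hashlib.md5(content).hexdigest()
    minute = taken_at.replace(second=0, microsecond=0)
    return md5, minute


def unique_files(files):
    """Keep the first file seen for each key.

    files is an iterable of (name, content_bytes, taken_at) tuples.
    """
    seen = set()
    kept = []
    for name, content, taken_at in files:
        key = dedup_key(content, taken_at)
        if key in seen:
            continue  # same bytes, same minute: treat as a duplicate
        seen.add(key)
        kept.append(name)
    return kept
```

Including the date in the key means the same bytes dated in two different minutes are kept as two files, which errs on the side of not dropping anything.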

Dedup mode

Scan — Recursively find all media files under --source
Checksum — Compute MD5 for every file; group identical files together
Copy — For each unique file (or duplicate group keeper), copy to YYYY/MM/ using the same date-extraction cascade; name collisions get a _2, _3 suffix
Symlinks — Recreate the source folder tree under by-folder/ with relative symlinks pointing at the date-organised copies
Report — Generate an HTML report listing all duplicate groups with COPIED / SKIPPED status per file
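Both modes resolve name collisions with _2, _3 suffixes. A minimal sketch of that naming scheme, assuming the suffix goes before the file extension (resolve_collision is a hypothetical helper):

```python
import os


def resolve_collision(dest_dir, filename):
    """Return a path in dest_dir that does not exist yet.

    IMG_001.jpg is tried first, then IMG_001_2.jpg, IMG_001_3.jpg, ...
    """
    stem, ext = os.path.splitext(filename)
    candidate = os.path.join(dest_dir, filename)
    counter = 2
    while os.path.exists(candidate):
        candidate = os.path.join(dest_dir, f"{stem}_{counter}{ext}")
        counter += 1
    return candidate
```

This matters because two different photos from different albums or cameras can easily share a name like IMG_001.jpg and land in the same YYYY/MM/ folder.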

HTML Report

The report is written to <output>/report/index.html and includes:

Dashboard with copy/duplicate/error counts and date-source breakdown
Per-folder pages with image thumbnails in a responsive grid
Per-album pages for named albums (generic "Photos from YYYY" albums are excluded)
Hover tooltips showing EXIF data (camera, ISO, focal length, GPS) and JSON metadata (people, geo, description)
"Finder" buttons to open the containing folder in macOS Finder

Project structure

degoogle_photos/
  __init__.py          # Package version
  indexing.py          # Takeout directory scanning, JSON sidecar indexing, recursive file finder
  dates.py             # Date extraction (EXIF, JSON, filename, mtime)
  metadata.py          # Rich metadata extraction for report tooltips
  dedup.py             # MD5 hashing, deduplication keys, duplicate grouping
  copy.py              # File copying with collision resolution
  report.py            # HTML report generation (migration + dedup modes)
  logging_util.py      # Migration logging and progress reporting
  albums.py            # Album symlink creation
  cli.py               # CLI entry point — migration and dedup-scan orchestration
tests/
  conftest.py          # Shared test fixtures
  test_indexing.py
  test_dates.py
  test_metadata.py
  test_dedup.py
  test_dedup_mode.py   # End-to-end integration tests for --dedup-scan
  test_copy.py
  test_report.py
  test_albums.py
migrate_photos.py      # Thin wrapper for backward compatibility
pyproject.toml         # Project metadata and dependencies

Running tests

pip install -e ".[dev]"
pytest -v

Where to put your photos after

Once your photos are organized, you have options with better privacy terms:

Recommended: Immich (self-hosted Google Photos replacement)

Immich is a free, open-source, self-hosted photo platform with face recognition, map view, timeline browsing, mobile apps, and AI-powered search -- all running on your own hardware. Your photos never leave your network. It's the closest thing to Google Photos without giving up your privacy.

Setup is quick -- it runs locally via Docker and the install guide is straightforward. If you get stuck, any AI assistant can walk you through it in minutes. Once running, Immich's smart search (by face, location, object, or scene) fully replaces what you'd need Google Photos for when it comes to finding and sorting your photos.

After running degoogle-photos, create an API key in the Immich web UI (Account Settings > API Keys), then authenticate and upload:

immich login http://localhost:2283 YOUR-API-KEY
immich upload --recursive /path/to/DeGoogled\ Photos

Immich will pick up the dates and folder structure automatically.

Other options

Service	Terms summary	Cross-platform	Software cost	Storage cost
Apple iCloud	Minimal rights -- just enough to sync and store. No ad business model.	Apple devices + web (non-Apple users can upload via browser)	Free	Paid
Adobe Lightroom	Rights limited to operating services. No generative AI training on customer content.	Full cross-platform	Paid	Included
Dropbox / OneDrive	Rights limited to providing the service. No promotional or AI training use.	Full cross-platform	Free tier available	Paid
Local storage + backup	Your files, your rights. Use the generated `report/index.html` to browse and review.	Any device with file access	Free	Free

Roadmap

See ROADMAP.md for planned features.

License

MIT

Release history

Version	Released	Status
1.0.0	Feb 15, 2026	yanked (reason: Needs bake time)
0.2.1	Apr 26, 2026	released
0.2.0	Apr 26, 2026	released (this version)
0.1.8	Feb 20, 2026	released
0.1.7	Feb 16, 2026	released
0.1.6	Feb 16, 2026	released
0.1.5	Feb 16, 2026	released
0.1.4	Feb 16, 2026	released
0.1.3	Feb 16, 2026	yanked
0.1.2	Feb 16, 2026	yanked
0.1.1	Feb 16, 2026	yanked (reason: too many bugs)
0.1.0	Feb 15, 2026	yanked (reason: too many bugs)

Download files

Source Distribution

degoogle_photos-0.2.0.tar.gz (35.9 kB)

Uploaded Apr 26, 2026 Source

Built Distribution

degoogle_photos-0.2.0-py3-none-any.whl (27.8 kB)

Uploaded Apr 26, 2026 Python 3

File details

Details for the file degoogle_photos-0.2.0.tar.gz.

File metadata

Download URL: degoogle_photos-0.2.0.tar.gz
Upload date: Apr 26, 2026
Size: 35.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for degoogle_photos-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`e18f31766e7795f20125fddb9f0e436a7ad8b7a3cf03972b9aae623761b21684`
MD5	`2960241ab5a9c4b27530807008d74b06`
BLAKE2b-256	`1c51d0fa5f510f526b830cee6154936f92d037df5200054dbadbfb3b494639ac`

File details

Details for the file degoogle_photos-0.2.0-py3-none-any.whl.

File metadata

Download URL: degoogle_photos-0.2.0-py3-none-any.whl
Upload date: Apr 26, 2026
Size: 27.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for degoogle_photos-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`747827f0d2b172651579c69db3bfec4eb573200d7047ec52342b4b35f273087c`
MD5	`98117ebe44c2a08daa72db204891d42f`
BLAKE2b-256	`1eaf7d10d25d3fae6ec3a2a7368adb805805b8f2880ec01c5d40e9b411ef9a8d`


degoogle-photos 0.2.0
