A CLI tool to remove, edit, or selectively filter metadata from images, documents, audio, and video files.
Metadata Cleaner
Metadata Cleaner is a privacy-focused local tool for viewing and removing metadata from images, documents, audio, and video files. It writes cleaned copies by default, keeps originals unchanged, and includes CLI, JSON automation, Docker, and a local Web UI for side-by-side metadata checks.
Highlights
- View metadata before cleaning.
- Remove metadata into separate cleaned copies.
- Compare original and cleaned metadata in a local-only Web UI.
- Process individual files or recursive folders.
- Generate machine-readable JSON reports for automation.
- Add SHA-256, SHA-512, or BLAKE2b checksums to reports.
- Preserve source timestamps on cleaned outputs when needed.
- CI covers Python tests, package smoke tests, Docker builds, CodeQL, and dependency audits.
Supported Files
- Images: JPG, JPEG, PNG, TIFF, WEBP, AVIF, HEIC, HEIF
- Documents: PDF, DOCX, EPUB, ODT, TXT
- Audio: MP3, WAV, FLAC, OGG, AAC, M4A, WMA
- Video: MP4, MKV, MOV, AVI, WEBM, FLV
Some formats need system tools for best coverage:
- ffmpeg and ffprobe are required for video metadata handling.
- exiftool is required for AVIF, HEIC, and HEIF cleanup and improves image metadata coverage.
- The published Docker image includes these optional tools.
Install
Requires Python 3.11 or newer.
pip install metadata-cleaner
metadata-cleaner --help
Use Docker when you want the optional system tools preinstalled:
docker run --rm -v "$(pwd):/data" ghcr.io/sandy-sp/metadata-cleaner:latest delete /data/photos
For development:
git clone https://github.com/sandy-sp/metadata-cleaner.git
cd metadata-cleaner
poetry install --with dev
poetry run metadata-cleaner --help
Quick Start
View metadata:
metadata-cleaner view sample.jpg
Clean one file:
metadata-cleaner delete sample.jpg
By default, the cleaned copy is written under a cleaned/ directory next to the
source file. To choose the output path:
metadata-cleaner delete sample.jpg --output cleaned/sample.jpg
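The output-path rule can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: `resolve_output` is a hypothetical helper, and reusing the source file name inside cleaned/ is an assumption. The guard mirrors the Safety Model rule that input and output paths must differ.

```python
from pathlib import Path
from typing import Optional


def resolve_output(source: str, output: Optional[str] = None) -> Path:
    """Hypothetical helper: pick where a cleaned copy would be written."""
    src = Path(source)
    if output is None:
        # Assumed default: cleaned/<same file name> next to the source file.
        return src.parent / "cleaned" / src.name
    out = Path(output)
    # Refuse in-place cleanup: input and output must be different paths.
    if out.resolve() == src.resolve():
        raise ValueError(f"output path equals source path: {src}")
    return out
```

Under these assumptions, `resolve_output("photos/sample.jpg")` gives `photos/cleaned/sample.jpg`, and passing the source path itself as the output raises `ValueError`.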
Preview a folder run without writing files:
metadata-cleaner delete ./photos --dry-run
Clean a folder recursively:
metadata-cleaner delete ./photos --output ./cleaned-photos
Local Web UI
Start a single-page local Web UI:
metadata-cleaner web
The Web UI binds to 127.0.0.1 by default, shows original metadata beside
cleaned-copy metadata, and lets you download cleaned files. The Files button
lists uploaded originals and cleaned copies from the current local session with
view and delete actions.
Web UI files are stored in a temporary directory unless you provide a workspace:
metadata-cleaner web --workspace ./metadata-cleaner-workspace
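The Safety Model section notes that the Web UI scopes file viewing and deletion to its managed workspace. One common way to enforce that kind of scoping is to resolve every requested path and check that it stays under the workspace root. The sketch below shows the idea. `is_inside_workspace` is a hypothetical name and is not taken from the project's source.

```python
from pathlib import Path


def is_inside_workspace(workspace: str, requested: str) -> bool:
    """Return True only if `requested` resolves to a path under `workspace`."""
    root = Path(workspace).resolve()
    # Resolve after joining so ".." segments and symlinks are normalised.
    target = (root / requested).resolve()
    return target.is_relative_to(root)
```

With this check, a request for `uploads/a.jpg` is accepted, while `../secret.txt` or an absolute path outside the workspace is refused.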
Automation
Print metadata as JSON:
metadata-cleaner view sample.jpg --json
Write metadata JSON to a file:
metadata-cleaner view sample.jpg --json-output reports/metadata.json
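To consume the JSON from a script, the `view --json` command can be wrapped in a small Python function. This is a minimal sketch that assumes stdout holds a single JSON document, which is plausible because logs go to stderr by default. `view_metadata`, the `run` parameter, and the 60-second timeout are illustrative choices, and the shape of the returned data is not specified here.

```python
import json
import subprocess
from typing import Any, Callable


def view_metadata(path: str, run: Callable[..., Any] = subprocess.run) -> Any:
    """Run `metadata-cleaner view <path> --json` and parse its stdout."""
    cmd = ["metadata-cleaner", "view", path, "--json"]
    # check=True raises CalledProcessError on a non-zero exit status.
    result = run(cmd, capture_output=True, text=True, check=True, timeout=60)
    # Assumption: stdout holds exactly one JSON document and nothing else.
    return json.loads(result.stdout)
```

The `run` parameter exists only so the call can be replaced with a stub in tests. In normal use, `view_metadata("sample.jpg")` is enough.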
Write a delete summary report:
metadata-cleaner delete ./photos --summary-file reports/summary.json
The shared --json-output option, for example --json-output reports/summary.json, writes the same delete
summary payload.
Add checksums:
metadata-cleaner delete ./photos --summary-file reports/summary.json --checksums
metadata-cleaner delete ./photos --json-summary --checksums --checksum-algorithm sha512
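A reported checksum can be cross-checked independently with Python's standard `hashlib`. The sketch below is not the tool's implementation, and `file_checksum` is a hypothetical helper. It also assumes the default BLAKE2b digest size, so confirm the length the report actually uses before comparing BLAKE2b values.

```python
import hashlib


def file_checksum(path: str, algorithm: str = "sha256") -> str:
    """Hash a file in chunks; algorithm is 'sha256', 'sha512', or 'blake2b'."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large media files are never fully loaded.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned hex string with the value recorded for the same file in the summary report.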
Use compact reports for large jobs:
metadata-cleaner delete ./photos --json-summary --report-detail compact
metadata-cleaner delete ./photos --json-summary --report-filter failed
Summary reports include per-file status, output paths, optional checksums, failure reasons, and format-specific processing notes that explain whether a handler copies, rewrites, re-saves, uses ExifTool, deletes audio tags, or remuxes video with FFmpeg stream copy.
Safety Model
- Originals are not modified by metadata removal.
- Handlers reject in-place cleanup where input and output paths are the same.
- EPUB and ODT ZIP packages are checked against archive safety limits before metadata XML is read or rewritten.
- ExifTool, FFmpeg, and FFprobe subprocess calls use bounded timeouts.
- Logs go to stderr by default; file logging is opt-in.
- The Web UI is local-only by default and scopes file viewing/deletion to its managed workspace.
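To make the archive safety limits for EPUB and ODT packages more concrete, here is a minimal sketch of the kind of pre-flight check that is typically run before any member is read. The limit values and the name `check_archive_limits` are assumptions for illustration. The project's real thresholds and checks are not documented here and may differ.

```python
import zipfile

# Hypothetical limits; the project's real thresholds are not documented here.
MAX_MEMBERS = 10_000
MAX_TOTAL_BYTES = 512 * 1024 * 1024
MAX_RATIO = 200  # uncompressed size / compressed size per member


def check_archive_limits(path: str) -> None:
    """Raise ValueError if a ZIP package exceeds the illustrative limits."""
    with zipfile.ZipFile(path) as archive:
        members = archive.infolist()
        if len(members) > MAX_MEMBERS:
            raise ValueError("too many archive members")
        total = 0
        for info in members:
            # Reject absolute paths and parent-directory traversal.
            name = info.filename.replace("\\", "/")
            if name.startswith("/") or ".." in name.split("/"):
                raise ValueError(f"unsafe member name: {info.filename}")
            total += info.file_size
            if info.compress_size and info.file_size / info.compress_size > MAX_RATIO:
                raise ValueError(f"suspicious compression ratio: {info.filename}")
        if total > MAX_TOTAL_BYTES:
            raise ValueError("archive expands beyond the size limit")
```

The check reads only the ZIP central directory, so an oversized or hostile package is rejected before any metadata XML is decompressed.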
This tool removes common metadata fields using format-specific libraries and system tools. It is not a guarantee that every possible identifying byte, watermark, hidden payload, or content-derived signal has been removed. For high-risk publishing workflows, inspect outputs with independent tools before release.
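For that independent inspection, a second tool such as ffprobe can be driven from a script with a bounded timeout, in the same spirit as the safety model. This is a minimal sketch: `run_bounded` is a hypothetical helper, the 30-second default is arbitrary, and `cleaned/sample.mp4` is only a placeholder path.

```python
import subprocess
from typing import Optional, Sequence


def run_bounded(cmd: Sequence[str], timeout: float = 30.0) -> Optional[str]:
    """Run an external tool; return its stdout, or None if it timed out."""
    try:
        result = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None
    return result.stdout


# Example: dump container and stream metadata of a cleaned video as JSON.
FFPROBE_CMD = ["ffprobe", "-v", "quiet", "-print_format", "json",
               "-show_format", "-show_streams", "cleaned/sample.mp4"]
```

Calling `run_bounded(FFPROBE_CMD)` returns ffprobe's JSON output as text, or `None` on timeout. If ffprobe is not installed, `subprocess.run` raises `FileNotFoundError`.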
Edit Metadata
Editing is available only where handlers support it. It is currently most useful for audio files, which are edited through Mutagen:
metadata-cleaner edit song.mp3 --changes '{"artist": "Unknown"}'
Use metadata removal when you need a cleaned copy. Editing may modify the target file in place.
Development Checks
python3 manage.py test
python3 manage.py lint
python3 manage.py check
CI runs tests, lint, pip-audit, package smoke coverage, Docker builds, and
CodeQL on protected branches and pull requests.
File details
Details for the file metadata_cleaner-3.18.14.tar.gz.
File metadata
- Download URL: metadata_cleaner-3.18.14.tar.gz
- Upload date:
- Size: 41.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.4 CPython/3.12.3 Linux/6.17.0-1013-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b714c1ea66e740a5aefb72b545c56224a826d35036bbe8204de1fec793ec2be9 |
| MD5 | 77063edd24fae62d7ab0ec5c2370038f |
| BLAKE2b-256 | 43d535c284cdf2b7859da763465fb0966ae8e64d05baf91ead8a50f7495aba03 |
File details
Details for the file metadata_cleaner-3.18.14-py3-none-any.whl.
File metadata
- Download URL: metadata_cleaner-3.18.14-py3-none-any.whl
- Upload date:
- Size: 33.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.4 CPython/3.12.3 Linux/6.17.0-1013-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1da6cb06754a7898cb76834b8f470d8ca0d7d709ec008011e340ebc4fd30370d |
| MD5 | d6e7be9fdcd593c72ffeb20863263a10 |
| BLAKE2b-256 | 534bdffc4192e2bb76f8bcdf92146a4838f9f5c1996eba550e058a2a9877dca8 |