A CLI tool to remove, edit, or selectively filter metadata from images, documents, audio, and video files.

Metadata Cleaner

License: MIT

Metadata Cleaner is a privacy-focused local tool for viewing and removing metadata from images, documents, audio, and video files. It writes cleaned copies by default, keeps originals unchanged, and includes CLI, JSON automation, Docker, and a local Web UI for side-by-side metadata checks.

Highlights

  • View metadata before cleaning.
  • Remove metadata into separate cleaned copies.
  • Compare original and cleaned metadata in a local-only Web UI.
  • Process individual files or recursive folders.
  • Generate machine-readable JSON reports for automation.
  • Add SHA-256, SHA-512, or BLAKE2b checksums to reports.
  • Preserve source timestamps on cleaned outputs when needed.
  • Publish-ready CI covering Python tests, package smoke tests, Docker builds, CodeQL, and dependency audits.

Supported Files

  • Images: JPG, JPEG, PNG, TIFF, WEBP, AVIF, HEIC, HEIF
  • Documents: PDF, DOCX, EPUB, ODT, TXT
  • Audio: MP3, WAV, FLAC, OGG, AAC, M4A, WMA
  • Video: MP4, MKV, MOV, AVI, WEBM, FLV

Some formats need system tools for best coverage:

  • ffmpeg and ffprobe are required for video metadata handling.
  • exiftool is required for AVIF, HEIC, and HEIF cleanup and improves image metadata coverage.
  • The published Docker image includes all of these tools.
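
Before a run outside Docker, it can help to confirm that these tools are on PATH. The following is a minimal sketch using only the standard library. The `SYSTEM_TOOLS` mapping and the `missing_tools` helper are hypothetical names, and Metadata Cleaner may locate its tools differently.

```python
import shutil

# Tool names taken from the list above; the mapping to a purpose string is
# an illustrative assumption, not the project's own data structure.
SYSTEM_TOOLS = {
    "ffmpeg": "video metadata handling",
    "ffprobe": "video metadata handling",
    "exiftool": "AVIF, HEIC, and HEIF cleanup",
}


def missing_tools(tools: dict[str, str] = SYSTEM_TOOLS) -> dict[str, str]:
    """Return the tools that are not found on PATH, mapped to their purpose."""
    return {
        name: purpose
        for name, purpose in tools.items()
        if shutil.which(name) is None
    }
```

Calling `missing_tools()` returns an empty dictionary when all three tools are installed.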

Install

Requires Python 3.11 or newer.

pip install metadata-cleaner
metadata-cleaner --help

Use Docker when you want the optional system tools preinstalled:

docker run --rm -v "$(pwd):/data" ghcr.io/sandy-sp/metadata-cleaner:latest delete /data/photos

For development:

git clone https://github.com/sandy-sp/metadata-cleaner.git
cd metadata-cleaner
poetry install --with dev
poetry run metadata-cleaner --help

Quick Start

View metadata:

metadata-cleaner view sample.jpg

Clean one file:

metadata-cleaner delete sample.jpg

By default, the cleaned copy is written under a cleaned/ directory next to the source file. To choose the output path:

metadata-cleaner delete sample.jpg --output cleaned/sample.jpg

Preview a folder run without writing files:

metadata-cleaner delete ./photos --dry-run

Clean a folder recursively:

metadata-cleaner delete ./photos --output ./cleaned-photos

Local Web UI

Start a single-page local Web UI:

metadata-cleaner web

The Web UI binds to 127.0.0.1 by default, shows original metadata beside cleaned-copy metadata, and lets you download cleaned files. The Files button lists uploaded originals and cleaned copies from the current local session with view and delete actions.

Web UI files are stored in a temporary directory unless you provide a workspace:

metadata-cleaner web --workspace ./metadata-cleaner-workspace

Automation

Print metadata as JSON:

metadata-cleaner view sample.jpg --json
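
From a Python script, the same command can be run with `subprocess` and its output parsed with `json`. This is a minimal sketch that assumes standard output contains only the JSON document, which is consistent with logs going to stderr by default. `run_json_command` is a hypothetical helper, and the 60-second timeout is an arbitrary choice.

```python
import json
import subprocess


def run_json_command(argv: list[str], timeout: float = 60.0):
    """Run a command and parse its standard output as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit status,
    subprocess.TimeoutExpired if the command runs too long, and
    json.JSONDecodeError if stdout is not valid JSON.
    """
    result = subprocess.run(
        argv,
        capture_output=True,  # keep stdout (JSON) separate from stderr (logs)
        text=True,
        check=True,
        timeout=timeout,
    )
    return json.loads(result.stdout)
```

For example, `run_json_command(["metadata-cleaner", "view", "sample.jpg", "--json"])` would return the parsed metadata. Passing the arguments as a list avoids shell quoting problems with unusual file names.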

Write metadata JSON to a file:

metadata-cleaner view sample.jpg --json-output reports/metadata.json

Write a delete summary report:

metadata-cleaner delete ./photos --summary-file reports/summary.json

The shared --json-output option writes the same delete summary payload, for example --json-output reports/summary.json.

Add checksums:

metadata-cleaner delete ./photos --summary-file reports/summary.json --checksums
metadata-cleaner delete ./photos --json-summary --checksums --checksum-algorithm sha512

Use compact or filtered reports for large jobs:

metadata-cleaner delete ./photos --json-summary --report-detail compact
metadata-cleaner delete ./photos --json-summary --report-filter failed

Summary reports include per-file status, output paths, optional checksums, and failure reasons. They also include format-specific processing notes that explain whether a handler copies, rewrites, re-saves, uses ExifTool, deletes audio tags, or remuxes video with FFmpeg stream copy.

Safety Model

  • Originals are not modified by metadata removal.
  • Handlers reject in-place cleanup where input and output paths are the same.
  • EPUB and ODT ZIP packages are checked against archive safety limits before metadata XML is read or rewritten.
  • ExifTool, FFmpeg, and FFprobe subprocess calls use bounded timeouts.
  • Logs go to stderr by default; file logging is opt-in.
  • The Web UI is local-only by default and scopes file viewing/deletion to its managed workspace.
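
To illustrate what archive safety limits for EPUB and ODT packages might look like, here is a sketch using the standard `zipfile` module. The three thresholds and the `check_archive` name are assumptions made for this example. The project's actual limits and checks are not stated here.

```python
import zipfile

# Illustrative limits; the project's real thresholds are not stated here.
MAX_MEMBERS = 10_000
MAX_TOTAL_UNCOMPRESSED = 512 * 1024 * 1024  # 512 MiB
MAX_COMPRESSION_RATIO = 200


def check_archive(path) -> None:
    """Raise ValueError if a ZIP package exceeds the illustrative limits.

    Accepts a file path or a binary file-like object. Only the sizes declared
    in the archive directory are checked; nothing is extracted.
    """
    with zipfile.ZipFile(path) as archive:
        members = archive.infolist()
        if len(members) > MAX_MEMBERS:
            raise ValueError("too many archive members")
        total = 0
        for info in members:
            name = info.filename
            # Reject absolute paths and parent-directory traversal.
            if name.startswith("/") or ".." in name.split("/"):
                raise ValueError(f"unsafe member path: {name}")
            total += info.file_size
            if total > MAX_TOTAL_UNCOMPRESSED:
                raise ValueError("archive expands beyond the size limit")
            if (
                info.compress_size
                and info.file_size / info.compress_size > MAX_COMPRESSION_RATIO
            ):
                raise ValueError(f"suspicious compression ratio: {name}")
```

Checking the declared sizes before reading any member keeps the check cheap, although a stricter implementation would also cap the number of bytes actually read from each member.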

This tool removes common metadata fields using format-specific libraries and system tools. It is not a guarantee that every possible identifying byte, watermark, hidden payload, or content-derived signal has been removed. For high-risk publishing workflows, inspect outputs with independent tools before release.
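
As one example of an independent check, the sketch below lists metadata-bearing chunks in a PNG file using only the standard library. It is not part of Metadata Cleaner, the set of chunk types is a common but incomplete selection, and an empty result does not prove that a file carries no identifying information. `png_metadata_chunks` is a hypothetical helper name.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Ancillary chunk types that commonly carry metadata (not an exhaustive set).
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf", b"tIME"}


def png_metadata_chunks(data: bytes) -> list[str]:
    """Return the metadata-bearing chunk types found in PNG bytes, in order."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = []
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        if chunk_type in METADATA_CHUNKS:
            found.append(chunk_type.decode("ascii"))
        if chunk_type == b"IEND":
            break
        # Skip the 8-byte header, the chunk data, and the 4-byte CRC.
        offset += 8 + length + 4
    return found
```

Running it on the original and on the cleaned copy, for example with `png_metadata_chunks(Path("cleaned/sample.png").read_bytes())`, gives a quick second opinion on PNG outputs.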

Edit Metadata

Editing is available only where handlers support it. It is currently most useful for audio files, which are edited through Mutagen:

metadata-cleaner edit song.mp3 --changes '{"artist": "Unknown"}'

Use metadata removal when you need a cleaned copy. Editing may modify the target file in place.
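
When scripting edits, building the `--changes` value with `json.dumps` avoids hand-written quoting mistakes. This sketch uses only the command and flag shown above. The helper names are hypothetical, and `run_edit` assumes the CLI signals failure with a non-zero exit status, which this README does not state.

```python
import json
import subprocess


def build_edit_command(path: str, changes: dict[str, str]) -> list[str]:
    """Build the argv list for the edit command shown above."""
    return ["metadata-cleaner", "edit", path, "--changes", json.dumps(changes)]


def run_edit(path: str, changes: dict[str, str], timeout: float = 60.0) -> None:
    """Run the edit command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_edit_command(path, changes), check=True, timeout=timeout)
```

Because editing may modify the target in place, a cautious script would work on a copy of the file rather than the original.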

Development Checks

python3 manage.py test
python3 manage.py lint
python3 manage.py check

CI runs tests, lint, pip-audit, package smoke coverage, Docker builds, and CodeQL on protected branches and pull requests.
