A CLI tool to remove or selectively filter metadata from images, documents, audio, and video files.
Metadata Cleaner
A powerful CLI tool to remove or selectively filter metadata from images, PDFs, DOCX, audio, and video files.
Overview
Metadata Cleaner is a fast and efficient command-line tool designed for privacy protection, security compliance, and data sanitization. It supports removing metadata from various file formats including images, documents, audio, and video files, with options for selective filtering and parallel batch processing.
Why use Metadata Cleaner?
- Protect your privacy: Strip hidden metadata from files.
- Sanitize sensitive documents: Prepare files for sharing without revealing personal information.
- Reduce file size: Remove unnecessary metadata.
- Batch process: Clean metadata from individual files or entire folders (with recursive support).
Features
- Selective Metadata Filtering: Configure which metadata fields to preserve or remove using a JSON configuration file.
- Batch & Recursive Processing: Process a single file, an entire folder, or subfolders recursively.
- Parallel Processing: Accelerate batch operations by processing multiple files in parallel.
- Cross-Platform CLI: Works on Linux, macOS, and Windows.
- Logging & Error Reporting: Detailed logs make issues easier to troubleshoot.
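As a rough illustration of the batch, recursive, and parallel features listed above, the sketch below collects files from a folder and hands them to a worker pool. It is not the project's actual implementation: collect_files, process_batch, and the clean_one callback are hypothetical names, and using a thread pool is an assumption about how the parallelism might be done.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List


def collect_files(folder: Path, recursive: bool = False) -> List[Path]:
    """Return the regular files in `folder`, descending into subfolders if asked."""
    pattern = "**/*" if recursive else "*"
    return sorted(p for p in folder.glob(pattern) if p.is_file())


def process_batch(
    files: Iterable[Path],
    clean_one: Callable[[Path], object],  # hypothetical per-file cleaner
    max_workers: int = 4,
) -> list:
    """Apply `clean_one` to every file in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(clean_one, files))
```

A thread pool suits work dominated by file I/O; CPU-heavy cleaning might be better served by a process pool.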
Installation & Usage
1. Using Poetry (Recommended)
If you use Poetry, simply clone the repository and install dependencies:
git clone https://github.com/sandy-sp/metadata-cleaner.git
cd metadata-cleaner
poetry install
To run Metadata Cleaner:
poetry run metadata-cleaner --help
2. Install via PyPI
The package is published on PyPI, so you can install it with pip:
pip install metadata-cleaner
And run it:
metadata-cleaner --help
3. Usage Examples
Remove Metadata from a Single File
metadata-cleaner --file path/to/file.jpg
Example Output:
Do you want to process file.jpg? [Y/n]: Y
✅ Metadata removed. Cleaned file saved at: path/to/file_cleaned.jpg
Remove Metadata from All Files in a Folder (Non-Recursive)
metadata-cleaner --folder test_folder
Example Output:
Do you want to process all files in test_folder? [Y/n]: Y
Processing Files: 100% |████████████████| 5/5 [00:10s]
Summary Report:
✅ Successfully processed: 5 files
Cleaned files saved in: test_folder/cleaned
Batch Processing with Recursive Search & Custom Output
metadata-cleaner --folder my_folder --recursive --output sanitized_files --yes
Example Output:
Processing Files: 100% |████████████████| 20/20 [00:15s]
Summary Report:
✅ Successfully processed: 20 files
Cleaned files saved in: sanitized_files
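The output locations in the examples above suggest a simple naming scheme: a `_cleaned` suffix for single files, a `cleaned` subfolder for batches, and the `--output` folder when one is given. The sketch below reproduces that scheme. It is inferred from the sample output only; cleaned_file_path and batch_output_dir are hypothetical helpers, not functions from the package.

```python
from pathlib import Path
from typing import Optional


def cleaned_file_path(source: Path) -> Path:
    """file.jpg -> file_cleaned.jpg, in the same directory as the source."""
    return source.with_name(f"{source.stem}_cleaned{source.suffix}")


def batch_output_dir(folder: Path, output: Optional[Path] = None) -> Path:
    """Use the explicit output folder if given, else `<folder>/cleaned`."""
    return output if output is not None else folder / "cleaned"
```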
Using a Custom Configuration File
You can create a JSON configuration file (e.g., config.json) to specify selective metadata rules. Then run:
metadata-cleaner --file sample.jpg --config config.json
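The configuration schema is not documented in this README, so the following is only a sketch of what selective filtering could look like. The "preserve" and "remove" keys, the tag names, and the filter_metadata function are all assumptions made for illustration; check the project's usage guide for the real format.

```python
import json


def filter_metadata(metadata: dict, rules: dict) -> dict:
    """Return the metadata fields that survive cleaning.

    Assumed semantics (hypothetical, not the tool's documented behavior):
    - if "remove" is present, drop only the listed fields;
    - otherwise keep only the fields listed under "preserve"
      (an empty or missing list strips everything).
    """
    if "remove" in rules:
        drop = set(rules["remove"])
        return {k: v for k, v in metadata.items() if k not in drop}
    keep = set(rules.get("preserve", []))
    return {k: v for k, v in metadata.items() if k in keep}


# Example rules as they might appear in a config.json (hypothetical schema).
rules = json.loads('{"preserve": ["Orientation"]}')
exif = {"Orientation": 1, "GPSInfo": "redacted", "Make": "ExampleCam"}
kept = filter_metadata(exif, rules)
```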
How It Works
- File Detection: The tool detects the file type and selects the appropriate handler.
- Selective Filtering: For image files, it uses a configuration file (if provided) to selectively remove or preserve EXIF metadata.
- Processing: Files are processed, either individually or in batches, with parallel execution for efficiency.
- Output & Logging: Cleaned files are saved in a default or specified output folder, and detailed logs are generated for troubleshooting.
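The file detection step can be pictured as a lookup from file extension to handler category. This is a minimal sketch under the assumption that detection is extension-based; the HANDLERS table, its categories, and detect_handler are hypothetical and may not match the code in file_handlers/.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension-to-handler table; the real tool may support other types.
HANDLERS = {
    ".jpg": "image",
    ".jpeg": "image",
    ".png": "image",
    ".pdf": "pdf",
    ".docx": "docx",
    ".mp3": "audio",
    ".mp4": "video",
}


def detect_handler(path: str) -> Optional[str]:
    """Return the handler category for `path`, or None if the type is unsupported."""
    return HANDLERS.get(Path(path).suffix.lower())
```

Lowercasing the suffix means `photo.JPG` and `photo.jpg` are routed to the same handler.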
Project Structure
metadata-cleaner/
├── docs/ # Documentation
├── metadata_cleaner/ # Python package source code
│   ├── cli.py # CLI entry point
│   ├── remover.py # Core metadata removal logic
│   ├── config/ # Configuration settings
│   ├── core/ # Metadata filtering utilities
│   ├── file_handlers/ # File-specific metadata handlers
│   └── logs/ # Logging configuration
├── tests/ # Unit tests
├── scripts/ # Setup and environment scripts (Poetry-based)
├── pyproject.toml # Poetry configuration file
├── MANIFEST.in # Manifest file for packaging
└── README.md # This file
Contributing
Contributions are welcome! To contribute:
- Fork the repository
- Create a new branch for your feature:
git checkout -b feature-name
- Make your changes and test using:
poetry run pytest
- Commit and push your changes:
git commit -m "Describe your feature"
git push origin feature-name
- Submit a Pull Request
Resources & Links
- API Reference: docs/API_REFERENCE.md
- Usage Guide: docs/USAGE.md
- Planned Features: docs/PLANNED_FEATURES.md
- GitHub Repository: https://github.com/sandy-sp/metadata-cleaner
- PyPI Package: metadata-cleaner
Support
If you find this tool useful, please give it a ⭐ on GitHub!
For issues or questions, open an issue or contact sandeep.paidipati@gmail.com.
License
This project is licensed under the MIT License. See the LICENSE file for details.
File details
Details for the file metadata_cleaner-2.0.3.tar.gz.
File metadata
- Download URL: metadata_cleaner-2.0.3.tar.gz
- Upload date:
- Size: 12.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.1 CPython/3.10.16 Linux/6.8.0-1021-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3d91308afe730f844bf42da83ee5ae435bcd5d28b579936cff959545e3ca16e4 |
| MD5 | 8ed44d0829101e99dfe4e9a448e45e02 |
| BLAKE2b-256 | e7e6df8d6df1dea055f7e2fed49744aee866ef27db764b21c17b5f2efe85dac4 |
File details
Details for the file metadata_cleaner-2.0.3-py3-none-any.whl.
File metadata
- Download URL: metadata_cleaner-2.0.3-py3-none-any.whl
- Upload date:
- Size: 16.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.1 CPython/3.10.16 Linux/6.8.0-1021-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a82cefe627bb7e2c2e5aa6732f6b480e3fe9cfc796706706cf938487dc2c81c |
| MD5 | d3b6c028fa8cf419c9fb19084f60dcd1 |
| BLAKE2b-256 | e33929ab5228813f1480c3a6e72b9a692601f9a1eb4a0f9495af516406ada8fa |