📧 CLI to deduplicate mails from mail boxes

Mail Deduplicate

What is Mail Deduplicate?

Provides the mdedup CLI, a utility to deduplicate mails from a set of boxes.

Features

  • Duplicate detection based on cherry-picked and normalized mail headers.
  • Fetch mails from multiple sources.
  • Reads and writes to mbox, maildir, babyl, mh and mmdf formats.
  • Deduplication strategies based on size, content, timestamp, file path or random choice.
  • Copy, move or delete the resulting set of duplicates.
  • Dry-run mode.
  • Protection against false-positives with safety checks on size and content differences.
  • Supports macOS, Linux and Windows.
  • Standalone executables for Linux, macOS and Windows.
  • Shell auto-completion for Bash, Zsh and Fish.
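
To make the first feature more concrete, here is a minimal sketch of header-based duplicate detection built on Python's standard `mailbox` and `hashlib` modules. It is not mdedup's actual implementation: the `HEADERS` tuple, the normalization rule and the function names are all assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict
from email.message import Message

# Assumed subset of headers; the list mdedup really uses may differ.
HEADERS = ("date", "from", "to", "subject", "message-id")


def normalize(value: str) -> str:
    """Collapse runs of whitespace and lowercase a header value."""
    return " ".join(value.split()).lower()


def fingerprint(msg: Message) -> str:
    """Hash the selected, normalized headers of a single message."""
    parts = [f"{name}:{normalize(str(msg.get(name, '')))}" for name in HEADERS]
    return hashlib.sha224("\n".join(parts).encode("utf-8")).hexdigest()


def group_duplicates(messages):
    """Return the lists of messages that share the same fingerprint."""
    groups = defaultdict(list)
    for msg in messages:
        groups[fingerprint(msg)].append(msg)
    return [group for group in groups.values() if len(group) > 1]
```

Iterating over a `mailbox.mbox` or `mailbox.Maildir` object yields messages, so something like `group_duplicates(mailbox.mbox("inbox.mbox"))` (after `import mailbox`, with a hypothetical path) would list candidate duplicate sets. A real tool would then apply a strategy to each set, such as keeping the largest or the oldest copy.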

⚠️ Warning: Performance

The mdedup implementation is quite naive at the moment: everything resides in memory.

This is good enough for a volume of a couple of gigabytes, but the more emails mdedup tries to parse, the closer you get to the memory limits of your machine. Past that point, mdedup will exit abruptly, zapped by the OOM killer of your OS. Of course, your mileage may vary depending on your hardware.

You can help get this limitation addressed with pull requests, by purchasing business support 🤝, or through sponsorship 🫶.

Installation

Python

uv is the fastest way to run mdedup from sources on any platform, thanks to its uvx command:

$ uvx --from mail-deduplicate mdedup

Executables

Standalone binaries of mdedup's latest version are available for several platforms and architectures:

Platform  x86_64                   arm64
Linux     mdedup-linux-x64.bin     mdedup-linux-arm64.bin
macOS     mdedup-macos-x64.bin     mdedup-macos-arm64.bin
Windows   mdedup-windows-x64.exe

Alternative installation methods are available in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mail_deduplicate-7.6.2.tar.gz (32.0 kB)

Uploaded Source

Built Distribution

mail_deduplicate-7.6.2-py3-none-any.whl (32.0 kB)

Uploaded Python 3

File details

Details for the file mail_deduplicate-7.6.2.tar.gz.

File metadata

  • Download URL: mail_deduplicate-7.6.2.tar.gz
  • Upload date:
  • Size: 32.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.14

File hashes

Hashes for mail_deduplicate-7.6.2.tar.gz
Algorithm Hash digest
SHA256 20368c6e048be51368eeaf73ba2cccaa3396009e77c8766d2f137dd6e1d2a48f
MD5 50b7ad27a209273fd94956ea616c5cac
BLAKE2b-256 47ea693d0357055dbacef0838419646819cb69eada05ea502088cc34e91175ae

See more details on using hashes here.
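
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only Python's standard library. The helper names `sha256_of` and `verify` are hypothetical, and the file is assumed to sit in the current directory under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for mail_deduplicate-7.6.2.tar.gz.
EXPECTED_SHA256 = "20368c6e048be51368eeaf73ba2cccaa3396009e77c8766d2f137dd6e1d2a48f"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest matches the expected one."""
    return sha256_of(path) == expected.lower()
```

Calling `verify("mail_deduplicate-7.6.2.tar.gz")` returns `True` only when the local file matches the published digest. Reading in chunks keeps memory use flat regardless of file size.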

File details

Details for the file mail_deduplicate-7.6.2-py3-none-any.whl.

File metadata

File hashes

Hashes for mail_deduplicate-7.6.2-py3-none-any.whl
Algorithm Hash digest
SHA256 1840d473b92f098a1722fedaf68c8935e7bec0ff4bfb8ca4317cb86488cff3f6
MD5 593316cd08440460e7aa78c468813518
BLAKE2b-256 16c65afc07d4c082caf86a28bc410701e2b251647ea9d702e2b5b1457280d2f0
