mbox-extractor

📬 Recursively extract all attachments from .mbox email archives with a single command

Features

  • Recursive scanning - Finds all .mbox files in any directory tree
  • Safe filenames - Sanitizes attachment names, removing illegal characters
  • No duplicates - Uses content-based hashing to prevent overwrites
  • Progress display - Visual progress bar for large mailboxes

Quick Start

uv tool install mbox-extractor
mbox-extractor /path/to/emails

Installation

Using uv (recommended)

uv tool install mbox-extractor

Using pip

pip install mbox-extractor

From source

git clone https://github.com/tsilva/mbox-extractor.git
cd mbox-extractor
uv tool install .
pre-commit install

Usage

Extract all attachments from .mbox files under a directory:

mbox-extractor /path/to/search

Attachments from each .mbox file are saved to a sibling folder named after the file, without the .mbox extension:

Found mbox: /emails/archive.mbox -> extracting to /emails/archive
Extracting archive.mbox: 100%|████████████████████| 500/500 [00:10<00:00, 48.5it/s]
Extracted 42 attachments to '/emails/archive'.

How It Works

  1. Recursively scans the given path for .mbox files
  2. Opens each mailbox and iterates through all messages
  3. Extracts attachments with sanitized, unique filenames
  4. Saves them to a folder named after the source .mbox file
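
Steps 1, 2 and 4 can be sketched with the standard library alone. This is a minimal illustration, not the package's actual code: the helper names `find_mbox_files`, `iter_attachments` and `default_output_dir` are hypothetical, and treating every MIME part that carries a filename as an attachment is an assumption.

```python
import mailbox
from pathlib import Path
from typing import Iterator


def find_mbox_files(root: str) -> list[Path]:
    """Return every *.mbox file under root, sorted for stable output."""
    return sorted(p for p in Path(root).rglob("*.mbox") if p.is_file())


def iter_attachments(mbox_path: Path) -> Iterator[tuple[str, bytes]]:
    """Yield (filename, payload) for each named part of each message."""
    box = mailbox.mbox(str(mbox_path))
    try:
        for message in box:
            for part in message.walk():
                name = part.get_filename()
                if not name:
                    continue  # body text and multipart containers have no filename
                payload = part.get_payload(decode=True)
                if payload is not None:
                    yield name, payload
    finally:
        box.close()


def default_output_dir(mbox_path: Path) -> Path:
    """Mirror the documented layout: /emails/archive.mbox -> /emails/archive."""
    return mbox_path.with_suffix("")
```

Closing the mailbox in a `finally` block releases the file handle even if the caller stops iterating early.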

Filenames are made unique by appending an 8-character MD5 hash of the file content, preventing overwrites when multiple attachments share the same name.
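
The naming scheme described above might look roughly like the sketch below. The function names `sanitize` and `unique_name`, the set of characters treated as illegal, the `_` separator, and placing the hash before the extension are all assumptions made for illustration; only the 8-character MD5 suffix comes from the description.

```python
import hashlib
import re
from pathlib import Path


def sanitize(name: str) -> str:
    """Replace characters that are illegal on common filesystems with '_'."""
    base = Path(name.replace("\\", "/")).name  # drop any directory components
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", base).strip(" .")
    return cleaned or "attachment"


def unique_name(name: str, content: bytes) -> str:
    """Insert the first 8 hex characters of the content's MD5 before the extension."""
    # MD5 is used only to tell files apart, not for security.
    digest = hashlib.md5(content, usedforsecurity=False).hexdigest()[:8]
    path = Path(sanitize(name))
    return f"{path.stem}_{digest}{path.suffix}"
```

Because the suffix depends only on the bytes, two different files named report.pdf get different output names, while the same file attached to several messages maps to a single name.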

Programmatic Usage

You can also use mbox-extractor as a library in your Python code:

from mbox_extractor import extract_mbox

# Extract with default output directory (creates /path/to/archive/ from /path/to/archive.mbox)
count = extract_mbox("/path/to/archive.mbox")

# Extract to a custom output directory
count = extract_mbox("/path/to/archive.mbox", output_dir="/custom/output")

# Extract without progress bar (for scripts/automation)
count = extract_mbox("/path/to/archive.mbox", show_progress=False)

API

extract_mbox(mbox_path: str, output_dir: str | None = None, show_progress: bool = True) -> int

Parameter      Description
mbox_path      Absolute path to the .mbox file
output_dir     Output directory (default: same name as mbox file, without extension)
show_progress  Show progress bar (default: True)
Returns        Number of attachments extracted
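
For scripts that need per-file counts, the documented function can be wrapped in a small loop. The wrapper below is a sketch: `extract_tree` and its `extract` parameter are hypothetical names, and only the `extract_mbox` signature shown above is taken from the package. The `extract` parameter exists so the loop can be tried with a stand-in function.

```python
from pathlib import Path
from typing import Callable, Optional


def extract_tree(root: str, extract: Optional[Callable[..., int]] = None) -> dict[str, int]:
    """Run extract_mbox on every .mbox file under root; return counts per file."""
    if extract is None:
        # Imported lazily so the helper can be exercised with a stand-in function.
        from mbox_extractor import extract_mbox as extract

    counts: dict[str, int] = {}
    for mbox_path in sorted(Path(root).rglob("*.mbox")):
        absolute = str(mbox_path.resolve())  # the API table asks for an absolute path
        counts[absolute] = extract(absolute, show_progress=False)
    return counts
```

With the package installed, `sum(extract_tree("/path/to/emails").values())` would give the total number of attachments written.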

Requirements

  • Python 3.12+
  • tqdm (installed automatically)

License

MIT
