Email-to-EML Secure Archiver (EESA)
A Python-based command-line utility to programmatically retrieve emails from Gmail and Microsoft 365 and save them as RFC822-compliant .eml files.
Features
- Secure OAuth2 Authentication - Browser-based authentication with 2FA support
- Multi-Provider Support - Gmail and Microsoft 365 (Outlook)
- AI-Powered Classification - Automatically categorize emails and skip promotions (v0.3.0+)
- Advanced Filtering - Date-based, incremental sync, custom queries
- Webhook Integration - Automatically send downloaded emails to webhook endpoints
- Incremental Checkpointing - Resume interrupted downloads
- Modern Package Management - UV/UVX support for easy installation and execution
Quick Start
Installation
Using uvx (recommended - no installation needed):
uvx email-archiver --help
Using pip:
pip install email-archiver
email-archiver --help
From source:
git clone https://github.com/therealtimex/email-archiver
cd email-archiver
uv sync
uv run email-archiver --help
Basic Usage
# Download emails from Gmail since a specific date
email-archiver --provider gmail --since 2024-12-01
# Incremental sync (resume from last checkpoint)
email-archiver --provider gmail --incremental
# AI Classification (skip promotional emails)
email-archiver --provider gmail --classify --openai-api-key "sk-..." --skip-promotional
# With webhook integration
email-archiver --provider gmail --since 2024-12-23 \
--webhook-url https://your-webhook.com/endpoint \
--webhook-secret "Bearer your-token"
Documentation
- Quick Start Guide - Get up and running in 5 minutes
- Complete Documentation - Full setup and configuration guide
- API Reference - Command-line arguments and Python API
- Examples - 21 practical examples and use cases
Common Use Cases
Daily Email Backup
email-archiver --provider gmail --incremental
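The project keeps incremental sync state in config/checkpoint.json, but the schema of that file is not documented here. As a rough illustration of how checkpoint-based resumption can work in general, the sketch below stores a hypothetical `last_synced` value per provider and writes the file atomically so that an interrupted run cannot leave a half-written checkpoint. The field names and helper functions are assumptions for illustration, not EESA's actual format or API.

```python
import json
import os
import tempfile
from pathlib import Path


def load_checkpoint(path):
    """Return the saved state, or an empty dict if no checkpoint exists yet."""
    p = Path(path)
    if not p.exists():
        return {}
    with p.open("r", encoding="utf-8") as fh:
        return json.load(fh)


def save_checkpoint(path, state):
    """Write state to a temporary file, then atomically swap it into place."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=p.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, indent=2)
        os.replace(tmp_name, p)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_name)  # do not leave stray temp files behind
        raise


def record_progress(path, provider, last_synced):
    """Update the (hypothetical) per-provider marker and persist it."""
    state = load_checkpoint(path)
    state[provider] = {"last_synced": last_synced}
    save_checkpoint(path, state)
    return state
```

Writing to a temporary file in the same directory and renaming it is a common way to make sure a crash mid-write leaves the previous checkpoint intact.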
Archive Specific Emails
# Emails with attachments
email-archiver --provider gmail --query "has:attachment" --since 2024-01-01
# From specific sender
email-archiver --provider gmail --query "from:important@example.com"
Webhook Integration
# Send emails to processing endpoint
email-archiver --provider gmail --incremental \
--webhook-url https://api.example.com/emails \
--webhook-secret "Bearer sk_live_abc123"
Custom Download Directory
# Save to specific folder
email-archiver --provider gmail --since 2024-12-01 \
--download-dir /path/to/backup/emails
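Because the archived messages are standard .eml files, they can be inspected with Python's built-in `email` package. The sketch below lists the subject and sender of every .eml file under a download directory. The function name is hypothetical, and the recursive search is an assumption, since the folder layout inside the download directory is not described here.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def summarize_eml_files(download_dir):
    """Return (filename, subject, sender) for each .eml file under download_dir."""
    summaries = []
    for eml_path in sorted(Path(download_dir).rglob("*.eml")):
        with eml_path.open("rb") as fh:
            # headersonly=True skips parsing of large bodies and attachments
            msg = BytesParser(policy=policy.default).parse(fh, headersonly=True)
        subject = str(msg.get("Subject", ""))
        sender = str(msg.get("From", ""))
        summaries.append((eml_path.name, subject, sender))
    return summaries
```

For example, `summarize_eml_files("/path/to/backup/emails")` would return one tuple per archived message.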
Configuration
Gmail Setup
- Create a project in Google Cloud Console
- Enable Gmail API
- Create OAuth 2.0 credentials (Desktop App)
- Save credentials as config/client_secret.json
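A common setup mistake is downloading credentials for the wrong OAuth client type. Google stores Desktop App credentials under a top-level "installed" key, while web application credentials use "web". The sketch below is a small, hypothetical pre-flight check (not part of EESA) that reports obvious problems in the saved file before the first run.

```python
import json


def check_client_secret(path="config/client_secret.json"):
    """Return a list of problems found in the credentials file (empty if none)."""
    with open(path, "r", encoding="utf-8") as fh:
        data = json.load(fh)

    # Desktop App credentials live under "installed"; a "web" key usually
    # means a different client type was created in the Cloud Console.
    if "installed" not in data:
        found = ", ".join(sorted(data)) or "nothing"
        return [f'expected top-level key "installed", found: {found}']

    required = ("client_id", "client_secret", "auth_uri", "token_uri")
    section = data["installed"]
    return [f"missing field: {name}" for name in required if name not in section]
```

An empty list means the file at least has the expected shape; it does not verify that the credentials themselves are valid.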
Microsoft 365 Setup
- Register app in Azure Portal
- Add Mail.Read permission
- Update config/settings.yaml with your Client ID
See the Quick Start Guide for detailed instructions.
Webhook Integration
EESA can automatically POST downloaded .eml files to a webhook endpoint:
Via CLI:
email-archiver --provider gmail --since 2024-12-01 \
--webhook-url https://webhook.site/your-id \
--webhook-secret "Bearer token"
Via Configuration:
# config/settings.yaml
webhook:
  url: "https://your-webhook.com/endpoint"
  enabled: true
  headers:
    Authorization: "Bearer your-token"
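For local testing it can help to have a small receiver on the other end. The sketch below is a minimal standard-library HTTP server that accepts POST requests only when the Authorization header matches an expected value, and keeps the raw request bodies in a list. It assumes the configured secret is sent verbatim as the Authorization header; the exact payload format EESA posts is not documented here, so the body is stored as opaque bytes. The function name and parameters are hypothetical.

```python
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_webhook_server(expected_auth, received, host="127.0.0.1", port=0):
    """Build an HTTP server that accepts POSTs carrying the expected header.

    Accepted request bodies are appended to `received` as raw bytes.
    """

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            supplied = self.headers.get("Authorization", "")
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            # Constant-time comparison avoids leaking the secret via timing
            if not hmac.compare_digest(supplied.encode(), expected_auth.encode()):
                self._reply(401, b"unauthorized")
                return
            received.append(body)
            self._reply(200, b"ok")

        def _reply(self, status, payload):
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, fmt, *args):
            pass  # keep the example quiet

    return HTTPServer((host, port), Handler)
```

Calling `make_webhook_server("Bearer your-token", [], port=8000).serve_forever()` would start the receiver on localhost. This serves plain HTTP and is meant for local experiments only; a real endpoint should sit behind HTTPS.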
Command-Line Arguments
| Argument | Description |
|---|---|
| --provider {gmail,m365} | Email provider (required) |
| --since YYYY-MM-DD | Download emails since date |
| --incremental | Resume from last checkpoint |
| --query STRING | Custom search query |
| --webhook-url URL | Webhook endpoint URL |
| --webhook-secret SECRET | Authorization header for webhook |
| --download-dir PATH | Custom download directory |
| --classify | Enable AI email classification |
| --openai-api-key KEY | OpenAI API key |
| --skip-promotional | Skip promotional/social emails |
| --metadata-output PATH | Path to save JSONL metadata |
See the API Reference for complete documentation.
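The file written by --metadata-output is JSONL, meaning one JSON object per line, so it can be processed with a few lines of Python. The sketch below reads such a file and tallies records by one field. The record schema is not documented here, so the "category" key is a hypothetical placeholder and both function names are illustrative.

```python
import json
from collections import Counter


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with open(path, "r", encoding="utf-8") as fh:
        for line_number, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc


def count_by_field(path, field="category"):
    """Tally records by a field; "category" is a hypothetical key name."""
    return Counter(record.get(field, "unknown") for record in read_jsonl(path))
```

Reading line by line keeps memory use flat even for large metadata files.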
Requirements
- Python 3.9+
- Gmail API credentials (for Gmail)
- Azure AD app registration (for M365)
Project Structure
email-archiver/
├── email_archiver/            # Main package
│   ├── main.py                # CLI entry point
│   └── core/                  # Core modules
│       ├── gmail_handler.py
│       ├── graph_handler.py
│       └── utils.py
├── config/
│   ├── settings.yaml          # Configuration file
│   ├── checkpoint.json        # Incremental sync state
│   └── client_secret.json     # OAuth credentials (git-ignored)
├── auth/                      # OAuth tokens (git-ignored)
├── downloads/                 # Downloaded .eml files
├── docs/                      # Documentation
└── pyproject.toml             # Package configuration
Security
- OAuth2 Only - No password storage
- Read-Only Scopes - gmail.readonly and Mail.Read
- Token Protection - Tokens stored with restricted permissions (chmod 600)
- HTTPS Webhooks - Always use HTTPS for webhook endpoints
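To illustrate what "restricted permissions (chmod 600)" means in practice, the sketch below writes a token file that only its owner can read or write. It is a generic POSIX pattern, not EESA's actual code; the function name is hypothetical, and on Windows the mode bits have limited effect.

```python
import json
import os


def write_token_file(path, token_data):
    """Write token_data as JSON to a file with mode 0o600 (owner read/write)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # Passing the mode to os.open applies it at creation time, so a new
    # file is never briefly readable by other users.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(token_data, fh)
    # An existing file keeps its old mode when reopened, so tighten it too.
    os.chmod(path, 0o600)
```

The permissions can be confirmed afterwards with `ls -l`, which should show `-rw-------` for the token file.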
Contributing
This project follows the specification in docs/SPECIFICATION.md.
License
See LICENSE file for details.
Support
For issues or questions:
- Check the documentation
- Review examples
- Check logs in sync.log
- Open an issue on GitHub
Examples
Automation with Cron
# Daily backup at 2 AM
0 2 * * * email-archiver --provider gmail --incremental
Python Integration
import subprocess

# Run the archiver; check=True raises CalledProcessError on a non-zero exit
subprocess.run(
    [
        "email-archiver",
        "--provider", "gmail",
        "--since", "2024-12-01",
    ],
    check=True,
)
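When the CLI is called from several places in a script, it can be convenient to build the argument list from keyword options instead of repeating string literals. The sketch below does that using only flags listed under Command-Line Arguments. The helper names are hypothetical, and whether particular flag combinations (for example --since together with --incremental) are accepted is left to the CLI itself.

```python
import subprocess


def build_command(provider, since=None, incremental=False, query=None, download_dir=None):
    """Assemble an email-archiver argument list from keyword options."""
    if provider not in ("gmail", "m365"):
        raise ValueError(f"unsupported provider: {provider!r}")
    cmd = ["email-archiver", "--provider", provider]
    if since:
        cmd += ["--since", since]
    if incremental:
        cmd.append("--incremental")
    if query:
        cmd += ["--query", query]
    if download_dir:
        cmd += ["--download-dir", download_dir]
    return cmd


def run_archiver(**options):
    """Run the CLI and return (exit code, stdout, stderr) as captured text."""
    result = subprocess.run(build_command(**options), capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems with values such as search queries that contain spaces.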
Using uvx (no installation)
# Run directly without installing
uvx email-archiver --provider gmail --since 2024-12-01
# Works from any directory
uvx email-archiver --help
See EXAMPLES.md for 21 more examples!
Author & Credits
Author: Trung Le
Team: RealTimeX.ai
Repository: https://github.com/therealtimex/email-archiver
Built with ❤️ for secure email archiving