google-takeout-utils

CLI tools for searching, reading, and extracting emails from Google Takeout mbox exports. No conversion or import into an email client needed.

Quick start

No install needed — run directly from PyPI with uvx:

cd /path/to/your/takeout-folder
uvx google-takeout-utils@latest search-email --help

The @latest suffix ensures uvx always fetches the most recent version from PyPI.

Setup

  1. Download your data via Google Takeout
  2. Extract the .tgz archives into a folder
  3. cd into that folder — the tool expects this layout:
your-takeout-folder/
└── Takeout/
    └── Mail/
        └── All mail Including Spam and Trash.mbox

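On first run, a SQLite index is built automatically (about 2 minutes for an 8 GB mbox). After that, searches on indexed fields such as sender, recipient, subject, and date return almost instantly. Searches with --body are slower because they read message text from the mbox file.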
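
As a rough illustration of the layout check in step 3, the sketch below looks for the mbox file at the documented location under a given folder. The helper name find_mbox and the constant MBOX_RELATIVE are hypothetical and not taken from the tool's source; the tool's own auto-detection may work differently.

```python
from pathlib import Path
from typing import Optional

# Relative path taken from the folder layout shown above.
MBOX_RELATIVE = Path("Takeout") / "Mail" / "All mail Including Spam and Trash.mbox"


def find_mbox(folder: Path) -> Optional[Path]:
    """Return the mbox path inside `folder` if the expected layout exists."""
    candidate = folder / MBOX_RELATIVE
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_mbox(Path.cwd())
    print(found if found else "No Takeout/Mail mbox found in the current folder")
```

Running this from the wrong directory prints the fallback message, which is a quick way to confirm you are in the folder that contains Takeout/.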

Install (optional)

For repeated use, install permanently so search-email is always available:

pip install google-takeout-utils
# or
uv tool install google-takeout-utils

Usage

All examples use uvx. If installed globally, replace uvx google-takeout-utils@latest with just google-takeout-utils.

Search emails

# Search by sender (case-insensitive substring match on name or email)
uvx google-takeout-utils@latest search-email --from alice

# Date range + sender, limit results
uvx google-takeout-utils@latest search-email --after 2023-01-01 --before 2023-07-01 --from john --limit 20

# Search by recipient (searches To, CC, and BCC)
uvx google-takeout-utils@latest search-email --to alice@example.com

# Search by subject
uvx google-takeout-utils@latest search-email --subject "invoice" --limit 5

# Only emails with attachments
uvx google-takeout-utils@latest search-email --has-attachment --from bank --no-body

# Count matches
uvx google-takeout-utils@latest search-email --count --from newsletter

# Full-text body search (slower — seeks into mbox for each candidate)
uvx google-takeout-utils@latest search-email --body "project proposal" --limit 5

# Headers only, no body preview
uvx google-takeout-utils@latest search-email --from alice --no-body

Search results are sorted by date (newest first). Each result shows a database ID and an [A] marker if the email has attachments.

Read a single email

# Show full email by database ID (from search results)
uvx google-takeout-utils@latest search-email --show 4521

# As JSON (useful for piping to other tools or LLMs)
uvx google-takeout-utils@latest search-email --show 4521 --output json

# As YAML
uvx google-takeout-utils@latest search-email --show 4521 --output yaml

--show displays the complete body, To/CC/BCC recipients, and lists all attachments with their extract commands.
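
For scripting, the JSON output can be consumed from Python. The sketch below only uses flags documented here and assumes the command writes a single JSON document to stdout. The JSON field names are not documented in this section, so the example lists whatever top-level keys come back instead of guessing them. The function names are hypothetical.

```python
import json
import subprocess


def show_command(email_id: int) -> list[str]:
    """Build the documented --show invocation with JSON output."""
    return [
        "uvx", "google-takeout-utils@latest", "search-email",
        "--show", str(email_id), "--output", "json",
    ]


def load_email(email_id: int):
    """Run the CLI and parse its stdout (assumed to be one JSON document)."""
    result = subprocess.run(
        show_command(email_id), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    data = load_email(4521)  # 4521 is the example ID used above
    # Inspect the structure before relying on specific fields.
    print(sorted(data) if isinstance(data, dict) else type(data).__name__)
```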

View email threads

# Reconstruct the full thread containing email 4521
uvx google-takeout-utils@latest search-email --thread 4521

Shows an indented tree of all related emails with their database IDs, subjects (truncated to 70 chars), senders, and dates. The starting email is marked with <--.
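
The tree view can be pictured with a small sketch. It assumes each message record carries a database ID, the ID of the message it replies to, and a subject. These field names, the render_thread helper, and the exact line format are illustrative and are not the tool's actual output.

```python
from collections import defaultdict


def render_thread(messages, start_id, width=70):
    """Render messages as an indented tree, marking the starting email.

    messages: list of dicts with hypothetical keys "id", "parent", "subject".
    """
    ids = {m["id"] for m in messages}
    children = defaultdict(list)
    for m in messages:
        # Messages whose parent is not in the set are treated as roots.
        parent = m["parent"] if m["parent"] in ids else None
        children[parent].append(m)

    lines = []

    def walk(parent, depth):
        for m in children[parent]:
            marker = " <--" if m["id"] == start_id else ""
            subject = m["subject"][:width]  # truncate long subjects
            lines.append(f'{"  " * depth}[{m["id"]}] {subject}{marker}')
            walk(m["id"], depth + 1)

    walk(None, 0)
    return lines


if __name__ == "__main__":
    demo = [
        {"id": 1, "parent": None, "subject": "Hello"},
        {"id": 2, "parent": 1, "subject": "Re: Hello"},
        {"id": 3, "parent": 2, "subject": "Re: Hello"},
    ]
    print("\n".join(render_thread(demo, start_id=2)))
```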

Extract attachments

# Save first attachment of email 4521 to current directory
uvx google-takeout-utils@latest search-email --attachment 4521-1

# Save to a specific directory
uvx google-takeout-utils@latest search-email --attachment 4521-2 --output-dir /tmp

Use --show ID first to see available attachments and their index numbers.
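
The ID-N argument combines a database ID with a 1-based attachment number. A minimal sketch of parsing it and picking the matching attachment from a raw message with Python's standard email package follows. Both function names are hypothetical, and the tool may number or select attachments differently.

```python
from email import message_from_bytes, policy


def parse_attachment_spec(spec: str) -> tuple[int, int]:
    """Split 'ID-N' (e.g. '4521-1') into (email_id, attachment_number)."""
    email_id, _, number = spec.rpartition("-")
    if not email_id.isdigit() or not number.isdigit() or int(number) < 1:
        raise ValueError(f"expected ID-N such as 4521-1, got {spec!r}")
    return int(email_id), int(number)


def nth_attachment(raw_message: bytes, number: int):
    """Return (filename, content bytes) of the 1-based Nth attachment."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    attachments = list(msg.iter_attachments())
    part = attachments[number - 1]
    return part.get_filename(), part.get_payload(decode=True)
```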

Index management

# Force rebuild (e.g. after a new Google Takeout export)
uvx google-takeout-utils@latest search-email --re-index

How it works

On first run, the tool scans the entire mbox file and builds a SQLite index (Takeout/Mail/index.sqlite) containing date, sender, recipients (To/CC/BCC), subject, attachment flags, and threading information (Message-ID, In-Reply-To) for every email. Threads are precomputed using Union-Find on In-Reply-To chains.
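
To make the threading step concrete, here is a minimal Union-Find sketch that groups messages by following In-Reply-To links. It illustrates the general technique only. The function names and the (message_id, in_reply_to) input shape are assumptions, not the tool's code.

```python
def find(parent, x):
    """Return the root of x, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets containing a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def group_threads(messages):
    """messages: iterable of (message_id, in_reply_to or None) pairs."""
    parent = {}
    for message_id, in_reply_to in messages:
        parent.setdefault(message_id, message_id)
        if in_reply_to:
            # The replied-to message may be missing from the export;
            # it still serves as a link between its replies.
            parent.setdefault(in_reply_to, in_reply_to)
            union(parent, in_reply_to, message_id)
    threads = {}
    for message_id in parent:
        threads.setdefault(find(parent, message_id), []).append(message_id)
    return list(threads.values())
```

Because every reply is merged into the set of the message it answers, two replies to the same parent end up in one thread even if they never reference each other.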

After indexing, header searches query the SQLite database and return results almost instantly. Body text and attachments are fetched on demand by seeking to the message's byte offset in the mbox file, which is why --body searches are slower.
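
The offset-based lookup can be sketched as follows: one pass records where each message starts, and a later read seeks straight to that position. The table and column names are hypothetical. The separator detection is deliberately simplified (any line starting with "From " begins a message), which a real mbox parser would need to handle more carefully.

```python
import sqlite3


def build_index(mbox_path, db):
    """Record the byte offset and length of every message in the mbox."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS emails "
        "(id INTEGER PRIMARY KEY, offset INTEGER, length INTEGER)"
    )
    offsets = []
    pos = 0
    with open(mbox_path, "rb") as f:
        for line in f:
            if line.startswith(b"From "):  # simplified separator check
                offsets.append(pos)
            pos += len(line)
    offsets.append(pos)  # end of file closes the last message
    rows = [(start, end - start) for start, end in zip(offsets, offsets[1:])]
    db.executemany("INSERT INTO emails (offset, length) VALUES (?, ?)", rows)
    db.commit()


def read_raw(mbox_path, db, email_id):
    """Fetch one message's raw bytes without scanning the whole file."""
    offset, length = db.execute(
        "SELECT offset, length FROM emails WHERE id = ?", (email_id,)
    ).fetchone()
    with open(mbox_path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```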

The index is built automatically whenever it is missing (e.g. after extracting a fresh Takeout export into a new folder). If an index already exists, use --re-index to rebuild it.

Options reference

Search filters

Option               Description
--from TEXT          Case-insensitive substring match on From header (name or email)
--to TEXT            Case-insensitive substring match on To/CC/BCC headers
--subject TEXT       Case-insensitive substring match on Subject
--body TEXT          Case-insensitive substring match in body text (slower)
--after YYYY-MM-DD   Emails on or after this date (UTC, inclusive)
--before YYYY-MM-DD  Emails before this date (UTC, exclusive)
--has-attachment     Only emails with file attachments
--limit N            Max results (default: 10)
--count              Only print match count
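
The filter semantics above (case-insensitive substring match, --after inclusive, --before exclusive, newest first, default limit of 10) can be sketched as a SQLite query. The schema with sender and date columns and the function names are assumptions for illustration, not the tool's actual index layout.

```python
import sqlite3
from datetime import datetime, timezone


def day_start_utc(day: str) -> float:
    """Unix timestamp of 00:00 UTC on a YYYY-MM-DD date."""
    parsed = datetime.strptime(day, "%Y-%m-%d")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def search(db, sender=None, after=None, before=None, limit=10):
    """Return matching email IDs, newest first."""
    clauses, params = [], []
    if sender:
        # lower() in SQLite folds ASCII letters only.
        clauses.append("instr(lower(sender), lower(?)) > 0")
        params.append(sender)
    if after:
        clauses.append("date >= ?")  # inclusive lower bound
        params.append(day_start_utc(after))
    if before:
        clauses.append("date < ?")  # exclusive upper bound
        params.append(day_start_utc(before))
    where = " AND ".join(clauses) or "1"
    sql = f"SELECT id FROM emails WHERE {where} ORDER BY date DESC LIMIT ?"
    return [row[0] for row in db.execute(sql, params + [limit])]
```

With a half-open range, --after 2023-01-01 --before 2023-07-01 covers exactly the first six months of 2023, and adjacent ranges never count an email twice.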

Display

Option                   Description
--no-body                Omit body preview in search results
--output text|json|yaml  Output format (default: text)

Actions

Option             Description
--show ID          Show full email by database ID
--thread ID        Show full email thread as indented tree
--attachment ID-N  Extract attachment N from email ID (e.g. 4521-1)
--output-dir PATH  Directory for extracted attachments (default: cwd)

Index

Option       Description
--re-index   Force rebuild the SQLite index
--mbox PATH  Path to mbox file (auto-detected by default)

License

Apache 2.0 — see LICENSE.

Download files

Source Distribution

google_takeout_utils-0.2.0.tar.gz (16.8 kB)

Uploaded Source

Built Distribution

google_takeout_utils-0.2.0-py3-none-any.whl (16.3 kB)

Uploaded Python 3

File details

Details for the file google_takeout_utils-0.2.0.tar.gz.

File metadata

  • Download URL: google_takeout_utils-0.2.0.tar.gz
  • Size: 16.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for google_takeout_utils-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       d9a9cc3dd5c329361521ed7b1c164b186d211452554f2eb487e84ef73e89a418
MD5          d27b35d86fd22927b43689540a1ddef8
BLAKE2b-256  f0559218f5a0aeb577dec95a8b327750e6f18bbe810b36f7c1aec5b516f94033

Provenance

The following attestation bundles were made for google_takeout_utils-0.2.0.tar.gz:

Publisher: publish.yml on haraldschilly/google-takeout-utils

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file google_takeout_utils-0.2.0-py3-none-any.whl.

File hashes

Hashes for google_takeout_utils-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       75290dc3f7a53680895c2f9f601f9bee37aadf43cf2f54400bb31294b9ca45b3
MD5          0c213c680c34c52cbda36bed8e9a90a4
BLAKE2b-256  273651e5e4e0e772d3f8e9b7f882d3480a5a48a83e520c1f7fa8a8da75397e6a

Provenance

The following attestation bundles were made for google_takeout_utils-0.2.0-py3-none-any.whl:

Publisher: publish.yml on haraldschilly/google-takeout-utils
