google-takeout-utils
CLI tools for searching, reading, and extracting emails from Google Takeout mbox exports. No conversion or import into an email client needed.
Quick start
No install needed — run directly from PyPI with uvx:
cd /path/to/your/takeout-folder
uvx google-takeout-utils@latest search-email --help
The @latest suffix ensures uvx always fetches the most recent version from PyPI.
Setup
- Download your data via Google Takeout
- Extract the .tgz archives into a folder
- cd into that folder. The tool expects this layout:
your-takeout-folder/
└── Takeout/
    └── Mail/
        └── All mail Including Spam and Trash.mbox
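On first run, a SQLite index is built automatically (about 2 minutes for an 8 GB mbox). After that, searches on sender, recipient, subject, and date return almost instantly; only --body searches are slower, because they read from the mbox file.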
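To check the layout before the first run, a few lines of Python are enough. This is a minimal sketch and not part of the tool: the function name `find_mbox` is hypothetical, and it only assumes the `Takeout/Mail/*.mbox` layout shown above.

```python
from pathlib import Path


def find_mbox(takeout_root: Path) -> list[Path]:
    """Return the .mbox files under <takeout_root>/Takeout/Mail, sorted by name.

    Hypothetical helper: it mirrors the folder layout from the README,
    not the tool's own auto-detection logic.
    """
    mail_dir = takeout_root / "Takeout" / "Mail"
    if not mail_dir.is_dir():
        return []
    return sorted(mail_dir.glob("*.mbox"))


if __name__ == "__main__":
    found = find_mbox(Path.cwd())
    if found:
        for path in found:
            print(f"{path} ({path.stat().st_size / 1e9:.1f} GB)")
    else:
        print("No mbox found; expected Takeout/Mail/*.mbox below the current directory")
```

Run it from the same folder you would cd into for the tool; an empty result usually means the archives were extracted one level too deep or too shallow.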
Install (optional)
For repeated use, install permanently so search-email is always available:
pip install google-takeout-utils
# or
uv tool install google-takeout-utils
Usage
All examples use uvx. If installed globally, replace uvx google-takeout-utils@latest with just google-takeout-utils.
Search emails
# Search by sender (case-insensitive substring match on name or email)
uvx google-takeout-utils@latest search-email --from alice
# Date range + sender, limit results
uvx google-takeout-utils@latest search-email --after 2023-01-01 --before 2023-07-01 --from john --limit 20
# Search by recipient (searches To, CC, and BCC)
uvx google-takeout-utils@latest search-email --to alice@example.com
# Search by subject
uvx google-takeout-utils@latest search-email --subject "invoice" --limit 5
# Only emails with attachments
uvx google-takeout-utils@latest search-email --has-attachment --from bank --no-body
# Count matches
uvx google-takeout-utils@latest search-email --count --from newsletter
# Full-text body search (slower — seeks into mbox for each candidate)
uvx google-takeout-utils@latest search-email --body "project proposal" --limit 5
# Headers only, no body preview
uvx google-takeout-utils@latest search-email --from alice --no-body
Search results are sorted by date (newest first). Each result shows a database ID
and an [A] marker if the email has attachments.
Read a single email
# Show full email by database ID (from search results)
uvx google-takeout-utils@latest search-email --show 4521
# As JSON (useful for piping to other tools or LLMs)
uvx google-takeout-utils@latest search-email --show 4521 --output json
# As YAML
uvx google-takeout-utils@latest search-email --show 4521 --output yaml
--show displays the complete body, To/CC/BCC recipients, and lists all attachments with their extract commands.
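For scripting, the JSON output can be consumed from Python. The sketch below only uses the flags documented here (`--show`, `--output`); the helper names are hypothetical, and it assumes that stdout contains nothing but the JSON document. The structure of the JSON is deliberately not assumed, so inspect the parsed value before relying on specific keys.

```python
import json
import subprocess


def show_command(email_id: int, output: str = "json") -> list[str]:
    """Build the argument list for `search-email --show ID --output FORMAT`."""
    if output not in {"text", "json", "yaml"}:
        raise ValueError(f"unsupported output format: {output}")
    return [
        "uvx", "google-takeout-utils@latest", "search-email",
        "--show", str(email_id), "--output", output,
    ]


def show_email_json(email_id: int):
    """Run the CLI and parse its stdout as JSON.

    Assumption: stdout holds only the JSON document. Raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        show_command(email_id), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```

Calling `show_email_json(4521)` from the takeout folder should return the parsed document for that email, provided the assumptions above hold.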
View email threads
# Reconstruct the full thread containing email 4521
uvx google-takeout-utils@latest search-email --thread 4521
Shows an indented tree of all related emails with their database IDs, subjects (truncated to 70 chars), senders, and dates. The starting email is marked with <--.
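The tree display described above can be illustrated with a short sketch. It is not the tool's code: the input structure (a dict with `parent`, `subject`, `sender`, and `date` keys) and the exact line format are assumptions made for illustration. Only the 70-character subject limit and the `<--` marker come from the description.

```python
from collections import defaultdict

MAX_SUBJECT = 70  # truncation width stated in the README


def render_thread(emails: dict[int, dict], start_id: int) -> list[str]:
    """Render a reply tree as indented lines, marking start_id with '<--'.

    `emails` maps a database ID to a dict with the hypothetical keys
    'parent' (ID or None), 'subject', 'sender', and 'date'.
    """
    children = defaultdict(list)
    roots = []
    for email_id, mail in emails.items():
        parent = mail.get("parent")
        if parent in emails:
            children[parent].append(email_id)
        else:
            roots.append(email_id)  # no known parent: top of a tree

    lines = []

    def walk(email_id: int, depth: int) -> None:
        mail = emails[email_id]
        subject = mail["subject"][:MAX_SUBJECT]
        marker = " <--" if email_id == start_id else ""
        lines.append(
            f"{'  ' * depth}[{email_id}] {subject} | {mail['sender']} | {mail['date']}{marker}"
        )
        for child in sorted(children[email_id], key=lambda c: emails[c]["date"]):
            walk(child, depth + 1)

    for root in sorted(roots, key=lambda r: emails[r]["date"]):
        walk(root, 0)
    return lines
```

Replies are listed under their parent in date order, so the output reads top to bottom like the conversation itself.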
Extract attachments
# Save first attachment of email 4521 to current directory
uvx google-takeout-utils@latest search-email --attachment 4521-1
# Save to a specific directory
uvx google-takeout-utils@latest search-email --attachment 4521-2 --output-dir /tmp
Use --show ID first to see available attachments and their index numbers.
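The ID-N argument combines the database ID and the attachment index in one token. A minimal sketch of how such a value can be split is shown below; the function name is hypothetical, and rejecting index 0 is an assumption based on 4521-1 being described as the first attachment.

```python
import re

# One or more digits, a hyphen, one or more digits, e.g. "4521-1".
_SPEC = re.compile(r"(\d+)-(\d+)")


def parse_attachment_spec(spec: str) -> tuple[int, int]:
    """Split 'ID-N' into (email_id, attachment_index).

    Assumption: attachment indexes are 1-based, since the README calls
    4521-1 the first attachment of email 4521.
    """
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"expected ID-N (e.g. 4521-1), got {spec!r}")
    email_id, index = int(match.group(1)), int(match.group(2))
    if index < 1:
        raise ValueError("attachment index starts at 1")
    return email_id, index
```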
Index management
# Force rebuild (e.g. after a new Google Takeout export)
uvx google-takeout-utils@latest search-email --re-index
How it works
On first run, the tool scans the entire mbox file and builds a SQLite index
(Takeout/Mail/index.sqlite) containing date, sender, recipients (To/CC/BCC), subject,
attachment flags, and threading information (Message-ID, In-Reply-To) for every email.
Threads are precomputed using Union-Find on In-Reply-To chains.
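To illustrate the threading step, here is a minimal Union-Find sketch that groups messages connected through In-Reply-To. It is an illustration of the technique under simple assumptions, not the tool's implementation: the input is a list of hypothetical `(message_id, in_reply_to)` pairs, and a reply to a message that is missing from the export simply forms its own thread.

```python
from collections import defaultdict
from typing import Optional


class UnionFind:
    """Disjoint-set structure with path compression."""

    def __init__(self) -> None:
        self.parent: dict[str, str] = {}

    def find(self, x: str) -> str:
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a: str, b: str) -> None:
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def group_threads(messages: list[tuple[str, Optional[str]]]) -> dict[str, list[str]]:
    """Group Message-IDs into threads; keys are arbitrary representatives."""
    uf = UnionFind()
    for message_id, in_reply_to in messages:
        uf.find(message_id)  # register the message even if it has no reply link
        if in_reply_to:
            uf.union(in_reply_to, message_id)
    threads = defaultdict(list)
    for message_id, _ in messages:
        threads[uf.find(message_id)].append(message_id)
    return dict(threads)
```

Because every reply is merged into the set of the message it answers, a whole chain ends up in one set no matter in which order the messages appear in the mbox file.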
After indexing, all searches query the SQLite database and return results instantly. Body text and attachments are fetched on demand by seeking to the byte offset in the mbox file.
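The offset-based lookup can be sketched as follows. This is a simplified illustration, not the tool's code: the table and column names (`emails`, `byte_offset`, `byte_length`) are hypothetical, and it assumes that every line starting with "From " is a message separator, which holds only if such lines inside message bodies are escaped in the mbox file.

```python
import sqlite3


def build_offset_index(mbox_path: str, db: sqlite3.Connection) -> int:
    """Record the byte offset and length of every message; return the count."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS emails ("
        "id INTEGER PRIMARY KEY, byte_offset INTEGER, byte_length INTEGER)"
    )
    starts = []
    position = 0
    with open(mbox_path, "rb") as fh:
        for line in fh:  # binary mode: lines are split on b"\n"
            if line.startswith(b"From "):
                starts.append(position)
            position += len(line)
    ends = starts[1:] + [position]  # each message ends where the next begins
    db.executemany(
        "INSERT INTO emails (byte_offset, byte_length) VALUES (?, ?)",
        [(start, end - start) for start, end in zip(starts, ends)],
    )
    db.commit()
    return len(starts)


def read_raw_message(mbox_path: str, db: sqlite3.Connection, email_id: int) -> bytes:
    """Fetch one message by seeking directly to its stored offset."""
    row = db.execute(
        "SELECT byte_offset, byte_length FROM emails WHERE id = ?", (email_id,)
    ).fetchone()
    if row is None:
        raise KeyError(email_id)
    offset, length = row
    with open(mbox_path, "rb") as fh:
        fh.seek(offset)
        return fh.read(length)
```

The design point is that the large file is read in full only once. Afterwards, fetching a body costs one indexed SQLite lookup plus one seek, independent of the size of the mbox file.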
The index is rebuilt automatically when missing (e.g. after a fresh Takeout import).
Options reference
Search filters
| Option | Description |
|---|---|
| --from TEXT | Case-insensitive substring match on From header (name or email) |
| --to TEXT | Case-insensitive substring match on To/CC/BCC headers |
| --subject TEXT | Case-insensitive substring match on Subject |
| --body TEXT | Case-insensitive substring match in body text (slower) |
| --after YYYY-MM-DD | Emails on or after this date (UTC, inclusive) |
| --before YYYY-MM-DD | Emails before this date (UTC, exclusive) |
| --has-attachment | Only emails with file attachments |
| --limit N | Max results (default: 10) |
| --count | Only print match count |
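The filter semantics (case-insensitive substring match, inclusive --after, exclusive --before, newest first, default limit of 10) can be expressed as a SQLite query. The sketch below is an assumption-laden illustration rather than the tool's query: the `emails` table with `sender` and `date_utc` columns is hypothetical, and it relies on SQLite's `LIKE` being case-insensitive for ASCII characters by default.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def utc_midnight(day: str) -> float:
    """Convert YYYY-MM-DD to the Unix timestamp of 00:00 UTC on that day."""
    return datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc).timestamp()


def search(
    db: sqlite3.Connection,
    sender: Optional[str] = None,
    after: Optional[str] = None,
    before: Optional[str] = None,
    limit: int = 10,
) -> list[int]:
    """Return matching IDs, newest first.

    Assumed schema: emails(id INTEGER, sender TEXT, date_utc REAL).
    Note: '%' and '_' in `sender` act as LIKE wildcards in this sketch.
    """
    clauses, params = [], []
    if sender:
        clauses.append("sender LIKE ?")  # substring match, ASCII case-insensitive
        params.append(f"%{sender}%")
    if after:
        clauses.append("date_utc >= ?")  # inclusive lower bound
        params.append(utc_midnight(after))
    if before:
        clauses.append("date_utc < ?")  # exclusive upper bound
        params.append(utc_midnight(before))
    where = f"WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = f"SELECT id FROM emails {where} ORDER BY date_utc DESC LIMIT ?"
    return [row[0] for row in db.execute(sql, [*params, limit])]
```

With these bounds, `--after 2023-01-01 --before 2023-07-01` covers exactly the first half of 2023: an email dated 2023-07-01 00:00 UTC is excluded, while one from 2023-01-01 00:00 UTC is included.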
Display
| Option | Description |
|---|---|
| --no-body | Omit body preview in search results |
| --output text\|json\|yaml | Output format (default: text) |
Actions
| Option | Description |
|---|---|
| --show ID | Show full email by database ID |
| --thread ID | Show full email thread as indented tree |
| --attachment ID-N | Extract attachment N from email ID (e.g. 4521-1) |
| --output-dir PATH | Directory for extracted attachments (default: cwd) |
Index
| Option | Description |
|---|---|
| --re-index | Force rebuild the SQLite index |
| --mbox PATH | Path to mbox file (auto-detected by default) |
License
Apache 2.0 — see LICENSE.