
Tools for working with the Jeffrey Epstein documents released in November 2025.


Color Highlighted Epstein Emails and Text Messages


Usage

Installation

  1. You need a local copy of the OCR text files from the House Oversight document release in a directory such as /path/to/epstein/ocr_txt_files. You can download those OCR text files from the Congressional Google Drive folder (make sure you grab both the 001/ and 002/ folders).
  2. Running poetry install is the easiest way to install. pip install epstein-files should also work, though pipx install epstein-files is usually better.
  3. (Optional) If you want to work with the documents released by the DOJ on January 30th, 2026, you'll also need to download the PDF collections from the DOJ site (they're in the "Epstein Files Transparency Act" section) and OCR them or find another way to get the OCR text.

Command Line Tools

Set the EPSTEIN_DOCS_DIR environment variable to the path of the folder of files you just downloaded. You can either create a .env file modeled on .env.example (which sets it permanently) or set it on the command line when you run a tool:

EPSTEIN_DOCS_DIR=/path/to/epstein/ocr_txt_files epstein_generate --help

To work with the January 2026 DOJ documents you'll also need to set the EPSTEIN_DOJ_TXTS_20260130_DIR env var to point at the folders of OCR-extracted text from the raw DOJ PDFs. If you have the PDFs but not the text files, there's a script that can help you take care of that.

EPSTEIN_DOCS_DIR=/path/to/epstein/ocr_txt_files EPSTEIN_DOJ_TXTS_20260130_DIR=/path/to/doj/files epstein_generate --help

All the tools that come with the package require EPSTEIN_DOCS_DIR to be set. These are the available tools:
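Since every tool depends on EPSTEIN_DOCS_DIR, it can help to check the variables before launching anything. The following is a minimal sketch, not part of the package: check_docs_env is a hypothetical helper, and the only names taken from this README are the two environment variables.

```python
import os
from pathlib import Path

REQUIRED_VAR = "EPSTEIN_DOCS_DIR"
OPTIONAL_VAR = "EPSTEIN_DOJ_TXTS_20260130_DIR"


def check_docs_env(env=None):
    """Return a list of problems found with the document directory variables.

    Hypothetical helper (not part of epstein-files). An empty list means the
    required variable is set and every configured path is an existing directory.
    """
    env = os.environ if env is None else env
    problems = []

    docs_dir = env.get(REQUIRED_VAR)
    if not docs_dir:
        problems.append(f"{REQUIRED_VAR} is not set")
    elif not Path(docs_dir).is_dir():
        problems.append(f"{REQUIRED_VAR} is not a directory: {docs_dir}")

    # The DOJ variable is only needed for the January 2026 documents.
    doj_dir = env.get(OPTIONAL_VAR)
    if doj_dir and not Path(doj_dir).is_dir():
        problems.append(f"{OPTIONAL_VAR} is not a directory: {doj_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_docs_env():
        print(problem)
```

Run it in the same shell (or with the same .env values exported) that you plan to use for epstein_generate. No output means both paths look usable.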

# Generate color highlighted texts/emails/other files
epstein_generate

# Search for a string:
epstein_grep Bannon
# Or a regex:
epstein_grep '\bSteve\s*Bannon|Jeffrey\s*Epstein\b'

# Show a file with color highlighting of keywords:
epstein_show 030999
# Show both the highlighted and raw versions of the file:
epstein_show --raw 030999
# The full filename is also accepted:
epstein_show HOUSE_OVERSIGHT_030999

# Count words used by Epstein and Bannon
epstein_word_count --name 'Jeffrey Epstein' --name 'Steve Bannon'

# Diff two epstein files after all the cleanup (stripping BOMs, matching newline chars, etc):
epstein_diff 030999 020442
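To see what the regex in the epstein_grep example matches, here is a small sketch using Python's re module. It assumes epstein_grep interprets patterns the way re does, which this README does not state, and matching_lines is a hypothetical helper rather than anything shipped with the package.

```python
import re

# The same pattern passed to epstein_grep in the example above.
PATTERN = re.compile(r"\bSteve\s*Bannon|Jeffrey\s*Epstein\b")


def matching_lines(text, pattern=PATTERN):
    """Return (line_number, line) pairs for the lines that match the pattern."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


sample = "From: SteveBannon\nTo: Jeffrey  Epstein\nSubject: lunch\n"
for number, line in matching_lines(sample):
    print(f"{number}: {line}")
```

Because | has the lowest precedence, the leading \b applies only to the Steve Bannon alternative and the trailing \b only to the Jeffrey Epstein one. Wrap the alternation in a group, as in \b(?:Steve\s*Bannon|Jeffrey\s*Epstein)\b, if you want both boundaries on both names.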

The first time you run anything it will take a few minutes to fix all the janky OCR text, attribute the redacted emails, etc. After that things will be quick.

The commands used to build the various sites that are deployed on GitHub Pages can be found in deploy.sh.

Run epstein_generate --help for command line option assistance.

Optional: There are a handful of emails that I extracted from the legal filings they were contained in. If you want to include these files in your local analysis you'll need to copy those files from the repo into your local document directory. Something like:

cp ./emails_extracted_from_legal_filings/*.txt "$EPSTEIN_DOCS_DIR"
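If you are on a system without cp, the same copy can be sketched in Python. This is an illustrative equivalent, not a script from the repo; copy_extracted_emails is a hypothetical name, and the source folder is the one used in the cp command above.

```python
import os
import shutil
from pathlib import Path


def copy_extracted_emails(source_dir, docs_dir):
    """Copy every .txt file in source_dir into docs_dir and return the new paths."""
    destination = Path(docs_dir)
    copied = []
    for txt_file in sorted(Path(source_dir).glob("*.txt")):
        # shutil.copy returns the path of the newly created file.
        copied.append(Path(shutil.copy(txt_file, destination / txt_file.name)))
    return copied


if __name__ == "__main__":
    docs_dir = os.environ.get("EPSTEIN_DOCS_DIR")
    if docs_dir:
        source = "./emails_extracted_from_legal_filings"
        for path in copy_extracted_emails(source, docs_dir):
            print(path)
```

Run it from the root of the repo checkout so the relative source path resolves.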

As A Library

from epstein_files.epstein_files import EpsteinFiles
epstein_files = EpsteinFiles.get_files()

# All files (do_stuff is a placeholder for your own function)
for document in epstein_files.all_documents():
    do_stuff(document)

# Emails
for email in epstein_files.emails:
    do_stuff(email)

# iMessage Logs
for imessage_log in epstein_files.imessage_logs:
    do_stuff(imessage_log)

# JSON files
for json_file in epstein_files.json_files:
    do_stuff(json_file)

# Other Files
for file in epstein_files.other_files:
    do_stuff(file)
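As one possible do_stuff, the sketch below counts how many documents fall into each collection. It relies only on the four attribute names used in the loops above and assumes each one is iterable. count_by_type is a hypothetical helper, and the stand-in object exists only so the example runs without the real document set.

```python
from types import SimpleNamespace

# Collection attributes taken from the loops above.
COLLECTIONS = ("emails", "imessage_logs", "json_files", "other_files")


def count_by_type(files):
    """Return {collection name: number of documents} for an EpsteinFiles-like object."""
    return {name: len(list(getattr(files, name))) for name in COLLECTIONS}


# Stand-in object so the sketch runs without the real document set.
fake_files = SimpleNamespace(
    emails=["email 1", "email 2"],
    imessage_logs=["log 1"],
    json_files=[],
    other_files=["file 1"],
)
print(count_by_type(fake_files))
```

With the real library you would pass the object returned by EpsteinFiles.get_files() instead of fake_files.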

Everyone Who Sent or Received an Email in the November Document Dump

[Image: emails]

TODO List

See TODO.md.
