Converts TLDR newsletters to a two-voice podcast MP3 using Gemini

🎙️ tldr-podcast

Turn your TLDR newsletters into a listenable two-voice podcast, automatically.

Fetches any combination of TLDR topic newsletters, LLM-scores the articles, and generates a scripted dialogue + audio via Gemini AI. No email account, no subscription, no API beyond Gemini.



✨ Features

| Feature | What it does |
| --- | --- |
| 🌐 Zero-config fetching | Pulls newsletters straight from tldr.tech: no inbox, no scraping your mail |
| 🧠 Smart curation | An LLM interest-scores every article 1–10 before scraping; only the best survive |
| 🗣️ Two-voice dialogue | Configurable speaker names, Gemini voices, personalities, and language |
| 🎭 Expressive delivery | Inline audio tags ([laughs], [short pause], [enthusiasm]) on Gemini 3.x TTS; graceful fallback on older models |
| 🗂️ Per-run reports | Overview, full article list, script, and extracted links (repos · papers · models) |
| 🕵️ Stealth browser fallback | Optional CloakBrowser (Playwright stealth Chromium) re-renders pages that block trafilatura |
| 🔁 Self-upgrading config | Versioned schema; missing keys are added in place, old file kept as .bak |
| 💸 Cost tracking | Live token usage and a USD estimate at the end of every run |
| 🔇 Dry & no-audio modes | Preview the script without ever calling TTS |
| 🎚️ Flexible output | MP3 or WAV, custom output directory |
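
As a rough illustration of the smart-curation step, the sketch below keeps only the highest-scoring articles. The ScoredArticle record, the min_score threshold of 7, and the limit of 5 are assumptions made for this example, not values taken from tldr-podcast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredArticle:
    """Hypothetical record: an article title plus its LLM interest score."""

    title: str
    score: int  # 1 (skip) to 10 (must-read)


def keep_best(
    articles: list[ScoredArticle], min_score: int = 7, limit: int = 5
) -> list[ScoredArticle]:
    """Keep articles scoring at least min_score, best first, capped at limit."""
    survivors = [a for a in articles if a.score >= min_score]
    # list.sort() is stable, so equally scored articles keep newsletter order.
    survivors.sort(key=lambda a: a.score, reverse=True)
    return survivors[:limit]
```

Filtering before scraping means full-text extraction only runs for articles that will actually reach the script.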

📦 Installation

Requires Python 3.13+ and ffmpeg.

# macOS
brew install ffmpeg

# Ubuntu / Debian
sudo apt-get install -y ffmpeg

From PyPI (no clone needed)

# uv (recommended)
uv tool install tldr-podcast

# uvx: run once, install nothing permanently
uvx tldr-podcast run -t ai --no-interactive

# pipx
pipx install tldr-podcast

# pip (inside an active venv)
pip install tldr-podcast

To install the unreleased main branch instead of the latest PyPI release, replace tldr-podcast with git+https://github.com/obeone/tldr-podcast in the commands above.

From a local clone
git clone https://github.com/obeone/tldr-podcast
cd tldr-podcast

uv tool install .                 # install as a CLI tool
# or, for development:
uv sync && uv pip install -e .    # editable install

Optional: stealth browser fallback (CloakBrowser)

The optional cloak extra adds a Playwright-based stealth Chromium (CloakBrowser) that re-renders pages which block trafilatura; the ~200 MB browser binary downloads automatically at first runtime use (not at install time).

# uv tool, from PyPI
uv tool install "tldr-podcast[cloak]"

# pipx, from PyPI
pipx install "tldr-podcast[cloak]"

# pip (inside an active venv), from PyPI
pip install "tldr-podcast[cloak]"

# from a local clone
uv tool install ".[cloak]"          # as a CLI tool
uv sync --extra cloak               # for development

Already installed without it? Re-run the matching command above with the [cloak] extra to add the fallback.


🚀 Quick start

# 1. Create your config interactively
tldr-podcast config init

# 2. Export your Gemini API key
export GEMINI_API_KEY="your-key"

# 3. Pick topics interactively and generate
tldr-podcast run

# …or go straight to it
tldr-podcast run -t ai,devops --no-interactive

📰 Topics

13 TLDR newsletters, mix and match freely:

| Slug | Newsletter | Slug | Newsletter |
| --- | --- | --- | --- |
| ai | AI | design | Design |
| infosec | Information Security | product | Product |
| devops | DevOps | marketing | Marketing |
| tech | Tech (the flagship) | data | Data |
| crypto | Crypto | fintech | Fintech |
| founders | Founders | dev | Web Dev |
| it | Information Technology | | |
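
Each slug maps to a dated page on tldr.tech. The following sketch builds that address from a slug and a date; the https scheme, the TOPICS set name, and the newsletter_url helper are assumptions for illustration and may differ from what web_source.py does.

```python
from datetime import date

# The 13 slugs listed in the table above.
TOPICS = {
    "ai", "design", "infosec", "product", "devops", "marketing", "tech",
    "data", "crypto", "fintech", "founders", "dev", "it",
}


def newsletter_url(topic: str, day: date) -> str:
    """Build a tldr.tech/<topic>/<date> address (https scheme assumed)."""
    if topic not in TOPICS:
        raise ValueError(f"unknown topic slug: {topic!r}")
    # date.isoformat() yields YYYY-MM-DD, the format used by the -d flag.
    return f"https://tldr.tech/{topic}/{day.isoformat()}"
```

For example, newsletter_url("ai", date(2026, 4, 6)) gives https://tldr.tech/ai/2026-04-06.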

⚙️ Configuration

The wizard covers every option and writes a ready-to-use config.yaml:

tldr-podcast config init

Default path: $XDG_CONFIG_HOME/tldr/config.yaml (falls back to ~/.config/tldr/config.yaml).

A minimal config looks like this:

config_version: 6

web:
  default_topics: [ ai, infosec, devops ]

gemini:
  api_key_env: GEMINI_API_KEY        # name of the env var, never the key itself
  text_model: gemini-2.0-flash
  tts_model: gemini-2.5-flash-preview-tts
  language: French
  speaker1: { name: Alex,   voice: Puck,   personality: "enthusiastic, curious" }
  speaker2: { name: Jordan, voice: Charon, personality: "analytical, skeptical" }

output:
  dir: "."
  format: mp3

🔐 The only required secret is GEMINI_API_KEY. Secrets are never written to the file: keys ending in _env hold the name of the environment variable read at runtime.

For fine-grained tuning (TTS pace, dialogue style, service tiers, per-model pricing…), every key is documented inline in config.example.yaml.

Stealth browser fallback (optional)

When trafilatura fails to fetch or extract an article (bot-detection, JS-rendered pages, etc.), the scraper can fall back to CloakBrowser, a Playwright-based stealth Chromium that bypasses most bot-detection measures.

Install the optional cloak extra (see Installation).

Config key (scraping.cloak_fallback):

| Value | Behaviour |
| --- | --- |
| auto (default) | Use the fallback when the cloakbrowser package is importable |
| on | Require the fallback; warns and degrades to newsletter summaries if not installed |
| off | Never use the browser fallback |

scraping:
  cloak_fallback: auto   # auto | on | off
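
The three modes can be read as a small decision function. This is a sketch under assumptions: use_cloak_fallback and its package parameter are hypothetical, and the importability check uses importlib.util.find_spec, which may not be how the scraper detects the package.

```python
import importlib.util
import warnings


def use_cloak_fallback(mode: str, package: str = "cloakbrowser") -> bool:
    """Decide whether the browser fallback should run for a given mode."""
    if mode not in {"auto", "on", "off"}:
        raise ValueError(f"cloak_fallback must be auto, on, or off, got {mode!r}")
    if mode == "off":
        return False
    installed = importlib.util.find_spec(package) is not None
    if mode == "on" and not installed:
        # 'on' asks for the fallback, so its absence deserves a warning
        # before degrading to newsletter summaries.
        warnings.warn(f"{package} is not installed; using newsletter summaries")
    return installed
```

In auto mode a missing package is silent, which is what lets the extra stay optional.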

After navigation, the fallback automatically waits up to 35 seconds for any Cloudflare Turnstile challenge to resolve before reading the page.

At most 2 stealth-browser sessions run concurrently to avoid memory exhaustion; trafilatura workers are unaffected.
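
One common way to enforce such a cap is a bounded semaphore shared by all workers. The sketch below shows the idea with hypothetical names (render_with_browser, render_all, and the render callable); it does not claim to mirror the project's internals.

```python
import threading
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor

MAX_BROWSER_SESSIONS = 2  # the limit stated above
_browser_slots = threading.BoundedSemaphore(MAX_BROWSER_SESSIONS)


def render_with_browser(url: str, render: Callable[[str], str]) -> str:
    """Run render(url) while holding one of the two browser slots."""
    with _browser_slots:  # blocks while two sessions are already running
        return render(url)


def render_all(urls: list[str], render: Callable[[str], str], workers: int = 8) -> list[str]:
    """Render many URLs; the pool may be large, the semaphore still caps browsers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: render_with_browser(u, render), urls))
```

Because only the browser call sits inside the semaphore, other workers in the same pool are not throttled by it.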

Known limitation: heavily fortified sites using enterprise bot management (e.g. g2.com) may still be blocked. The fallback handles standard Cloudflare challenges, not every anti-bot system.

To inspect the active configuration:

tldr-podcast config show              # raw config
tldr-podcast config show --resolve    # env vars resolved, secrets masked

🖥️ CLI reference

| Command | Description |
| --- | --- |
| tldr-podcast run | Interactive topic picker → generate podcast |
| tldr-podcast run -t ai,devops | Explicit topics, skip the prompt |
| tldr-podcast run -t ai --no-interactive | Non-interactive, use config defaults if no -t |
| tldr-podcast run -d 2026-04-06 | Target a specific date |
| tldr-podcast run -t ai -n | Dry-run: print dialogue, skip TTS |
| tldr-podcast run -t ai -A | Generate script + report, skip TTS and audio |
| tldr-podcast run -R | Disable report generation |
| tldr-podcast run -o ./podcasts | Custom output directory |
| tldr-podcast config init | Interactive configuration wizard |
| tldr-podcast config show | Display current config |
| tldr-podcast completions SHELL | Print completion script (bash/zsh/fish) |
| tldr-podcast --version | Print the installed version and exit |

Short flags: -c config · -t topics · -d date · -o output-dir · -n dry-run · -A no-audio · -v verbose · -r/-R report/no-report · -h help

Output naming

Topics are sorted alphabetically and joined with the date:

ai-devops-2026-04-17.mp3
ai-devops-2026-04-17/
├── overview.md
├── articles.md
├── script.md
└── summary.md
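
The naming rule is simple enough to sketch directly. The output_stem helper is a hypothetical name, and dropping duplicate topics is an assumption of this example.

```python
from datetime import date


def output_stem(topics: list[str], day: date) -> str:
    """Join alphabetically sorted topic slugs and the ISO date with hyphens."""
    # set() drops duplicates (an assumption); sorted() gives alphabetical order.
    return "-".join([*sorted(set(topics)), day.isoformat()])
```

Sorting makes the name independent of the order in which topics are passed to -t, so ai,devops and devops,ai write to the same files.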

🐚 Shell completions

Generate and install a completion script for your shell. Write the script to a file; do not pipe it into eval.

bash · zsh · fish

# Bash: user completion directory (auto-sourced by bash-completion)
mkdir -p ~/.local/share/bash-completion/completions
tldr-podcast completions bash > ~/.local/share/bash-completion/completions/tldr-podcast

# Zsh: a directory on $fpath
mkdir -p ~/.zsh/completions
tldr-podcast completions zsh > ~/.zsh/completions/_tldr-podcast
# ensure ~/.zshrc contains:
#   fpath=(~/.zsh/completions $fpath)
#   autoload -Uz compinit && compinit

# Fish: auto-sourced on next shell start
tldr-podcast completions fish > ~/.config/fish/completions/tldr-podcast.fish

🧪 Tests

uv run pytest tests/ -v

All external APIs (Gemini, HTTP) are mocked. A real captured TLDR HTML page in tests/fixtures/ drives realistic parse validation.


🧭 Pipeline

flowchart TB
    IN["🌐 tldr.tech/<topic>/<date>"]

    subgraph SRC["① Source"]
        WEB["Web Source<br/>BeautifulSoup · sponsor filter · dedup"]
    end

    subgraph CUR["② Curation"]
        RANK["Interest Ranking<br/>LLM scores 1–10"]
        WS["Web Scraper<br/>trafilatura full-text"]
    end

    subgraph GEN["③ Generation"]
        LLM["Script Writer<br/>Gemini Flash"]
        DC["Dialogue chunks<br/>≤ 3 000 bytes"]
        TTS["TTS Generator<br/>Gemini multi-speaker"]
    end

    subgraph OUT["④ Output"]
        AE["Audio Exporter<br/>pydub + ffmpeg"]
        RPT["📊 Report Generator"]
    end

    IN --> WEB --> RANK --> WS
    WS --> LE["Link Extractor<br/>repos · models · papers"]
    WS --> LLM --> DC
    DC --> TTS --> AE --> MP3["🎙️ .mp3 / .wav"]
    DC --> RPT
    LE --> RPT --> FILES["📂 overview · articles · script · links"]
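
The dialogue-chunk stage caps each TTS request at 3 000 bytes. A greedy packer is one plausible way to do that; the sketch below measures size in UTF-8 bytes and never splits a line. The chunk_dialogue name, the newline joiner, and the error on oversized lines are all assumptions of this example.

```python
MAX_CHUNK_BYTES = 3000  # the limit shown in the flowchart


def chunk_dialogue(lines: list[str], max_bytes: int = MAX_CHUNK_BYTES) -> list[str]:
    """Greedily pack whole dialogue lines into chunks of at most max_bytes."""
    chunks: list[str] = []
    current: list[str] = []
    current_size = 0
    for line in lines:
        size = len(line.encode("utf-8"))  # bytes, not characters
        if size > max_bytes:
            raise ValueError("a single dialogue line exceeds the chunk limit")
        # One extra byte for the newline joining this line to the previous one.
        extra = size + (1 if current else 0)
        if current and current_size + extra > max_bytes:
            chunks.append("\n".join(current))
            current, current_size = [line], size
        else:
            current.append(line)
            current_size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Counting bytes rather than characters matters for languages such as French, where accented letters take two bytes each in UTF-8.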

🏷️ Releasing

Releases are automated by .github/workflows/publish.yml. Bumping version in pyproject.toml and pushing to main runs, in order:

  1. Tests: uv run pytest must pass; a red suite blocks the release.
  2. PyPI publish: built with uv build and uploaded via Trusted Publishing (OIDC, no stored token). Requires a publisher configured on PyPI for project tldr-podcast, repository obeone/tldr-podcast, workflow publish.yml, environment pypi.
  3. GitHub release: a v<version> tag plus a release with auto-generated notes.

Dependency-only edits to pyproject.toml are ignored (the version value must actually change), and an already-released version is skipped, so re-runs and unrelated edits are safe no-ops.
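
The two guards described above reduce to a short predicate. This is an illustrative sketch only: should_release and its arguments are hypothetical, and the actual workflow implements its checks in publish.yml rather than in Python.

```python
def should_release(
    previous_version: str, current_version: str, existing_tags: set[str]
) -> bool:
    """Release only when the version changed and its v<version> tag is new."""
    if current_version == previous_version:
        # Dependency-only or unrelated pyproject.toml edits land here.
        return False
    # An existing tag means this version was already released: safe no-op.
    return f"v{current_version}" not in existing_tags
```

For instance, should_release("1.7.2", "1.7.3", {"v1.7.2"}) is True, while the same call with "v1.7.3" already among the tags is False.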


🗂️ Project structure

tldr-podcast/
├── config.example.yaml         # Fully documented configuration template
├── pyproject.toml
├── src/tldr/
│   ├── cli.py                  # Click CLI (run · config · completions)
│   ├── config.py               # YAML loader with *_env resolution
│   ├── config_migrations.py    # Versioned schema + in-place auto-upgrade
│   ├── models.py               # Shared Article dataclass
│   ├── web_source.py           # tldr.tech fetcher + parser
│   ├── web_scraper.py          # trafilatura full-text scraper
│   ├── link_extractor.py       # URL extraction and categorisation
│   ├── llm_summarizer.py       # Interest ranking + dialogue generation
│   ├── tts_generator.py        # Gemini multi-speaker TTS
│   ├── audio_exporter.py       # PCM → MP3/WAV via pydub
│   ├── report_generator.py     # Timestamped report folder output
│   ├── token_tracker.py        # Token usage and cost tracking
│   └── retry.py                # Retry with exponential backoff
└── tests/
    ├── fixtures/               # Real captured HTML for parse tests
    └── …                       # pytest unit tests (all APIs mocked)
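
To give a feel for what an in-place auto-upgrade of a versioned config involves, here is a minimal sketch that adds missing keys without overwriting user values. The function names and the defaults argument are assumptions, and file handling (including the .bak backup) is deliberately left out; config_migrations.py may work quite differently.

```python
import copy
from typing import Any

CURRENT_VERSION = 6  # the config_version shown in the minimal config


def add_missing_keys(config: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of config with keys from defaults filled in recursively."""
    merged = copy.deepcopy(config)
    for key, default in defaults.items():
        if key not in merged:
            merged[key] = copy.deepcopy(default)
        elif isinstance(default, dict) and isinstance(merged[key], dict):
            # Descend into sections; existing user values are never replaced.
            merged[key] = add_missing_keys(merged[key], default)
    return merged


def upgrade(config: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
    """Fill in missing keys and stamp the result with the current version."""
    upgraded = add_missing_keys(config, defaults)
    upgraded["config_version"] = CURRENT_VERSION
    return upgraded
```

Working on a copy keeps the original data intact, which is the in-memory counterpart of keeping the old file as a backup.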

📜 Changelog

| Version | Highlights |
| --- | --- |
| 1.7.3 | Tag-based release detection (robust to rebase/squash merges) + skip-existing on publish; MIT license; PyPI project page (README long description, author, project URLs); install docs use PyPI instead of GitHub |
| 1.7.1 | CI: install ffmpeg in the release test gate; skip ffmpeg-dependent audio-exporter tests when ffmpeg is absent |
| 1.7.0 | Optional CloakBrowser stealth-browser fallback (scraping.cloak_fallback: auto\|on\|off); config schema v4 |
| 1.6.x | Dependency security bumps; trafilatura 2.0 scraper user-agent fix |
| 1.5.0 | --version flag on the top-level group |
| 1.4.0 | Audio-tag support for Gemini 3.x Flash TTS; versioned config schema (config_version) with in-place auto-upgrade + backup |
| 1.3.0 | Shell completion support (completions bash\|zsh\|fish) |
| 1.2.0 | Numbered topic recap in the conclusion; --no-audio flag; out-of-order TTS progress bar |
| 1.0.0 | Breaking: switched from IMAP/email to direct web scraping of tldr.tech. No account or credentials needed; removed -e/--eml, -s/--status and the imap: config section |

📄 License

MIT © Grégoire Compagnon


Made with 🎧 by Grégoire Compagnon (obeone@obeone.org)
