Converts TLDR newsletters to a two-voice podcast MP3 using Gemini

Maintainers

obeone

Project description

🎙️ tldr-podcast

Turn your TLDR newsletters into a listenable two-voice podcast — automatically.

Fetches any combination of TLDR topic newsletters, LLM-scores the articles, and generates a scripted dialogue + audio via Gemini AI. No email account, no subscription, no API beyond Gemini.

✨ Features

	Feature	What it does
🌐	Zero-config fetching	Pulls newsletters straight from `tldr.tech` — no inbox, no scraping your mail
🧠	Smart curation	An LLM interest-scores every article 1–10 before scraping; only the best survive
🗣️	Two-voice dialogue	Configurable speaker names, Gemini voices, personalities, and language
🎭	Expressive delivery	Inline audio tags (`[laughs]`, `[short pause]`, `[enthusiasm]`) on Gemini 3.x TTS; graceful fallback on older models
🗂️	Per-run reports	Overview, full article list, script, and extracted links (repos · papers · models)
🕵️	Stealth browser fallback	Optional CloakBrowser (Playwright stealth Chromium) re-renders pages that block trafilatura
🔁	Self-upgrading config	Versioned schema; missing keys are added in place, old file kept as `.bak`
💸	Cost tracking	Live token usage and a USD estimate at the end of every run
🔇	Dry & no-audio modes	Preview the script without ever calling TTS
🎚️	Flexible output	MP3 or WAV, custom output directory
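The "self-upgrading config" row can be illustrated with a small merge helper. This is only a sketch under assumptions: `add_missing_keys` is a hypothetical name, not the project's actual migration code, and the version bump and the `.bak` backup are left out.

```python
import copy


def add_missing_keys(config: dict, defaults: dict) -> dict:
    """Return a copy of config with keys it lacks filled in from defaults."""
    merged = copy.deepcopy(config)
    for key, default in defaults.items():
        if key not in merged:
            # Key is absent: take the default value as-is.
            merged[key] = copy.deepcopy(default)
        elif isinstance(merged[key], dict) and isinstance(default, dict):
            # Both sides are mappings: recurse so nested keys are filled too.
            merged[key] = add_missing_keys(merged[key], default)
    return merged
```

Values the user already set are never overwritten; only absent keys are added, which matches the "missing keys are added in place" behaviour described in the table.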

📦 Installation

Requires Python 3.13+ and ffmpeg.

# macOS
brew install ffmpeg

# Ubuntu / Debian
sudo apt-get install -y ffmpeg

From PyPI — no clone needed

# uv (recommended)
uv tool install tldr-podcast

# uvx — run once, install nothing permanently
uvx tldr-podcast run -t ai --no-interactive

# pipx
pipx install tldr-podcast

# pip (inside an active venv)
pip install tldr-podcast

To install the unreleased main branch instead of the latest PyPI release, replace tldr-podcast with git+https://github.com/obeone/tldr-podcast in any of the commands above.

From a local clone

git clone https://github.com/obeone/tldr-podcast
cd tldr-podcast

uv tool install .                 # install as a CLI tool
# or, for development:
uv sync && uv pip install -e .    # editable install

Optional — stealth browser fallback (CloakBrowser)

The optional cloak extra adds a Playwright-based stealth Chromium (CloakBrowser) that re-renders pages that block trafilatura. The ~200 MB browser binary is downloaded automatically the first time the fallback runs, not at install time.

# uv tool — from PyPI
uv tool install "tldr-podcast[cloak]"

# pipx — from PyPI
pipx install "tldr-podcast[cloak]"

# pip (inside an active venv) — from PyPI
pip install "tldr-podcast[cloak]"

# from a local clone
uv tool install ".[cloak]"          # as a CLI tool
uv sync --extra cloak               # for development

Already installed without it? Re-run the matching command above with the [cloak] extra to add the fallback.

🚀 Quick start

# 1. Create your config interactively
tldr-podcast config init

# 2. Export your Gemini API key
export GEMINI_API_KEY="your-key"

# 3. Pick topics interactively and generate
tldr-podcast run

# …or go straight to it
tldr-podcast run -t ai,devops --no-interactive

📰 Topics

13 TLDR newsletters, mix and match freely:

Slug	Newsletter	Slug	Newsletter
`ai`	AI	`design`	Design
`infosec`	Information Security	`product`	Product
`devops`	DevOps	`marketing`	Marketing
`tech`	Tech (the flagship)	`data`	Data
`crypto`	Crypto	`fintech`	Fintech
`founders`	Founders	`dev`	Web Dev
`it`	Information Technology

⚙️ Configuration

The wizard covers every option and writes a ready-to-use config.yaml:

tldr-podcast config init

Default path: $XDG_CONFIG_HOME/tldr/config.yaml (falls back to ~/.config/tldr/config.yaml).
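The lookup rule above can be sketched in a few lines of Python. The function name `default_config_path` and the `env` parameter are illustrative assumptions, not the tool's actual API.

```python
import os
from pathlib import Path


def default_config_path(env=None) -> Path:
    """Return the default config path, honouring XDG_CONFIG_HOME when set."""
    env = os.environ if env is None else env
    xdg = env.get("XDG_CONFIG_HOME")
    # An unset or empty XDG_CONFIG_HOME falls back to ~/.config
    base = Path(xdg) if xdg else Path.home() / ".config"
    return base / "tldr" / "config.yaml"
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.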

A minimal config looks like this:

config_version: 6

web:
  default_topics: [ ai, infosec, devops ]

gemini:
  api_key_env: GEMINI_API_KEY        # name of the env var — never the key itself
  text_model: gemini-2.0-flash
  tts_model: gemini-2.5-flash-preview-tts
  language: French
  speaker1: { name: Alex,   voice: Puck,   personality: "enthusiastic, curious" }
  speaker2: { name: Jordan, voice: Charon, personality: "analytical, skeptical" }

output:
  dir: "."
  format: mp3

🔐 The only required secret is GEMINI_API_KEY. Secrets are never written to the file — keys ending in _env hold the name of the environment variable read at runtime.
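As a rough illustration of the `_env` convention, the sketch below walks a loaded config and swaps each `<name>_env` key for `<name>` read from the environment. `resolve_env_keys` is a hypothetical helper, and the exact key mapping the project uses is an assumption.

```python
import os


def resolve_env_keys(node, env=None):
    """Recursively replace `<name>_env` keys with `<name>` read from the environment."""
    env = os.environ if env is None else env
    if isinstance(node, dict):
        resolved = {}
        for key, value in node.items():
            if key.endswith("_env") and isinstance(value, str):
                # The file stores only the variable name; the secret stays in the environment.
                resolved[key[: -len("_env")]] = env.get(value)
            else:
                resolved[key] = resolve_env_keys(value, env)
        return resolved
    if isinstance(node, list):
        return [resolve_env_keys(item, env) for item in node]
    return node
```

The input mapping is left untouched, so the on-disk config never ends up holding the secret value.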

For fine-grained tuning (TTS pace, dialogue style, service tiers, per-model pricing…), every key is documented inline in config.example.yaml.

Stealth browser fallback (optional)

When trafilatura fails to fetch or extract an article (bot-detection, JS-rendered pages, etc.), the scraper can fall back to CloakBrowser — a Playwright-based stealth Chromium that bypasses most bot-detection measures.

Install the optional cloak extra — see Installation.

Config key (scraping.cloak_fallback):

Value	Behaviour
`auto` (default)	Use the fallback when the `cloakbrowser` package is importable
`on`	Require the fallback; warns and degrades to newsletter summaries if not installed
`off`	Never use the browser fallback

scraping:
  cloak_fallback: auto   # auto | on | off

After navigation, the fallback automatically waits up to 35 seconds for any Cloudflare Turnstile challenge to resolve before reading the page.

At most 2 stealth-browser sessions run concurrently to avoid memory exhaustion; trafilatura workers are unaffected.
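
One common way to enforce such a cap is a semaphore. The sketch below uses `asyncio.Semaphore`; whether the project uses asyncio or threads is not stated, so treat `render_all` and its `render` callback as hypothetical.

```python
import asyncio

MAX_BROWSER_SESSIONS = 2  # cap stated in the text above


async def render_all(urls, render, limit=MAX_BROWSER_SESSIONS):
    """Run `render(url)` for every URL with at most `limit` sessions in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url):
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await render(url)

    # gather preserves the input order of the results.
    return await asyncio.gather(*(guarded(url) for url in urls))
```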

Known limitation: heavily fortified sites using enterprise bot management (e.g. g2.com) may still be blocked — the fallback handles standard Cloudflare challenges, not every anti-bot system.

To inspect the active configuration:

tldr-podcast config show              # raw config
tldr-podcast config show --resolve    # env vars resolved, secrets masked

🖥️ CLI reference

Command	Description
`tldr-podcast run`	Interactive topic picker → generate podcast
`tldr-podcast run -t ai,devops`	Explicit topics, skip the prompt
`tldr-podcast run -t ai --no-interactive`	Non-interactive; uses the config defaults when `-t` is omitted
`tldr-podcast run -d 2026-04-06`	Target a specific date
`tldr-podcast run -t ai -n`	Dry-run: print dialogue, skip TTS
`tldr-podcast run -t ai -A`	Generate script + report, skip TTS and audio
`tldr-podcast run -R`	Disable report generation
`tldr-podcast run -o ./podcasts`	Custom output directory
`tldr-podcast config init`	Interactive configuration wizard
`tldr-podcast config show`	Display current config
`tldr-podcast completions SHELL`	Print completion script (bash/zsh/fish)
`tldr-podcast --version`	Print the installed version and exit

Short flags: -c config · -t topics · -d date · -o output-dir · -n dry-run · -A no-audio · -v verbose · -r/-R report/no-report · -h help

Output naming

Topics are sorted alphabetically and joined with the date:

ai-devops-2026-04-17.mp3
ai-devops-2026-04-17/
├── overview.md
├── articles.md
├── script.md
└── summary.md
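
The naming rule can be reproduced with a short helper. `output_basename` is a hypothetical name used for illustration only.

```python
from datetime import date


def output_basename(topics, day: date) -> str:
    """Join alphabetically sorted topic slugs with an ISO date."""
    return "-".join([*sorted(topics), day.isoformat()])
```

For example, `output_basename(["devops", "ai"], date(2026, 4, 17))` gives `ai-devops-2026-04-17`, the stem shared by the audio file and the report folder above.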

🐚 Shell completions

Generate and install a completion script for your shell. Write to a file — do not pipe into eval.

bash · zsh · fish

# Bash — user completion directory (auto-sourced by bash-completion)
mkdir -p ~/.local/share/bash-completion/completions
tldr-podcast completions bash > ~/.local/share/bash-completion/completions/tldr-podcast

# Zsh — a directory on $fpath
mkdir -p ~/.zsh/completions
tldr-podcast completions zsh > ~/.zsh/completions/_tldr-podcast
# ensure ~/.zshrc contains:
#   fpath=(~/.zsh/completions $fpath)
#   autoload -Uz compinit && compinit

# Fish — auto-sourced on next shell start
tldr-podcast completions fish > ~/.config/fish/completions/tldr-podcast.fish

🧪 Tests

uv run pytest tests/ -v

All external APIs (Gemini, HTTP) are mocked. A real captured TLDR HTML page in tests/fixtures/ drives realistic parse validation.

🧭 Pipeline

flowchart TB
    IN["🌐 tldr.tech/&lt;topic&gt;/&lt;date&gt;"]

    subgraph SRC["① Source"]
        WEB["Web Source<br/>BeautifulSoup · sponsor filter · dedup"]
    end

    subgraph CUR["② Curation"]
        RANK["Interest Ranking<br/>LLM scores 1–10"]
        WS["Web Scraper<br/>trafilatura full-text"]
    end

    subgraph GEN["③ Generation"]
        LLM["Script Writer<br/>Gemini Flash"]
        DC["Dialogue chunks<br/>≤ 3 000 bytes"]
        TTS["TTS Generator<br/>Gemini multi-speaker"]
    end

    subgraph OUT["④ Output"]
        AE["Audio Exporter<br/>pydub + ffmpeg"]
        RPT["📊 Report Generator"]
    end

    IN --> WEB --> RANK --> WS
    WS --> LE["Link Extractor<br/>repos · models · papers"]
    WS --> LLM --> DC
    DC --> TTS --> AE --> MP3["🎙️ .mp3 / .wav"]
    DC --> RPT
    LE --> RPT --> FILES["📂 overview · articles · script · links"]
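
The "Dialogue chunks ≤ 3 000 bytes" step could look roughly like the following. This is a sketch under assumptions: the real splitter may cut on different boundaries, and `chunk_dialogue` is a hypothetical name. Sizes are measured in UTF-8 bytes, not characters, since accented text takes more than one byte per character.

```python
def chunk_dialogue(lines, max_bytes=3000):
    """Group dialogue lines into chunks whose UTF-8 size stays within max_bytes."""
    chunks, current, size = [], [], 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        if line_size > max_bytes:
            raise ValueError("a single dialogue line exceeds the chunk budget")
        # +1 accounts for the newline that joins this line to the previous one
        extra = line_size + (1 if current else 0)
        if current and size + extra > max_bytes:
            # Budget exceeded: close the current chunk and start a new one.
            chunks.append("\n".join(current))
            current, size = [], 0
            extra = line_size
        current.append(line)
        size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Splitting only between lines keeps each speaker turn intact within a single TTS request.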

🏷️ Releasing

Releases are automated by .github/workflows/publish.yml. Bumping version in pyproject.toml and pushing to main runs, in order:

1. Tests: uv run pytest must pass; a red suite blocks the release.
2. PyPI publish: built with uv build and uploaded via Trusted Publishing (OIDC, no stored token). Requires a publisher configured on PyPI for project tldr-podcast, repository obeone/tldr-podcast, workflow publish.yml, environment pypi.
3. GitHub release: a v<version> tag plus a release with auto-generated notes.

Dependency-only edits to pyproject.toml are ignored (the version value must actually change), and an already-released version is skipped, so re-runs and unrelated edits are safe no-ops.

🗂️ Project structure

tldr-podcast/
├── config.example.yaml         # Fully documented configuration template
├── pyproject.toml
├── src/tldr/
│   ├── cli.py                  # Click CLI (run · config · completions)
│   ├── config.py               # YAML loader with *_env resolution
│   ├── config_migrations.py    # Versioned schema + in-place auto-upgrade
│   ├── models.py               # Shared Article dataclass
│   ├── web_source.py           # tldr.tech fetcher + parser
│   ├── web_scraper.py          # trafilatura full-text scraper
│   ├── link_extractor.py       # URL extraction and categorisation
│   ├── llm_summarizer.py       # Interest ranking + dialogue generation
│   ├── tts_generator.py        # Gemini multi-speaker TTS
│   ├── audio_exporter.py       # PCM → MP3/WAV via pydub
│   ├── report_generator.py     # Timestamped report folder output
│   ├── token_tracker.py        # Token usage and cost tracking
│   └── retry.py                # Retry with exponential backoff
└── tests/
    ├── fixtures/               # Real captured HTML for parse tests
    └── …                       # pytest unit tests (all APIs mocked)
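
For readers unfamiliar with the pattern behind retry.py, here is a generic retry-with-exponential-backoff sketch. It is not the project's implementation; the `retry` signature, the default delays, and the 10% jitter are all assumptions.

```python
import random
import time


def retry(func, attempts=4, base_delay=1.0, factor=2.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * factor**n (plus jitter) and retry."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * factor**attempt
            # Up to 10% jitter spreads out retries from concurrent callers.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so tests can record delays instead of actually waiting.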

📜 Changelog

Version	Highlights
1.7.3	Tag-based release detection (robust to rebase/squash merges) + `skip-existing` on publish; MIT license; PyPI project page (README long description, author, project URLs); install docs use PyPI instead of GitHub
1.7.1	CI: install ffmpeg in the release test gate; skip ffmpeg-dependent audio-exporter tests when ffmpeg is absent
1.7.0	Optional CloakBrowser stealth-browser fallback (`scraping.cloak_fallback: auto|on|off`); config schema v4
1.6.x	Dependency security bumps; `trafilatura` 2.0 scraper user-agent fix
1.5.0	`--version` flag on the top-level group
1.4.0	Audio-tag support for Gemini 3.x Flash TTS; versioned config schema (`config_version`) with in-place auto-upgrade + backup
1.3.0	Shell completion support (`completions bash|zsh|fish`)
1.2.0	Numbered topic recap in the conclusion; `--no-audio` flag; out-of-order TTS progress bar
1.0.0	Breaking — switched from IMAP/email to direct web scraping of tldr.tech. No account or credentials needed; removed `-e/--eml`, `-s/--status` and the `imap:` config section

📄 License

MIT © Grégoire Compagnon

Made with 🎧 by Grégoire Compagnon — obeone@obeone.org

Release history

1.8.0

May 19, 2026

This version

1.7.3

May 19, 2026

1.7.1

May 19, 2026

Download files

Source Distribution

tldr_podcast-1.7.3.tar.gz (78.4 kB)

Uploaded May 19, 2026 Source

Built Distribution

tldr_podcast-1.7.3-py3-none-any.whl (54.2 kB)

Uploaded May 19, 2026 Python 3

File details

Details for the file tldr_podcast-1.7.3.tar.gz.

File metadata

Download URL: tldr_podcast-1.7.3.tar.gz
Upload date: May 19, 2026
Size: 78.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tldr_podcast-1.7.3.tar.gz
Algorithm	Hash digest
SHA256	`3d6adf7d494f21526f1b9f47b07726f7d30901d28069c827eda7ab7d174d6192`
MD5	`165229fe1d393cdb31bca3429a10aeb3`
BLAKE2b-256	`bc9b456f9c17432099df987aadadb35db765168cc96c2b61b0f0139369d967a8`

Provenance

The following attestation bundles were made for tldr_podcast-1.7.3.tar.gz:

Publisher: publish.yml on obeone/tldr-podcast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tldr_podcast-1.7.3.tar.gz
- Subject digest: 3d6adf7d494f21526f1b9f47b07726f7d30901d28069c827eda7ab7d174d6192
- Sigstore transparency entry: 1574478911
- Sigstore integration time: May 19, 2026
Source repository:
- Permalink: obeone/tldr-podcast@ce7a5bea0c629c1ee436972c8a2d4d38d6b82d05
- Branch / Tag: refs/heads/main
- Owner: https://github.com/obeone
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ce7a5bea0c629c1ee436972c8a2d4d38d6b82d05
- Trigger Event: push

File details

Details for the file tldr_podcast-1.7.3-py3-none-any.whl.

File metadata

Download URL: tldr_podcast-1.7.3-py3-none-any.whl
Upload date: May 19, 2026
Size: 54.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tldr_podcast-1.7.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`45c137ff75547092b66560c113f65aee649c02eaefa4c42080eb52b795fd5605`
MD5	`e05360ab3821d1e0f1547dfea4b66d64`
BLAKE2b-256	`827e841237b6a6b03b668ac83e9c6116c654efc33db5c223cd5d7d7902ef5546`

Provenance

The following attestation bundles were made for tldr_podcast-1.7.3-py3-none-any.whl:

Publisher: publish.yml on obeone/tldr-podcast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tldr_podcast-1.7.3-py3-none-any.whl
- Subject digest: 45c137ff75547092b66560c113f65aee649c02eaefa4c42080eb52b795fd5605
- Sigstore transparency entry: 1574478944
- Sigstore integration time: May 19, 2026
Source repository:
- Permalink: obeone/tldr-podcast@ce7a5bea0c629c1ee436972c8a2d4d38d6b82d05
- Branch / Tag: refs/heads/main
- Owner: https://github.com/obeone
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ce7a5bea0c629c1ee436972c8a2d4d38d6b82d05
- Trigger Event: push
