Give it a topic, get back a narrated podcast episode — researched by Claude, voiced by a local Kokoro model.

Earworm

Give it a topic; get back a narrated podcast episode. Earworm researches the topic with the Claude CLI, runs the findings through an adversarial review, rewrites them into a script written for the ear, narrates it with a local neural voice (Kokoro), masters the audio to broadcast loudness, and tags a ready-to-play mp3. Optionally it publishes to a private podcast feed you can subscribe to on your phone. Generation (LLM, occasionally slow) is fully decoupled from rendering (local, fast, deterministic) — they only ever communicate through a folder of script files.

Install

Earworm needs Python 3.11+, ffmpeg, and — for the research and scripting passes — the authenticated claude CLI on your PATH. Narration is fully local; no API key needed for the voice.

From PyPI:

pip install earworm-pod        # or: uv tool install earworm-pod  (the CLI is still `earworm`)
earworm download-models        # one-time: fetch the Kokoro voice model + G2P data (a few hundred MB)

From source (to tune the prompts — they're the product):

git clone https://github.com/vrennat/earworm && cd earworm
uv sync                        # .venv + locked deps (incl. Kokoro + torch)
uv run earworm download-models

Recommended for proper-noun pronunciation: brew install espeak-ng (Linux: apt install espeak-ng). Kokoro's misaki G2P uses it as an out-of-vocabulary fallback and degrades gracefully without it.

Quickstart

earworm init                   # scaffold prompts/ + config templates + queue db here
cp config/show.example.toml config/show.toml     # set your podcast title/author (optional)

earworm add "What is the current state of small language models, and why does it matter?"
earworm run                    # research -> review -> script  (writes inbox/scripts/<id>.md)
earworm watch                  # render scripts -> episodes/<id>.mp3  (long-running)

From a source checkout, prefix commands with uv run (e.g. uv run earworm init). The first synthesis warms up in ~30s; after that the watcher stays warm and renders faster than real time. With no config files at all, narration uses a sensible default voice — customize it in config/voice.toml.

Commands

earworm add "<topic>"        queue a topic
earworm autogen --count 3    propose + queue topics from interests.md
earworm list                 inspect the queue
earworm run [--id N] [--all] drain pending topic(s): research -> review -> script
earworm reset-stale          requeue topics stuck 'running' after a crash
earworm watch                render new scripts -> mp3 (+ publish), long-running
earworm render <file.md>     one-shot render of a single script (testing)
earworm download-models      pre-fetch the Kokoro model + voices (warm the cache)
earworm publish              retry upload + register for any unpublished episodes

run accepts --model to force one model across every stage (e.g. --model sonnet). For finer control — a different model per pass, retries, fallback, or skipping a quality pass — use config/pipeline.toml (see Pipeline configuration).

Architecture

Two halves that share nothing but a folder. The producer is the LLM-driven generator; the consumer is a dumb, deterministic renderer. Either can run, crash, or be restarted independently.

                 PRODUCE (Claude CLI, slow)                CONSUME (local, fast, no LLM)
  earworm add ─┐
               ├─► [ queue: earworm.db ] ─► earworm run                earworm watch (polls inbox/)
  earworm      │      topics, episodes        │                              │
  autogen ─────┘                              │ 1. research  (web)           │ read script.md
                                              │ 2. review    (adversarial)   │ normalize for speech
                                              │ 3. script    (write for ear) │ apply lexicon (IPA)
                                              │ 4. script-review             │ Kokoro TTS -> wav
                                              │ 5. revise in place           │ ffmpeg master + mp3
                                              ▼                              │ ID3 tags + show notes
                                    inbox/scripts/<id>.md ──────────────────►│ episodes/<id>.mp3
                                                                             ▼
                                                          (optional) upload to R2 + register
                                                          with Cloudflare Worker ─► RSS feed ─► phone

Queue: local SQLite (earworm.db), tables topics and episodes. The runner is offline-capable; the local queue is its source of truth.
Prompts (prompts/*.md) are the product. The research → review → script → script-review → revise chain is five LLM passes; tune the prompts constantly.
Idempotency: the renderer keys each episode on a hash of the script body, so re-processing the same script never produces a duplicate (tests/test_idempotency.py).
Atomic handoff: scripts are generated and revised in a staging dir, then moved into inbox/scripts/ with os.replace only when finished, so the watcher never renders a half-written file.
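
The idempotency and atomic-handoff points above can be sketched in a few lines. This is an illustration under assumptions, not Earworm's code: the function names are hypothetical, and SHA-256 is a guess at the hash (the README only says "a hash of the script body").

```python
import hashlib
import os
from pathlib import Path


def publish_script(staging_dir: Path, inbox_dir: Path, name: str, body: str) -> Path:
    """Write the script in staging, then move it into the inbox in one step."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    inbox_dir.mkdir(parents=True, exist_ok=True)
    staged = staging_dir / name
    staged.write_text(body, encoding="utf-8")
    final = inbox_dir / name
    # os.replace is atomic when both paths are on the same filesystem,
    # so a reader sees either no file or the complete file.
    os.replace(staged, final)
    return final


def episode_key(body: str) -> str:
    """Derive a stable key from the script body (SHA-256 is an assumption)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def should_render(script: Path, seen: set[str]) -> bool:
    """Return True the first time a given body is seen, False afterwards."""
    key = episode_key(script.read_text(encoding="utf-8"))
    if key in seen:
        return False
    seen.add(key)
    return True
```

Keying on the body rather than the file name means a script that is re-dropped or renamed with identical text does not produce a second episode.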

Backends

Research + scripting run through the Claude CLI in headless mode (claude -p) with a tool allowlist; the research pass also gets web search. Earworm is coupled to Claude Code by design: the agentic web-research-and-write loop is the whole quality story, so there is no pluggable LLM backend. claude.py is the thin CLI wrapper; pipeline.py declares the five passes and the executor (per-stage model, retry, fallback); runner.py orchestrates. Authenticate with claude login or ANTHROPIC_API_KEY.

Narration (TTS) is Kokoro — a local neural voice model. 54 voices, runs on-device, no API key, free. Selected by engine in config/voice.toml; the engine is loaded behind a small interface (src/earworm/tts/base.py) so another backend can be dropped in later.
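
To make the "small interface" idea concrete, here is one way a swappable TTS backend could be shaped. Everything in it is hypothetical: the `TTSEngine` protocol, the `ToneEngine` stand-in, and the `ENGINES` registry are illustrative names, not the contents of src/earworm/tts/base.py.

```python
import math
import struct
from typing import Protocol


class TTSEngine(Protocol):
    """Hypothetical interface: text in, 16-bit mono PCM bytes out."""

    sample_rate: int

    def synthesize(self, text: str) -> bytes: ...


class ToneEngine:
    """Stand-in backend: emits 0.1 s of a 440 Hz tone per word, not speech."""

    sample_rate = 24000

    def synthesize(self, text: str) -> bytes:
        n = (self.sample_rate // 10) * max(1, len(text.split()))
        samples = (
            int(12000 * math.sin(2 * math.pi * 440 * i / self.sample_rate))
            for i in range(n)
        )
        return b"".join(struct.pack("<h", s) for s in samples)


# Registry keyed by the engine name a config file would select.
ENGINES: dict[str, type] = {"tone": ToneEngine}


def load_engine(name: str) -> TTSEngine:
    """Instantiate the backend registered under `name`."""
    try:
        return ENGINES[name]()
    except KeyError:
        raise ValueError(f"unknown TTS engine: {name!r}") from None
```

With this shape, adding a backend means writing one class with `synthesize` and registering it; the renderer itself would not need to change.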

Cost per episode

Honest caveat: these are rough, unmeasured order-of-magnitude estimates, not a benchmark. Real cost depends on the model, topic depth, and how much the research pass fetches. Measure your own before trusting a number.

Narration (Kokoro): $0. Runs locally on CPU/GPU.
Research + scripting (Claude CLI):
- On a Claude Pro/Max subscription: ~$0 marginal — the five passes count against your subscription usage limits, not a per-call bill.
- On a pay-as-you-go API key: the five passes (research with web search is the heaviest) are the cost driver. Ballpark a few cents to ~$1 per episode with Sonnet; more with Opus, less with Haiku. Treat this as a starting guess, not a quote.
Publishing (Cloudflare R2 + Worker), if enabled: effectively $0 at personal volume (well within free tiers).

Configuration

Every setting is documented in one place in config/earworm.example.toml. At runtime the pipeline reads these as separate files — copy each *.example.toml to its real name:

File	Required?	Purpose
`config/pipeline.toml`	optional	Per-stage model, retries, fallback, stage toggles
`config/voice.toml`	for rendering	TTS engine, voice/blend, audio + mastering chain
`config/show.toml`	for rendering	Podcast title/author/description/cover (ID3 + RSS)
`config/lexicon.toml`	optional (recommended)	Pronunciation overrides (IPA) for proper nouns
`config/feed.toml`	only if publishing	Cloudflare account, R2 bucket, Worker URL
`config/secrets.toml`	only if publishing	API token + feed secrets (or use env vars)
`.env`	optional	`ANTHROPIC_API_KEY` and other secrets via env
`interests.md`	only for `autogen`	Free-form interests that steer auto-topic proposals

Voices. 54 Kokoro voices download on first use. Set voice (and a matching lang_code: a for American, b for British) in config/voice.toml, or set a weighted blend. Naming is <lang><gender>_<name>, e.g. af_sky (American female), am_michael (American male), bf_emma (British female). Audition them with uv run python scripts/voice_sampler.py.
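
The naming scheme above is regular enough to parse mechanically, which helps when checking that voice and lang_code agree. A small sketch, with a hypothetical `parse_voice` helper that only knows the two gender letters shown in the examples:

```python
from dataclasses import dataclass

GENDERS = {"f": "female", "m": "male"}


@dataclass(frozen=True)
class Voice:
    lang_code: str  # first letter of the prefix, e.g. "a" or "b"
    gender: str     # "female" or "male"
    name: str       # e.g. "sky"


def parse_voice(voice_id: str) -> Voice:
    """Split an id like 'af_sky' into language code, gender, and name."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"not a <lang><gender>_<name> voice id: {voice_id!r}")
    lang, gender = prefix
    if gender not in GENDERS:
        raise ValueError(f"unknown gender letter {gender!r} in {voice_id!r}")
    return Voice(lang_code=lang, gender=GENDERS[gender], name=name)
```

The parsed `lang_code` is the value the voice setting needs to match.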

Pronunciation. Kokoro mispronounces some proper nouns and acronyms. config/lexicon.toml maps a word to misaki modified-IPA; the renderer rewrites it inline so Kokoro honors it. The shipped example covers common AI/tech/networking terms — extend it for your subject.
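
The inline rewrite can be sketched as a whole-word substitution. This assumes the link-style phoneme override that Kokoro's misaki G2P documents, `[word](/phonemes/)`; the exact form Earworm emits is not stated here, and `apply_lexicon` is a hypothetical helper.

```python
import re


def apply_lexicon(text: str, lexicon: dict[str, str]) -> str:
    """Wrap each lexicon word in an inline phoneme override.

    Matching is case-sensitive and whole-word. Entries that start or end
    with punctuation would need different boundary handling than \\b.
    """
    if not lexicon:
        return text
    # Longest entries first, so a longer term wins over a prefix of it.
    words = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    return pattern.sub(
        lambda m: f"[{m.group(1)}](/{lexicon[m.group(1)]}/)", text
    )
```

For instance, with the illustrative entry `{"Kokoro": "kˈOkəɹO"}`, the sentence "Kokoro speaks." becomes "[Kokoro](/kˈOkəɹO/) speaks.", while a longer word such as "Kokoros" is left alone.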

Pipeline configuration

Each topic runs five Claude Code passes: research → review → script → script_review → revise. With no config/pipeline.toml they all run on the claude CLI's default model with one retry. Copy config/pipeline.example.toml to tune per stage:

Per-stage model — spend where it matters. [pipeline.research] model = "opus" for the web-research pass, cheaper models elsewhere. earworm run --model <m> still forces one model across every stage when you want a blunt override.
Retry + fallback — retries adds attempts on the primary model; fallback_model makes one final attempt on a different model after the primary budget is exhausted (independent of retries, so it fires even at retries = 0).
Toggle quality passes — [pipeline.review] enabled = false writes the script straight from the report; [pipeline.script_review] enabled = false skips the script-review and revise loop (revise exists only to fold the review back in). The three load-bearing passes (research, script, and revise-when-reviewing) can't be toggled, and stages can't be reordered — the order is a data dependency, not a preference.

Publishing — a private podcast feed (optional)

Local-only use needs none of this: episodes render to episodes/*.mp3 with full ID3 tags that any player reads. To subscribe on your phone, deploy the bundled Cloudflare Worker (worker/): a token-gated RSS feed backed by D1, with audio served from a public R2 bucket. Everything is free-tier at personal volume. The Worker tooling uses bun.

cd worker
bun install                                        # pins wrangler + types (commit-tracked lockfile)
cp wrangler.example.jsonc wrangler.jsonc           # fill in account_id, D1 id, show vars

# Provision Cloudflare resources
bunx wrangler d1 create earworm                     # paste the printed database_id into wrangler.jsonc
bunx wrangler d1 execute earworm --remote --file schema.sql
# create a PUBLIC R2 bucket in the dashboard; note its pub-*.r2.dev base URL

# Secrets (token-gates the feed + the ingest endpoint)
bunx wrangler secret put FEED_TOKEN
bunx wrangler secret put INGEST_SECRET

bunx wrangler deploy

Then point the Python side at it — in config/feed.toml set enabled = true, the worker_url, R2 bucket, and public_audio_base; put FEED_TOKEN/INGEST_SECRET in config/secrets.toml (or the matching env vars). After that, earworm watch uploads each new episode to R2 and registers it with the Worker; earworm publish backfills any that failed.

The Worker serves a token-gated /<FEED_TOKEN>/feed.xml (valid podcast RSS 2.0 with the iTunes namespace) — also reachable as /feed.xml?token=… for finicky apps. Audio is served directly from the public R2 bucket under an unguessable per-episode key; the Worker never proxies bytes. A bad token returns 404 (not 401), so the feed's existence never leaks.
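
The gating rule is easy to state as a function from URL to status code. The actual Worker is TypeScript; the sketch below restates the described behavior in Python for illustration only. The function name `feed_status` and the constant-time comparison are assumptions, not details taken from worker/.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def feed_status(url: str, feed_token: str) -> int:
    """Return 200 for a correctly tokened feed URL, 404 for everything else."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    candidate = None
    if len(segments) == 2 and segments[1] == "feed.xml":
        # Path form: /<FEED_TOKEN>/feed.xml
        candidate = segments[0]
    elif segments == ["feed.xml"]:
        # Query form: /feed.xml?token=...
        candidate = parse_qs(parts.query).get("token", [None])[0]

    if candidate is not None and hmac.compare_digest(
        candidate.encode(), feed_token.encode()
    ):
        return 200
    # 404 rather than 401, so a wrong token looks like a missing page.
    return 404
```

Both URL forms resolve to the same check, and anything else, including a missing token, is indistinguishable from a path that does not exist.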

Scheduling (macOS)

For hands-off operation, launchd/ ships three agents — a watch daemon (renders + publishes continuously), a weekday run (drains one topic at 07:30), and a Monday autogen (proposes 3 fresh topics from interests.md):

bash launchd/install.sh        # substitutes paths, loads the agents, starts the watcher
bash launchd/uninstall.sh      # unload + remove them

Logs land in logs/. On Linux, adapt the three .plist files to systemd units: a service for the watcher and timers for the two scheduled jobs.

Docker

The painful part to install is the renderer — CPU PyTorch, Kokoro, espeak-ng, ffmpeg. The bundled image owns all of that and bakes in a pre-warmed Kokoro model, so rendering works out of the box. It is CPU-only (the default Linux torch wheel bundles CUDA at ~2GB; the build selects the CPU PyTorch index via UV_TORCH_BACKEND=cpu).

docker build -t earworm .                          # ~minutes; downloads torch + model

# Render: mount your working dir (config/*.toml, inbox/, episodes/, earworm.db) at /data
docker run --rm -v "$PWD":/data earworm watch        # render scripts as they appear
docker run --rm -v "$PWD":/data earworm render inbox/scripts/<id>.md   # one-shot

Earworm's two halves share only a folder, so the natural split is generate on the host, render in the container — they meet at inbox/scripts/. Generation (earworm run) needs the authenticated claude CLI, which isn't in the image. To also generate in-container, install the CLI and pass a key:

docker run --rm -v "$PWD":/data -e ANTHROPIC_API_KEY=sk-... earworm run

(That still requires the claude CLI on PATH inside the image — add it to the Dockerfile with a Node layer if you want a single do-everything container. The default image keeps generation on the host.)

The model is downloaded at build time (earworm download-models runs in the build and smoke-tests a synth), so first render is instant and a broken stack fails the build, not you.

NOT in v1

A hosted/managed service. This is a local CLI you run yourself.
Multi-voice / dialogue. Single narrator only.
A web UI. CLI only.
Music, stingers, or ad insertion. Voice + mastering only.
Windows support. Developed and tested on macOS (Apple Silicon); Linux should work, Windows is untested.

Layout

prompts/        the five LLM prompts — the heart of it (bundled into the wheel too)
config/         *.example.toml templates (copy to real names; reals are gitignored)
src/earworm/    cli, db, pipeline (stages + executor), runner, claude, render (TTS), normalize, tts/
scripts/        cover generator, voice sampler, regen/render/rerender helpers
launchd/        macOS agents: watch daemon + weekday run + Monday autogen
worker/         Cloudflare Worker (TypeScript, bun) — token-gated RSS feed over D1 + R2
tests/          pipeline + config + normalize + idempotency tests (run: uv run python tests/<file>)
Dockerfile      CPU-only renderer image (Kokoro + ffmpeg, pre-warmed model)

License

MIT — see LICENSE.

Project details

This version

0.1.0

Jun 10, 2026

Download files

Source Distribution

earworm_pod-0.1.0.tar.gz (217.7 kB)

Uploaded Jun 10, 2026 Source

Built Distribution

earworm_pod-0.1.0-py3-none-any.whl (57.9 kB)

Uploaded Jun 10, 2026 Python 3

File details

Details for the file earworm_pod-0.1.0.tar.gz.

File metadata

Download URL: earworm_pod-0.1.0.tar.gz
Upload date: Jun 10, 2026
Size: 217.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for earworm_pod-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`0b3c942657dd4795d60ffa4b2adffa3917aebd3cbffe5c1e0e9261cf67459c54`
MD5	`2bf214b6338ce2e5f3486fb9c78c44a1`
BLAKE2b-256	`575e97d872aecddc336f7ac81c7f6b4e664987494b24f260f392b947c6c0f512`

File details

Details for the file earworm_pod-0.1.0-py3-none-any.whl.

File metadata

Download URL: earworm_pod-0.1.0-py3-none-any.whl
Upload date: Jun 10, 2026
Size: 57.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for earworm_pod-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b2ed16c91664f9922689d106a77fc58a2dad3f82f180625fcf4bea09cc4a8066`
MD5	`0e0cc0d9c18a11dbde06faf919d5bc67`
BLAKE2b-256	`d054063c7597e6e161e81debc45d51d466fbdeee49450533f78b728d58a04648`
