gencast

Generate conversational podcasts from documents using AI. A cost-effective, customisable, local-first alternative to NotebookLM.

gencast notebook.yaml  ->  podcast.m4a (with embedded subtitles)

Install

pip install gencast

System dependency: ffmpeg (for audio combining and M4A muxing).

API keys (export or use gencast init to be prompted):

export OPENAI_API_KEY="sk-..."          # required (TTS + Whisper)
export ANTHROPIC_API_KEY="sk-ant-..."   # required (default outline + transcript)
export MISTRAL_API_KEY="..."            # optional (better PDF extraction)

Quickstart

gencast init                        # interactive notebook wizard
gencast preview notebook.yaml       # outline-only dry run (free)
gencast generate notebook.yaml      # full pipeline -> out/<basename>.m4a

Or one-shot from a markdown file (uses default profiles):

gencast generate path/to/lecture.md

Three-axis profile system

Each notebook composes three orthogonal profiles:

speaker_profile: revision-duo       # WHO speaks (1-4 voices, personas)
episode_profile: exam-revision      # WHAT kind of podcast (briefing, segments, models)
room_profile:    small-room         # HOW it sounds (spatial pipeline)

List bundled profiles:

gencast list-profiles --type speakers
gencast list-profiles --type episodes
gencast list-profiles --type rooms

Profiles cascade: ./gencast/profiles/<kind>/<name>.yaml (project) -> ~/.config/gencast/profiles/<kind>/<name>.yaml (XDG) -> bundled defaults. Override per-notebook via the overrides: block in the notebook YAML.
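The lookup order described above (project directory, then the XDG config directory, then bundled defaults) can be sketched in a few lines of Python. This is only an illustration of a first-match-wins cascade, not gencast's actual resolver: the function names `cascade_dirs` and `resolve_profile` are hypothetical, and the location of the bundled defaults is passed in because the README does not say where they live.

```python
from pathlib import Path
from typing import Optional, Sequence


def cascade_dirs(project_root: Path, home: Path, bundled_root: Path) -> list[Path]:
    """Return profile roots in priority order: project, XDG, bundled."""
    return [
        project_root / "gencast" / "profiles",       # ./gencast/profiles
        home / ".config" / "gencast" / "profiles",   # ~/.config/gencast/profiles
        bundled_root,  # assumption: wherever the package keeps its defaults
    ]


def resolve_profile(kind: str, name: str, roots: Sequence[Path]) -> Optional[Path]:
    """Return the first <root>/<kind>/<name>.yaml that exists, else None."""
    for root in roots:
        candidate = root / kind / f"{name}.yaml"
        if candidate.is_file():
            return candidate
    return None
```

Under this sketch, a small-room.yaml placed in the project directory would shadow a same-named file in the XDG directory, which in turn would shadow the bundled one.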

Worked example

./photosynthesis/notebook.yaml:

title: Photosynthesis revision
sources:
  - lectures/photosynthesis.md
  - lectures/calvin-cycle.md
speaker_profile: revision-duo
episode_profile: exam-revision
room_profile: small-room
output:
  basename: photosynthesis-revision
  formats: [m4a]
overrides:
  briefing_suffix: |
    Pay specific attention to the distinction between the light-dependent
    reactions and the Calvin cycle. Include one worked Q&A on this distinction.

Then generate:

gencast generate photosynthesis/notebook.yaml
# -> photosynthesis/out/photosynthesis-revision.m4a

Cost

Typical 10-min podcast (~5K-token source, 6 segments, 2 speakers):

Component                       Default model        Cost
------------------------------  -------------------  -------
Outline                         claude-haiku-4-5     ~$0.005
Transcript (with prompt cache)  claude-sonnet-4-5    ~$0.10
TTS                             openai/tts-1-hd      ~$0.06
Subtitles                       native (no Whisper)  $0.00
Total                                                ~$0.17

Use --model overrides or different episode profiles to trade quality for cost.
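As a quick sanity check on the cost table above, the per-component figures can be added up with exact decimal arithmetic. The numbers are copied from the table; the dictionary keys and the `total_cost` helper are illustrative names only, not part of gencast.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-component costs in USD, copied from the table above.
STAGE_COSTS = {
    "outline": Decimal("0.005"),
    "transcript": Decimal("0.10"),
    "tts": Decimal("0.06"),
    "subtitles": Decimal("0.00"),
}


def total_cost(costs: dict[str, Decimal]) -> Decimal:
    """Sum the stage costs and round to whole cents (half up)."""
    raw = sum(costs.values(), Decimal("0"))
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(total_cost(STAGE_COSTS))  # 0.17
```

The unrounded sum is $0.165, which rounds to the ~$0.17 shown in the Total row.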

Caches

  • TTS cache -- ~/.cache/gencast/tts/ -- always on. Re-runs cost only changed sentences.
  • LLM cache -- ~/.cache/gencast/llm/ -- opt-in via --cache-llm. Off by default since dialogue is non-deterministic.
  • PDF extract cache -- ~/.cache/gencast/extract/ -- always on for Mistral PDF extraction.
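To illustrate why an always-on TTS cache means re-runs only pay for changed sentences, here is a minimal sketch of a content-addressed cache. It assumes the cache is keyed per sentence by a hash of model, voice, and text; the README does not describe gencast's real key scheme or file layout, so `cache_key`, `synthesize_cached`, and the `.bin` suffix are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(model: str, voice: str, sentence: str) -> str:
    """Hash everything that affects the audio, so any change yields a new key."""
    payload = "\x1f".join((model, voice, sentence)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def synthesize_cached(
    sentence: str,
    *,
    model: str,
    voice: str,
    cache_dir: Path,
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Return cached audio if present; otherwise call the (paid) backend once."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(model, voice, sentence)}.bin"
    if path.exists():
        return path.read_bytes()      # cache hit: no API call
    audio = synthesize(sentence)      # cache miss: call the TTS backend
    path.write_bytes(audio)
    return audio
```

With a scheme like this, editing one sentence in a transcript changes only that sentence's key, so every other sentence is served from disk.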

Manage:

gencast cache status
gencast cache clear --type tts --yes

CLI reference

gencast NB.yaml                       generate (alias for `gencast generate NB.yaml`)
gencast init [--copy NB] [--minimal]  interactive notebook wizard
gencast preview NB.yaml               outline-only dry run
gencast generate NB.yaml              full pipeline -> m4a + sidecars
gencast estimate NB.yaml [--json] [--no-suggestions]
                                      predict USD cost before running. +-25% uncertainty.
gencast estimate --rates-only [--json]
                                      dump per-1k-token rates for bundled-default models.
gencast list-profiles [--type X]      enumerate profiles in cascade
gencast subtitle audio.mp3            re-subtitle external audio (Whisper)
gencast cache status [--type X]       inspect cache sizes
gencast cache clear [--type X] [--yes]

Verbosity: -v, -vv, -q, --silent, --log-file PATH.

Cost preview

Predict cost before generating:

gencast estimate my-lecture.yaml
# gencast estimate -- my-lecture.yaml
# ================================================================
# Source:    12,840 tokens  (1 file)
#
# Stage breakdown                                          est. USD
# ------------------------------------------------------  --------
# Extract                                                    $0.00
# Outline      claude-haiku-4-5      . 13.0k in              $0.04
# Transcript   claude-sonnet-4-5     . 6 segs/~1.4k          $0.18
# TTS          openai/tts-1-hd       . ~4,500 chars          $0.14
# Whisper      whisper-1             . ~6.0 min              $0.04
#                                                          --------
#                                                  Total:    $0.40
#                                                            +-25%
#
# Cheaper alternatives
#   transcript   claude-sonnet-4-5  -> claude-haiku-4-5  saves ~$0.13 (-72%)
#                  (quality trade-off -- see docs)
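The figures in the sample output above can be checked by hand. The sketch below recomputes the total, the +-25% band, and the percentage saving for the cheaper transcript model, using only numbers shown in that output; the helper names are illustrative and are not gencast functions.

```python
from decimal import Decimal

CENT = Decimal("0.01")

# Stage estimates in USD, copied from the sample output above.
STAGES = {
    "extract": Decimal("0.00"),
    "outline": Decimal("0.04"),
    "transcript": Decimal("0.18"),
    "tts": Decimal("0.14"),
    "whisper": Decimal("0.04"),
}


def uncertainty_band(estimate: Decimal, pct: Decimal = Decimal("25")) -> tuple[Decimal, Decimal]:
    """Return (low, high) for an estimate with a symmetric +-pct% uncertainty."""
    delta = estimate * pct / Decimal("100")
    return (estimate - delta).quantize(CENT), (estimate + delta).quantize(CENT)


def saving_percent(stage_cost: Decimal, saving: Decimal) -> Decimal:
    """Saving as a whole-number percentage of the stage's current cost."""
    return (saving / stage_cost * 100).quantize(Decimal("1"))


total = sum(STAGES.values(), Decimal("0"))
low, high = uncertainty_band(total)
print(total, low, high)  # 0.40 0.30 0.50
print(saving_percent(STAGES["transcript"], Decimal("0.13")))  # 72
```

So a $0.40 estimate should be read as roughly $0.30 to $0.50, and the suggested model swap removes about 72% of the transcript stage's cost.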

For scripts and skills, use --json:

gencast estimate my-lecture.yaml --json
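From a Python script, the same command can be driven through subprocess. This sketch assumes that --json writes a single JSON document to stdout and that a failed estimate exits non-zero; neither is stated in the README, and the JSON schema is not documented here, so the result is returned as-is rather than unpacked into named fields. The helper names are hypothetical.

```python
import json
import subprocess
from typing import Any


def estimate_command(notebook: str) -> list[str]:
    """Build the argv for `gencast estimate <notebook> --json`."""
    return ["gencast", "estimate", notebook, "--json"]


def run_estimate(notebook: str) -> Any:
    """Run the estimate and parse stdout as JSON (assumed output format)."""
    result = subprocess.run(
        estimate_command(notebook),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


# Example (requires gencast on PATH):
#   data = run_estimate("my-lecture.yaml")
#   print(json.dumps(data, indent=2))
```

Passing the arguments as a list avoids shell quoting issues with notebook paths that contain spaces.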

For the rate table only (used by Claude Code skills via dynamic context injection):

gencast estimate --rates-only --json
gencast estimate --rates-only --provider anthropic --json
gencast estimate --rates-only --all-models --json   # all ~2,700 LiteLLM models

Tests

pytest tests/unit                          # fast, no API calls
pytest tests/component                     # vcrpy cassettes, no keys needed once recorded
GENCAST_TEST_E2E=1 pytest tests/e2e        # real API calls, costs a few cents
GENCAST_TEST_AUDIO=1 pytest tests/audio    # TTS + spatial audio (requires OPENAI_API_KEY)

Specs and design

Claude Code integration

gencast ships with a Claude Code plugin that exposes four skills for conversational use inside Claude Code. The plugin is bundled with the gencast Python package, so there is nothing extra to download once you have run pip install "gencast>=1.2.0".

In Claude Code, install the plugin once:

/plugin install gencast

Then trigger any of the four skills with natural language:

Skill              Example trigger                              What it does
-----------------  -------------------------------------------  ------------
notebook-init      "draft a gencast notebook from these notes"  Builds notebook.yaml conversationally; picks profiles from the bundled catalogue.
source-check       "are these sources good for a podcast?"      Token-counts sources and predicts USD cost via gencast estimate.
review-transcript  "review this gencast transcript"             Reads transcript.json, flags awkward phrasings + flow problems. Advisory only; does not auto-regenerate.
cost-explain       "explain my gencast cost.json"               Plain-language cost-by-stage breakdown with optimisation suggestions.

Skills are workflow recipes that shell out to the gencast CLI. The CLI remains the source of truth and is fully usable without Claude Code or the plugin.

Source for the skills lives at skills/ in the gencast repo — see .claude-plugin/plugin.json for the manifest.

License

MIT.
