gencast

Generate conversational podcasts from documents using AI. A cost-effective, customisable, local-first alternative to NotebookLM.

gencast notebook.yaml  ->  podcast.m4a (with embedded subtitles)

Install

pip install gencast

System dependency: ffmpeg (for audio combining and M4A muxing).

API keys (export or use gencast init to be prompted):

export OPENAI_API_KEY="sk-..."          # required (TTS + Whisper)
export ANTHROPIC_API_KEY="sk-ant-..."   # required (default outline + transcript)
export MISTRAL_API_KEY="..."            # optional (better PDF extraction)
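
Before a long run it can help to confirm the keys are actually exported. A minimal sketch, assuming only the variable names and required/optional status listed above; the helper name missing_keys is hypothetical and not part of gencast:

```python
import os

# Key names and their required/optional status come from the list above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")
OPTIONAL_KEYS = ("MISTRAL_API_KEY",)


def missing_keys(env=None, names=REQUIRED_KEYS):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing required keys:", ", ".join(missing))
    for name in missing_keys(names=OPTIONAL_KEYS):
        print("Optional key not set:", name)
```

If anything is reported missing, export it or let gencast init prompt for it.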

Quickstart

gencast init                        # interactive notebook wizard
gencast preview notebook.yaml       # outline-only dry run (free)
gencast generate notebook.yaml      # full pipeline -> out/<basename>.m4a

Or one-shot from a markdown file (uses default profiles):

gencast generate path/to/lecture.md

Three-axis profile system

Each notebook composes three orthogonal profiles:

speaker_profile: revision-duo       # WHO speaks (1-4 voices, personas)
episode_profile: exam-revision      # WHAT kind of podcast (briefing, segments, models)
room_profile:    small-room         # HOW it sounds (spatial pipeline)

List bundled profiles:

gencast list-profiles --type speakers
gencast list-profiles --type episodes
gencast list-profiles --type rooms

Profiles cascade: ./gencast/profiles/<kind>/<name>.yaml (project) -> ~/.config/gencast/profiles/<kind>/<name>.yaml (XDG) -> bundled defaults. Override per-notebook via the overrides: block in the notebook YAML.

Worked example

./photosynthesis/notebook.yaml:

title: Photosynthesis revision
sources:
  - lectures/photosynthesis.md
  - lectures/calvin-cycle.md
speaker_profile: revision-duo
episode_profile: exam-revision
room_profile: small-room
output:
  basename: photosynthesis-revision
  formats: [m4a]
overrides:
  briefing_suffix: |
    Pay specific attention to the distinction between the light-dependent
    reactions and the Calvin cycle. Include one worked Q&A on this distinction.

gencast generate photosynthesis/notebook.yaml
# -> photosynthesis/out/photosynthesis-revision.m4a
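
The output location in this example matches the out/<basename>.m4a pattern from the Quickstart, resolved next to the notebook file. A small sketch for scripts that need to find the result; expected_output is a hypothetical helper, and placing out/ beside the notebook is inferred from this one example:

```python
from pathlib import Path


def expected_output(notebook_path, basename, fmt="m4a"):
    """Guess where the generated file lands: <notebook dir>/out/<basename>.<fmt>."""
    return Path(notebook_path).parent / "out" / f"{basename}.{fmt}"


print(expected_output("photosynthesis/notebook.yaml", "photosynthesis-revision").as_posix())
```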

Cost

Typical 10-min podcast (~5K-token source, 6 segments, 2 speakers):

Component                       Default model        Cost
Outline                         claude-haiku-4-5     ~$0.005
Transcript (with prompt cache)  claude-sonnet-4-5    ~$0.10
TTS                             openai/tts-1-hd      ~$0.06
Subtitles                       native (no Whisper)  $0.00
Total                                                ~$0.17

Use --model overrides or different episode profiles to trade quality for cost.

Caches

  • TTS cache -- ~/.cache/gencast/tts/ -- always on. Re-runs cost only changed sentences.
  • LLM cache -- ~/.cache/gencast/llm/ -- opt-in via --cache-llm. Off by default since dialogue is non-deterministic.
  • PDF extract cache -- ~/.cache/gencast/extract/ -- always on for Mistral PDF extraction.
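
One way a per-sentence TTS cache can make re-runs pay only for changed sentences is to key each audio clip on a hash of its inputs. The sketch below illustrates that idea only; the key layout and both function names are assumptions, not gencast's documented cache format:

```python
import hashlib


def tts_cache_key(model, voice, sentence):
    """Hash the inputs that determine the audio (hypothetical key layout)."""
    payload = "\x1f".join((model, voice, sentence)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def sentences_to_synthesise(sentences, cached_keys, model, voice):
    """Keep only the sentences whose key is not already in the cache."""
    return [s for s in sentences if tts_cache_key(model, voice, s) not in cached_keys]


cached = {tts_cache_key("tts-1-hd", "alloy", "Plants absorb light.")}
script = ["Plants absorb light.", "The Calvin cycle fixes carbon."]
# Only the second sentence would need a new TTS call.
print(sentences_to_synthesise(script, cached, "tts-1-hd", "alloy"))
```

Under this scheme, editing one sentence changes one key, so every other clip is reused.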

Manage:

gencast cache status
gencast cache clear --type tts --yes

CLI reference

gencast NB.yaml                       generate (alias for `gencast generate NB.yaml`)
gencast init [--copy NB] [--minimal]  interactive notebook wizard
gencast preview NB.yaml               outline-only dry run
gencast generate NB.yaml              full pipeline -> m4a + sidecars
gencast estimate NB.yaml [--json] [--no-suggestions]
                                      predict USD cost before running. +-25% uncertainty.
gencast estimate --rates-only [--json]
                                      dump per-1k-token rates for bundled-default models.
gencast list-profiles [--type X]      enumerate profiles in cascade
gencast subtitle audio.mp3            re-subtitle external audio (Whisper)
gencast cache status [--type X]       inspect cache sizes
gencast cache clear [--type X] [--yes]

Verbosity: -v, -vv, -q, --silent, --log-file PATH.

Cost preview

Predict cost before generating:

gencast estimate my-lecture.yaml
# gencast estimate -- my-lecture.yaml
# ================================================================
# Source:    12,840 tokens  (1 file)
#
# Stage breakdown                                          est. USD
# ------------------------------------------------------  --------
# Extract                                                    $0.00
# Outline      claude-haiku-4-5      . 13.0k in              $0.04
# Transcript   claude-sonnet-4-5     . 6 segs/~1.4k          $0.18
# TTS          openai/tts-1-hd       . ~4,500 chars          $0.14
# Whisper      whisper-1             . ~6.0 min              $0.04
#                                                          --------
#                                                  Total:    $0.40
#                                                            +-25%
#
# Cheaper alternatives
#   transcript   claude-sonnet-4-5  -> claude-haiku-4-5  saves ~$0.13 (-72%)
#                  (quality trade-off -- see docs)
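
The figures in the sample output can be checked by hand. A short sketch, assuming the +-25% line means a symmetric band around the total and that the saving percentage is relative to the transcript stage:

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-stage estimates copied from the sample output above.
stages = {
    "Extract": Decimal("0.00"),
    "Outline": Decimal("0.04"),
    "Transcript": Decimal("0.18"),
    "TTS": Decimal("0.14"),
    "Whisper": Decimal("0.04"),
}

total = sum(stages.values(), Decimal("0"))
low, high = total * Decimal("0.75"), total * Decimal("1.25")

# Saving from the cheaper transcript model, relative to the transcript stage.
saving_pct = (Decimal("0.13") / stages["Transcript"] * 100).quantize(
    Decimal("1"), rounding=ROUND_HALF_UP
)

print(f"total ${total}, range ${low:.2f} to ${high:.2f}, transcript saving {saving_pct}%")
```

So the $0.40 estimate corresponds to roughly $0.30 to $0.50 under that reading.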

For scripts and skills, use --json:

gencast estimate my-lecture.yaml --json

For the rate table only (used by Claude Code skills via dynamic context injection):

gencast estimate --rates-only --json
gencast estimate --rates-only --provider anthropic --json
gencast estimate --rates-only --all-models --json   # all ~2,700 LiteLLM models
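
From a script, the JSON forms can be consumed with the standard library. A minimal sketch, assuming the JSON document is written to stdout; the JSON schema is not described here, so the example returns the parsed object without assuming any keys. estimate_command and run_estimate are hypothetical helper names:

```python
import json
import subprocess


def estimate_command(notebook=None, rates_only=False):
    """Build the argv for `gencast estimate ... --json` using flags shown above."""
    cmd = ["gencast", "estimate"]
    if rates_only:
        cmd.append("--rates-only")
    else:
        cmd.append(str(notebook))
    cmd.append("--json")
    return cmd


def run_estimate(notebook=None, rates_only=False, runner=subprocess.run):
    """Run the command and parse stdout as JSON (assumes JSON goes to stdout)."""
    result = runner(
        estimate_command(notebook, rates_only),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(result.stdout)
```

Calling run_estimate("my-lecture.yaml") requires gencast on PATH; the runner parameter exists so the call can be stubbed in tests.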

Tests

pytest tests/unit                          # fast, no API calls
pytest tests/component                     # vcrpy cassettes, no keys needed once recorded
GENCAST_TEST_E2E=1 pytest tests/e2e        # real API calls, costs a few cents
GENCAST_TEST_AUDIO=1 pytest tests/audio    # TTS + spatial audio (requires OPENAI_API_KEY)
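
The environment-variable gates above follow a common opt-in pattern for tests that cost money. The sketch below shows one way to write such a gate with the standard library (pytest also collects unittest classes); it is not taken from the gencast test suite, and flag_enabled is a hypothetical helper:

```python
import os
import unittest


def flag_enabled(name, env=None):
    """True only when the variable is set to exactly "1"."""
    env = os.environ if env is None else env
    return env.get(name) == "1"


@unittest.skipUnless(flag_enabled("GENCAST_TEST_E2E"), "set GENCAST_TEST_E2E=1 to run")
class EndToEndTest(unittest.TestCase):
    def test_placeholder(self):
        # A real end-to-end test would call the paid APIs here.
        self.assertTrue(True)
```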

Claude Code integration

gencast ships with a Claude Code plugin that exposes four skills for conversational use inside Claude Code. The plugin is bundled with the gencast Python package, so there is nothing extra to download once you have run pip install "gencast>=1.2.0" (quote the specifier so the shell does not treat > as a redirect).

In Claude Code, install the plugin once:

/plugin install gencast

Then trigger any of the four skills with natural language:

Skill -- example trigger -- what it does:

  • notebook-init -- "draft a gencast notebook from these notes" -- Builds notebook.yaml conversationally; picks profiles from the bundled catalogue.
  • source-check -- "are these sources good for a podcast?" -- Token-counts sources and predicts USD cost via gencast estimate.
  • review-transcript -- "review this gencast transcript" -- Reads transcript.json, flags awkward phrasings and flow problems. Advisory only; does not auto-regenerate.
  • cost-explain -- "explain my gencast cost.json" -- Plain-language cost-by-stage breakdown with optimisation suggestions.

Skills are workflow recipes that shell out to the gencast CLI. The CLI remains the source of truth and is fully usable without Claude Code or the plugin.

The source for the skills lives in skills/ in the gencast repo; see .claude-plugin/plugin.json for the manifest.

License

MIT.


