gencast

Generate conversational podcasts from documents using AI. A cost-effective, customisable, local-first alternative to NotebookLM.

gencast notebook.yaml  ->  podcast.m4a (with embedded subtitles)

Install

pip install gencast

System dependency: ffmpeg (for audio combining and M4A muxing).

API keys (export them yourself, or run gencast init to be prompted):

export OPENAI_API_KEY="sk-..."          # required (TTS + Whisper)
export ANTHROPIC_API_KEY="sk-ant-..."   # required (default outline + transcript)
export MISTRAL_API_KEY="..."            # optional (better PDF extraction)
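
A pre-flight check can catch a missing key before a run starts. The following is a minimal sketch and not part of gencast: the helper name missing_keys is hypothetical, and the required/optional split is taken from the comments on the export lines above.

```python
import os

# Key names and their required/optional status come from the export lines above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")
OPTIONAL_KEYS = ("MISTRAL_API_KEY",)


def missing_keys(env=None):
    """Return the required key names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing required keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```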

Quickstart

gencast init                        # interactive notebook wizard
gencast preview notebook.yaml       # outline-only dry run (free)
gencast generate notebook.yaml      # full pipeline -> out/<basename>.m4a

Or one-shot from a markdown file (uses default profiles):

gencast generate path/to/lecture.md

Three-axis profile system

Each notebook composes three orthogonal profiles:

speaker_profile: revision-duo       # WHO speaks (1-4 voices, personas)
episode_profile: exam-revision      # WHAT kind of podcast (briefing, segments, models)
room_profile:    small-room         # HOW it sounds (spatial pipeline)

List bundled profiles:

gencast list-profiles --type speakers
gencast list-profiles --type episodes
gencast list-profiles --type rooms

Profiles cascade: ./gencast/profiles/<kind>/<name>.yaml (project) -> ~/.config/gencast/profiles/<kind>/<name>.yaml (XDG) -> bundled defaults. Override per-notebook via the overrides: block in the notebook YAML.
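
The lookup order above (project, then XDG, then bundled defaults) can be sketched as a first-match search. This is an illustration only: resolve_profile is a hypothetical helper, the first-match-wins behaviour is an assumption (gencast may merge layers instead), and the layout of the bundled directory is assumed.

```python
from pathlib import Path


def resolve_profile(kind, name, project_dir, config_home, bundled_dir):
    """Return the first existing <kind>/<name>.yaml in cascade order, or None."""
    filename = f"{name}.yaml"
    candidates = [
        # Project-level profile, relative to the working directory.
        Path(project_dir) / "gencast" / "profiles" / kind / filename,
        # User-level profile under the XDG config home (typically ~/.config).
        Path(config_home) / "gencast" / "profiles" / kind / filename,
        # Bundled defaults; this directory layout is an assumption.
        Path(bundled_dir) / kind / filename,
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Passing the three roots in as arguments keeps the sketch testable without touching a real home directory.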

Worked example

./photosynthesis/notebook.yaml:

title: Photosynthesis revision
sources:
  - lectures/photosynthesis.md
  - lectures/calvin-cycle.md
speaker_profile: revision-duo
episode_profile: exam-revision
room_profile: small-room
output:
  basename: photosynthesis-revision
  formats: [m4a]
overrides:
  briefing_suffix: |
    Pay specific attention to the distinction between the light-dependent
    reactions and the Calvin cycle. Include one worked Q&A on this distinction.

Then generate:

gencast generate photosynthesis/notebook.yaml
# -> photosynthesis/out/photosynthesis-revision.m4a
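
The output location in the example follows a simple pattern: an out/ directory next to the notebook, with the file named after output.basename. A minimal sketch of that pattern, assuming it holds in general (output_path is a hypothetical helper, not a gencast function):

```python
from pathlib import Path


def output_path(notebook, basename, fmt="m4a"):
    """Build <notebook dir>/out/<basename>.<fmt>, mirroring the example above."""
    return Path(notebook).parent / "out" / f"{basename}.{fmt}"


# as_posix() keeps the printed form identical across platforms.
print(output_path("photosynthesis/notebook.yaml", "photosynthesis-revision").as_posix())
```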

Cost

Typical 10-min podcast (~5K-token source, 6 segments, 2 speakers):

Component                        Default model         Cost
Outline                          claude-haiku-4-5      ~$0.005
Transcript (with prompt cache)   claude-sonnet-4-5     ~$0.10
TTS                              openai/tts-1-hd       ~$0.06
Subtitles                        native (no Whisper)   $0.00
Total                                                  ~$0.17

Use --model overrides or different episode profiles to trade quality for cost.
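
The total in the table is the sum of the per-stage figures, rounded to whole cents. The sketch below reproduces that arithmetic; the stage figures are copied from the table, while the dictionary and function names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-stage figures copied from the table above (USD, approximate).
STAGE_COSTS = {
    "outline": Decimal("0.005"),
    "transcript": Decimal("0.10"),
    "tts": Decimal("0.06"),
    "subtitles": Decimal("0.00"),
}


def total_cost(costs):
    """Sum stage costs and round half-up to whole cents."""
    raw = sum(costs.values(), Decimal("0"))
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(total_cost(STAGE_COSTS))
```

Decimal is used rather than float so that the half-cent outline cost rounds predictably.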

Caches

  • TTS cache -- ~/.cache/gencast/tts/ -- always on. Re-runs pay only for sentences that changed.
  • LLM cache -- ~/.cache/gencast/llm/ -- opt-in via --cache-llm. Off by default since dialogue is non-deterministic.
  • PDF extract cache -- ~/.cache/gencast/extract/ -- always on for Mistral PDF extraction.
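
One way a per-sentence TTS cache like the one described above could work is to key each clip on a hash of everything that affects the audio. This is a sketch under that assumption, not gencast's actual scheme: cache_key, synthesize_cached, the .bin suffix, and the synthesize callable (a stand-in for the real TTS request) are all hypothetical.

```python
import hashlib
from pathlib import Path


def cache_key(model, voice, sentence):
    """Hash every input that affects the audio, so any change yields a new key."""
    payload = "\x1f".join((model, voice, sentence)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def synthesize_cached(sentences, model, voice, cache_dir, synthesize):
    """Return audio bytes per sentence, calling synthesize only on cache misses."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    clips = []
    for sentence in sentences:
        path = cache_dir / f"{cache_key(model, voice, sentence)}.bin"
        if not path.exists():
            # Cache miss: this is the only step that would incur an API charge.
            path.write_bytes(synthesize(sentence))
        clips.append(path.read_bytes())
    return clips
```

With this design, editing one sentence of a transcript changes only that sentence's key, so every other clip is served from disk on the next run.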

Manage:

gencast cache status
gencast cache clear --type tts --yes

CLI reference

gencast NB.yaml                       generate (alias for `gencast generate NB.yaml`)
gencast init [--copy NB] [--minimal]  interactive notebook wizard
gencast preview NB.yaml               outline-only dry run
gencast generate NB.yaml              full pipeline -> m4a + sidecars
gencast estimate NB.yaml [--json] [--no-suggestions]
                                      predict USD cost before running. +-25% uncertainty.
gencast estimate --rates-only [--json]
                                      dump per-1k-token rates for bundled-default models.
gencast list-profiles [--type X]      enumerate profiles in cascade
gencast subtitle audio.mp3            re-subtitle external audio (Whisper)
gencast cache status [--type X]       inspect cache sizes
gencast cache clear [--type X] [--yes]

Verbosity: -v, -vv, -q, --silent, --log-file PATH.

Cost preview

Predict cost before generating:

gencast estimate my-lecture.yaml
# gencast estimate -- my-lecture.yaml
# ================================================================
# Source:    12,840 tokens  (1 file)
#
# Stage breakdown                                          est. USD
# ------------------------------------------------------  --------
# Extract                                                    $0.00
# Outline      claude-haiku-4-5      . 13.0k in              $0.04
# Transcript   claude-sonnet-4-5     . 6 segs/~1.4k          $0.18
# TTS          openai/tts-1-hd       . ~4,500 chars          $0.14
# Whisper      whisper-1             . ~6.0 min              $0.04
#                                                          --------
#                                                  Total:    $0.40
#                                                            +-25%
#
# Cheaper alternatives
#   transcript   claude-sonnet-4-5  -> claude-haiku-4-5  saves ~$0.13 (-72%)
#                  (quality trade-off -- see docs)
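
The figures in the sample output can be checked by hand. The sketch below sums the stage estimates, applies the stated 25% band to the total (treating it as total x 0.75 to total x 1.25 is an assumption about how the band is meant), and recomputes the transcript saving as a share of that stage.

```python
from decimal import Decimal

# Stage estimates copied from the sample output above (USD).
stages = {
    "extract": Decimal("0.00"),
    "outline": Decimal("0.04"),
    "transcript": Decimal("0.18"),
    "tts": Decimal("0.14"),
    "whisper": Decimal("0.04"),
}

total = sum(stages.values(), Decimal("0"))
# Assumed reading of the uncertainty line: 25% either side of the total.
low, high = total * Decimal("0.75"), total * Decimal("1.25")
print(f"total ${total}, range ${low:.2f} to ${high:.2f}")

# Saving quoted under "Cheaper alternatives", as a share of the transcript stage.
saving = Decimal("0.13")
print(f"saving is {saving / stages['transcript']:.0%} of the transcript stage")
```

The total comes to $0.40 with a range of $0.30 to $0.50, and the $0.13 saving is 72% of the $0.18 transcript stage, matching the sample output.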

For scripts and skills, use --json:

gencast estimate my-lecture.yaml --json

For the rate table only (used by Claude Code skills via dynamic context injection):

gencast estimate --rates-only --json
gencast estimate --rates-only --provider anthropic --json
gencast estimate --rates-only --all-models --json   # all ~2,700 LiteLLM models

Tests

pytest tests/unit                          # fast, no API calls
pytest tests/component                     # vcrpy cassettes, no keys needed once recorded
GENCAST_TEST_E2E=1 pytest tests/e2e        # real API calls, costs a few cents
GENCAST_TEST_AUDIO=1 pytest tests/audio    # TTS + spatial audio (requires OPENAI_API_KEY)

Claude Code integration

gencast ships with a Claude Code plugin that exposes four skills for conversational use inside Claude Code. The plugin is bundled with the gencast Python package, so no separate package install is needed once you have run pip install "gencast>=1.2.0" (quote the requirement so the shell does not treat >= as a redirect).

In Claude Code, install the plugin once:

/plugin install gencast

Then trigger any of the four skills with natural language:

Skill              Example trigger                                What it does
notebook-init      "draft a gencast notebook from these notes"    Builds notebook.yaml conversationally; picks profiles from the bundled catalogue.
source-check       "are these sources good for a podcast?"        Token-counts sources and predicts USD cost via gencast estimate.
review-transcript  "review this gencast transcript"               Reads transcript.json, flags awkward phrasings + flow problems. Advisory only; does not auto-regenerate.
cost-explain       "explain my gencast cost.json"                 Plain-language cost-by-stage breakdown with optimisation suggestions.

Skills are workflow recipes that shell out to the gencast CLI. The CLI remains the source of truth and is fully usable without Claude Code or the plugin.

Source for the skills lives at skills/ in the gencast repo — see .claude-plugin/plugin.json for the manifest.

License

MIT.
