gencast

Generate conversational podcasts from documents using AI. A cost-effective, customisable, local-first alternative to NotebookLM.

gencast notebook.yaml  ->  podcast.m4a (with embedded subtitles)

Install

pip install gencast

System dependency: ffmpeg (for audio combining and M4A muxing).

API keys (export or use gencast init to be prompted):

export OPENAI_API_KEY="sk-..."          # required (TTS + Whisper)
export ANTHROPIC_API_KEY="sk-ant-..."   # required (default outline + transcript)
export MISTRAL_API_KEY="..."            # optional (better PDF extraction)

Quickstart

gencast init                        # interactive notebook wizard
gencast preview notebook.yaml       # outline-only dry run (free)
gencast generate notebook.yaml      # full pipeline -> out/<basename>.m4a

Or one-shot from a markdown file (uses default profiles):

gencast generate path/to/lecture.md

Three-axis profile system

Each notebook composes three orthogonal profiles:

speaker_profile: revision-duo       # WHO speaks (1-4 voices, personas)
episode_profile: exam-revision      # WHAT kind of podcast (briefing, segments, models)
room_profile:    small-room         # HOW it sounds (spatial pipeline)

List bundled profiles:

gencast list-profiles --type speakers
gencast list-profiles --type episodes
gencast list-profiles --type rooms

Profiles cascade:

./gencast/profiles/<kind>/<name>.yaml (project)
-> ~/.config/gencast/profiles/<kind>/<name>.yaml (XDG)
-> bundled defaults

Override per notebook via an overrides: block in the notebook YAML.
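
The lookup order above can be sketched in a few lines of Python. This is a minimal illustration, not gencast's actual resolver: the function name resolve_profile, the bundled_root layout, and the rule that the first existing file wins are all assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def resolve_profile(
    kind: str,
    name: str,
    project_root: Path,
    xdg_config_home: Path,
    bundled_root: Path,
) -> Optional[Path]:
    """Return the first profile file found, searching project, then XDG, then bundled.

    Hypothetical helper: the bundled_root/<kind>/<name>.yaml layout is assumed.
    """
    candidates = [
        project_root / "gencast" / "profiles" / kind / f"{name}.yaml",     # project
        xdg_config_home / "gencast" / "profiles" / kind / f"{name}.yaml",  # XDG
        bundled_root / kind / f"{name}.yaml",                              # bundled defaults
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # no profile with that name at any level
```

Under these assumptions, a project-level small-room.yaml would shadow a bundled profile of the same name, which is the usual reason for a cascade like this.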

Worked example

./photosynthesis/notebook.yaml:

title: Photosynthesis revision
sources:
  - lectures/photosynthesis.md
  - lectures/calvin-cycle.md
speaker_profile: revision-duo
episode_profile: exam-revision
room_profile: small-room
output:
  basename: photosynthesis-revision
  formats: [m4a]
overrides:
  briefing_suffix: |
    Pay specific attention to the distinction between the light-dependent
    reactions and the Calvin cycle. Include one worked Q&A on this distinction.

gencast generate photosynthesis/notebook.yaml
# -> photosynthesis/out/photosynthesis-revision.m4a
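
For scripts that need to locate the result, the output path can be derived from the notebook location and output.basename. The sketch below assumes the out/<basename>.<format> convention shown in this README always holds relative to the notebook's directory; expected_output is a hypothetical helper, not part of gencast.

```python
from pathlib import Path


def expected_output(notebook: Path, basename: str, fmt: str = "m4a") -> Path:
    """Guess where the generated file lands: <notebook dir>/out/<basename>.<fmt>."""
    return notebook.parent / "out" / f"{basename}.{fmt}"


path = expected_output(Path("photosynthesis/notebook.yaml"), "photosynthesis-revision")
print(path.as_posix())  # photosynthesis/out/photosynthesis-revision.m4a
```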

Cost

Typical 10-min podcast (~5K-token source, 6 segments, 2 speakers):

Component                        Default model         Cost
Outline                          claude-haiku-4-5      ~$0.005
Transcript (with prompt cache)   claude-sonnet-4-5     ~$0.10
TTS                              openai/tts-1-hd       ~$0.06
Subtitles                        native (no Whisper)   $0.00
Total                                                  ~$0.17
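
As a quick check of the arithmetic, the per-stage figures can be summed with Decimal to avoid floating-point rounding surprises. The dictionary keys are labels chosen for this example; the amounts are the approximate figures quoted in this section.

```python
from decimal import Decimal, ROUND_HALF_UP

# Approximate per-stage costs in USD, as quoted in this section.
costs = {
    "outline": Decimal("0.005"),
    "transcript": Decimal("0.10"),
    "tts": Decimal("0.06"),
    "subtitles": Decimal("0.00"),
}

total = sum(costs.values(), Decimal("0"))  # exact sum: 0.165
rounded = total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
largest = max(costs, key=costs.get)

print(f"total ~${rounded}, largest component: {largest}")
# total ~$0.17, largest component: transcript
```

The transcript stage dominates, which is why swapping its model is the first lever to try when cutting cost.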

Use --model overrides or different episode profiles to trade quality for cost.

Caches

  • TTS cache -- ~/.cache/gencast/tts/ -- always on. Re-runs are billed only for changed sentences.
  • LLM cache -- ~/.cache/gencast/llm/ -- opt-in via --cache-llm. Off by default since dialogue is non-deterministic.
  • PDF extract cache -- ~/.cache/gencast/extract/ -- always on for Mistral PDF extraction.
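
One common way to get "only changed sentences are re-synthesised" behaviour is a content-addressed cache keyed on a hash of the sentence, voice, and model. The sketch below illustrates that idea only; gencast's real key format and on-disk layout are not documented here, and tts_cache_key, sentences_to_synthesise, and the voice name "alloy" are assumptions for the example.

```python
import hashlib
from typing import Iterable, List, Set


def tts_cache_key(sentence: str, voice: str, model: str = "openai/tts-1-hd") -> str:
    """Hypothetical cache key: SHA-256 over model, voice, and sentence text."""
    payload = "\x1f".join((model, voice, sentence)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def sentences_to_synthesise(
    sentences: Iterable[str], voice: str, cached_keys: Set[str]
) -> List[str]:
    """Return only the sentences whose key is not already in the cache."""
    return [s for s in sentences if tts_cache_key(s, voice) not in cached_keys]


# First run: both sentences are synthesised and their keys cached.
first_run = ["Plants absorb light.", "The Calvin cycle fixes carbon."]
cache = {tts_cache_key(s, "alloy") for s in first_run}

# Second run: one sentence was edited, so only that one needs a new TTS call.
second_run = ["Plants absorb light.", "The Calvin cycle fixes carbon dioxide."]
todo = sentences_to_synthesise(second_run, "alloy", cache)
print(todo)  # ['The Calvin cycle fixes carbon dioxide.']
```

Because the key includes the voice and model, changing either would also invalidate the cached audio in this scheme.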

Manage:

gencast cache status
gencast cache clear --type tts --yes

CLI reference

gencast NB.yaml                       generate (alias for `gencast generate NB.yaml`)
gencast init [--copy NB] [--minimal]  interactive notebook wizard
gencast preview NB.yaml               outline-only dry run
gencast generate NB.yaml              full pipeline -> m4a + sidecars
gencast estimate NB.yaml [--json] [--no-suggestions]
                                      predict USD cost before running. +-25% uncertainty.
gencast estimate --rates-only [--json]
                                      dump per-1k-token rates for bundled-default models.
gencast list-profiles [--type X]      enumerate profiles in cascade
gencast subtitle audio.mp3            re-subtitle external audio (Whisper)
gencast cache status [--type X]       inspect cache sizes
gencast cache clear [--type X] [--yes]

Verbosity: -v, -vv, -q, --silent, --log-file PATH.

Cost preview

Predict cost before generating:

gencast estimate my-lecture.yaml
# gencast estimate -- my-lecture.yaml
# ================================================================
# Source:    12,840 tokens  (1 file)
#
# Stage breakdown                                          est. USD
# ------------------------------------------------------  --------
# Extract                                                    $0.00
# Outline      claude-haiku-4-5      . 13.0k in              $0.04
# Transcript   claude-sonnet-4-5     . 6 segs/~1.4k          $0.18
# TTS          openai/tts-1-hd       . ~4,500 chars          $0.14
# Whisper      whisper-1             . ~6.0 min              $0.04
#                                                          --------
#                                                  Total:    $0.40
#                                                            +-25%
#
# Cheaper alternatives
#   transcript   claude-sonnet-4-5  -> claude-haiku-4-5  saves ~$0.13 (-72%)
#                  (quality trade-off -- see docs)
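
To read the numbers in that report: the +-25% line turns the $0.40 total into a range, and the "saves" percentage is the saving divided by the current stage cost. A small sketch of both calculations follows; cost_band is a name made up for this example, and the symmetric band is an assumption about how the uncertainty is meant to be applied.

```python
from typing import Tuple


def cost_band(total_usd: float, uncertainty: float = 0.25) -> Tuple[float, float]:
    """Return (low, high) bounds for an estimate with symmetric relative uncertainty."""
    return total_usd * (1 - uncertainty), total_usd * (1 + uncertainty)


low, high = cost_band(0.40)
print(f"${low:.2f} to ${high:.2f}")  # $0.30 to $0.50

# Saving from switching the transcript model: $0.13 off a $0.18 stage.
saving_pct = round(100 * 0.13 / 0.18)
print(f"-{saving_pct}%")  # -72%
```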

For scripts and skills, use --json:

gencast estimate my-lecture.yaml --json
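
A script can call this from Python with subprocess and parse the result. The sketch below assumes the JSON document is written to stdout; the JSON schema is not described in this README, so the parsed object is returned as-is without assuming any keys. estimate_command and run_estimate are hypothetical wrapper names.

```python
import json
import subprocess
from typing import Any, List


def estimate_command(notebook: str, suggestions: bool = True) -> List[str]:
    """Build the argv for `gencast estimate <notebook> --json`."""
    cmd = ["gencast", "estimate", notebook, "--json"]
    if not suggestions:
        cmd.append("--no-suggestions")  # flag listed in the CLI reference
    return cmd


def run_estimate(notebook: str) -> Any:
    """Run the estimate and return the parsed JSON (assumed to be on stdout)."""
    result = subprocess.run(
        estimate_command(notebook),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling run_estimate("my-lecture.yaml") requires gencast on PATH; inspect the returned object once to learn its structure before relying on specific fields.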

For the rate table only (used by Claude Code skills via dynamic context injection):

gencast estimate --rates-only --json
gencast estimate --rates-only --provider anthropic --json
gencast estimate --rates-only --all-models --json   # all ~2,700 LiteLLM models

Tests

pytest tests/unit                          # fast, no API calls
pytest tests/component                     # vcrpy cassettes, no keys needed once recorded
GENCAST_TEST_E2E=1 pytest tests/e2e        # real API calls, costs a few cents
GENCAST_TEST_AUDIO=1 pytest tests/audio    # TTS + spatial audio (requires OPENAI_API_KEY)

Specs and design

Claude Code integration

gencast ships with a Claude Code plugin that exposes four skills for conversational use inside Claude Code. The plugin is bundled with the gencast Python package, so there is no separate package to install once you have run pip install "gencast>=1.2.0" (quote the requirement so the shell does not treat >= as a redirect).

In Claude Code, install the plugin once:

/plugin install gencast

Then trigger any of the four skills with natural language:

Skill -- example trigger -- what it does:

  • notebook-init -- "draft a gencast notebook from these notes" -- Builds notebook.yaml conversationally; picks profiles from the bundled catalogue.
  • source-check -- "are these sources good for a podcast?" -- Token-counts sources and predicts USD cost via gencast estimate.
  • review-transcript -- "review this gencast transcript" -- Reads transcript.json and flags awkward phrasings and flow problems. Advisory only; does not auto-regenerate.
  • cost-explain -- "explain my gencast cost.json" -- Plain-language cost-by-stage breakdown with optimisation suggestions.

Skills are workflow recipes that shell out to the gencast CLI. The CLI remains the source of truth and is fully usable without Claude Code or the plugin.

Source for the skills lives at skills/ in the gencast repo — see .claude-plugin/plugin.json for the manifest.

License

MIT.
