gencast

Generate conversational podcasts from documents using AI. A cost-effective, customisable, local-first alternative to NotebookLM.

gencast notebook.yaml  ->  podcast.m4a (with embedded subtitles)

Install

pip install gencast

System dependency: ffmpeg (for audio combining and M4A muxing).

API keys (export or use gencast init to be prompted):

export OPENAI_API_KEY="sk-..."          # required (TTS + Whisper)
export ANTHROPIC_API_KEY="sk-ant-..."   # required (default outline + transcript)
export MISTRAL_API_KEY="..."            # optional (better PDF extraction)
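
A pre-flight check can catch a missing key before a run starts. The sketch below is illustrative and not part of gencast: the helper name `missing_keys` is hypothetical, and it only inspects the environment variables listed above.

```python
import os

# Key names come from the export lines above; the required/optional split
# mirrors their comments.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")
OPTIONAL_KEYS = ("MISTRAL_API_KEY",)


def missing_keys(names, env=None):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    for name in missing_keys(REQUIRED_KEYS):
        print(f"missing required key: {name}")
    for name in missing_keys(OPTIONAL_KEYS):
        print(f"optional key not set: {name}")
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.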

Quickstart

gencast init                        # interactive notebook wizard
gencast preview notebook.yaml       # outline-only dry run (free)
gencast generate notebook.yaml      # full pipeline -> out/<basename>.m4a

Or one-shot from a markdown file (uses default profiles):

gencast generate path/to/lecture.md

Three-axis profile system

Each notebook composes three orthogonal profiles:

speaker_profile: revision-duo       # WHO speaks (1-4 voices, personas)
episode_profile: exam-revision      # WHAT kind of podcast (briefing, segments, models)
room_profile:    small-room         # HOW it sounds (spatial pipeline)

List bundled profiles:

gencast list-profiles --type speakers
gencast list-profiles --type episodes
gencast list-profiles --type rooms

Profiles cascade in this order: ./gencast/profiles/<kind>/<name>.yaml (project) -> ~/.config/gencast/profiles/<kind>/<name>.yaml (XDG) -> bundled defaults. Override per-notebook via an overrides: block in the notebook YAML.
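
One way to picture the cascade is as a path lookup. The sketch below assumes the first existing file wins and that bundled profiles sit in a directory laid out as <kind>/<name>.yaml; both are assumptions for illustration, and `resolve_profile` is a hypothetical helper, not gencast's actual loader.

```python
import tempfile
from pathlib import Path


def resolve_profile(kind, name, project_root, xdg_root, bundled_root):
    """Return the first existing <kind>/<name>.yaml along the cascade, or None."""
    filename = f"{name}.yaml"
    candidates = [
        # Project-level profile: ./gencast/profiles/<kind>/<name>.yaml
        Path(project_root) / "gencast" / "profiles" / kind / filename,
        # User-level profile: ~/.config/gencast/profiles/<kind>/<name>.yaml
        Path(xdg_root) / "gencast" / "profiles" / kind / filename,
        # Bundled default (directory layout assumed for this sketch)
        Path(bundled_root) / kind / filename,
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        bundled = root / "bundled" / "speakers" / "revision-duo.yaml"
        bundled.parent.mkdir(parents=True)
        bundled.write_text("# bundled default\n")
        found = resolve_profile(
            "speakers", "revision-duo", root / "proj", root / "xdg", root / "bundled"
        )
        print(found == bundled)  # True: nothing overrides the bundled file
```

The <kind> directory name ("speakers" here) is a guess based on the --type values shown above.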

Worked example

./photosynthesis/notebook.yaml:

title: Photosynthesis revision
sources:
  - lectures/photosynthesis.md
  - lectures/calvin-cycle.md
speaker_profile: revision-duo
episode_profile: exam-revision
room_profile: small-room
output:
  basename: photosynthesis-revision
  formats: [m4a]
overrides:
  briefing_suffix: |
    Pay specific attention to the distinction between the light-dependent
    reactions and the Calvin cycle. Include one worked Q&A on this distinction.

Generate the episode:

gencast generate photosynthesis/notebook.yaml
# -> photosynthesis/out/photosynthesis-revision.m4a

Cost

Typical 10-min podcast (~5K-token source, 6 segments, 2 speakers):

Component                        Default model         Cost
Outline                          claude-haiku-4-5      ~$0.005
Transcript (with prompt cache)   claude-sonnet-4-5     ~$0.10
TTS                              openai/tts-1-hd       ~$0.06
Subtitles                        native (no Whisper)   $0.00
Total                                                  ~$0.17

Use --model overrides or different episode profiles to trade quality for cost.

Caches

  • TTS cache -- ~/.cache/gencast/tts/ -- always on. Re-runs cost only changed sentences.
  • LLM cache -- ~/.cache/gencast/llm/ -- opt-in via --cache-llm. Off by default since dialogue is non-deterministic.
  • PDF extract cache -- ~/.cache/gencast/extract/ -- always on for Mistral PDF extraction.
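
The "re-runs cost only changed sentences" behaviour is what a content-addressed cache gives you. The sketch below shows the general idea under stated assumptions: the key scheme (SHA-256 over model, voice, and sentence), the `.bin` file extension, and the `synthesize` callback are all hypothetical, not gencast's actual cache format.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(sentence, voice, model):
    """Hash everything that affects the audio, so any change yields a new key."""
    payload = "\x1f".join((model, voice, sentence)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def synthesize_cached(sentence, voice, model, cache_dir, synthesize):
    """Return audio bytes, calling `synthesize` only on a cache miss."""
    path = Path(cache_dir) / f"{cache_key(sentence, voice, model)}.bin"
    if path.is_file():
        return path.read_bytes()
    audio = synthesize(sentence, voice, model)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio


if __name__ == "__main__":
    calls = []

    def fake_tts(sentence, voice, model):
        # Stand-in for a paid TTS request; records each call.
        calls.append(sentence)
        return sentence.encode("utf-8")

    with tempfile.TemporaryDirectory() as tmp:
        for text in ("Hello.", "Hello.", "Hello again."):
            synthesize_cached(text, "alloy", "tts-1-hd", tmp, fake_tts)
    print(len(calls))  # 2: the repeated sentence was served from the cache
```

Because the key covers the voice and model as well as the text, switching speaker profile would invalidate entries under this scheme even if the transcript is unchanged.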

Manage:

gencast cache status
gencast cache clear --type tts --yes

CLI reference

gencast NB.yaml                       generate (alias for `gencast generate NB.yaml`)
gencast init [--copy NB] [--minimal]  interactive notebook wizard
gencast preview NB.yaml               outline-only dry run
gencast generate NB.yaml              full pipeline -> m4a + sidecars
gencast estimate NB.yaml [--json] [--no-suggestions]
                                      predict USD cost before running. +-25% uncertainty.
gencast estimate --rates-only [--json]
                                      dump per-1k-token rates for bundled-default models.
gencast list-profiles [--type X]      enumerate profiles in cascade
gencast subtitle audio.mp3            re-subtitle external audio (Whisper)
gencast cache status [--type X]       inspect cache sizes
gencast cache clear [--type X] [--yes]

Verbosity: -v, -vv, -q, --silent, --log-file PATH.

Cost preview

Predict cost before generating:

gencast estimate my-lecture.yaml
# gencast estimate -- my-lecture.yaml
# ================================================================
# Source:    12,840 tokens  (1 file)
#
# Stage breakdown                                          est. USD
# ------------------------------------------------------  --------
# Extract                                                    $0.00
# Outline      claude-haiku-4-5      . 13.0k in              $0.04
# Transcript   claude-sonnet-4-5     . 6 segs/~1.4k          $0.18
# TTS          openai/tts-1-hd       . ~4,500 chars          $0.14
# Whisper      whisper-1             . ~6.0 min              $0.04
#                                                          --------
#                                                  Total:    $0.40
#                                                            +-25%
#
# Cheaper alternatives
#   transcript   claude-sonnet-4-5  -> claude-haiku-4-5  saves ~$0.13 (-72%)
#                  (quality trade-off -- see docs)
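
The numbers in the sample output can be checked by hand. The sketch below copies the stage figures from that output and recomputes the total, the +-25% band, and the quoted saving; the helper names are illustrative and nothing here calls gencast.

```python
from decimal import Decimal

# Stage estimates copied from the sample output above (USD).
STAGES = {
    "extract": Decimal("0.00"),
    "outline": Decimal("0.04"),
    "transcript": Decimal("0.18"),
    "tts": Decimal("0.14"),
    "whisper": Decimal("0.04"),
}


def total(stages):
    """Sum all stage estimates."""
    return sum(stages.values(), Decimal("0"))


def band(amount, pct=Decimal("25")):
    """Return (low, high) for amount +- pct percent."""
    delta = amount * pct / 100
    return amount - delta, amount + delta


def saving_percent(saving, stage_cost):
    """Saving as a percentage of the stage's current cost."""
    return saving / stage_cost * 100


if __name__ == "__main__":
    t = total(STAGES)
    low, high = band(t)
    print(f"{t:.2f}")                  # 0.40
    print(f"{low:.2f} to {high:.2f}")  # 0.30 to 0.50
    print(f"{saving_percent(Decimal('0.13'), STAGES['transcript']):.0f}%")  # 72%
```

Read this way, the sample run should land somewhere between roughly $0.30 and $0.50, and moving the transcript stage to the cheaper model removes about a third of the total.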

For scripts and skills, use --json:

gencast estimate my-lecture.yaml --json
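
From a script, the JSON form can be consumed with the standard library. This is a minimal sketch: it assumes the JSON document is written to stdout and makes no assumption about its schema. The function names are hypothetical; only the `estimate`, `--json`, and `--no-suggestions` arguments come from the CLI reference above.

```python
import json
import shutil
import subprocess


def build_estimate_cmd(notebook, suggestions=True):
    """Assemble the argv for `gencast estimate <notebook> --json`."""
    cmd = ["gencast", "estimate", str(notebook), "--json"]
    if not suggestions:
        cmd.append("--no-suggestions")
    return cmd


def run_estimate(notebook, suggestions=True):
    """Run the CLI and return its parsed JSON output (assumed to be on stdout)."""
    if shutil.which("gencast") is None:
        raise RuntimeError("gencast is not on PATH")
    result = subprocess.run(
        build_estimate_cmd(notebook, suggestions),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Inspect the returned object before relying on any particular field, since the output format is not documented here.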

For the rate table only (used by Claude Code skills via dynamic context injection):

gencast estimate --rates-only --json
gencast estimate --rates-only --provider anthropic --json
gencast estimate --rates-only --all-models --json   # all ~2,700 LiteLLM models

Tests

pytest tests/unit                          # fast, no API calls
pytest tests/component                     # vcrpy cassettes, no keys needed once recorded
GENCAST_TEST_E2E=1 pytest tests/e2e        # real API calls, costs a few cents
GENCAST_TEST_AUDIO=1 pytest tests/audio    # TTS + spatial audio (requires OPENAI_API_KEY)

Specs and design

Claude Code integration

gencast ships with a Claude Code plugin that exposes four skills for conversational use inside Claude Code. The plugin is bundled with the gencast Python package, so no separate package install is needed once you have run pip install "gencast>=1.2.0" (quote the requirement so the shell does not treat > as a redirect).

In Claude Code, install the plugin once:

/plugin install gencast

Then trigger any of the four skills with natural language:

Skill              Example trigger                               What it does
notebook-init      "draft a gencast notebook from these notes"   Builds notebook.yaml conversationally; picks profiles from the bundled catalogue.
source-check       "are these sources good for a podcast?"       Token-counts sources and predicts USD cost via gencast estimate.
review-transcript  "review this gencast transcript"              Reads transcript.json, flags awkward phrasing and flow problems. Advisory only; does not auto-regenerate.
cost-explain       "explain my gencast cost.json"                Plain-language cost-by-stage breakdown with optimisation suggestions.

Skills are workflow recipes that shell out to the gencast CLI. The CLI remains the source of truth and is fully usable without Claude Code or the plugin.

Source for the skills lives at skills/ in the gencast repo — see .claude-plugin/plugin.json for the manifest.

License

MIT.
