# Podcast Transcription CLI Tool

CLI tool to transcribe podcasts and other audio from a URL or local file, using pluggable services. Choose between local Whisper, AWS Transcribe, or Google Cloud Speech‑to‑Text. Export transcripts to text, subtitles, and e‑books.
## Features

- Backends: `--service whisper|aws|gcp` (pluggable architecture).
- Inputs: local files, direct URLs, YouTube (via `yt-dlp`), and podcast RSS feeds (first enclosure).
- Outputs: `--format txt|pdf|epub|mobi|azw|azw3|srt|vtt|json|md`, plus DOCX via the optional `docx` extra.
- Export details:
  - PDF: headers/footers, optional cover page, auto‑TOC from segments, custom fonts and page size.
  - EPUB/Kindle: built‑in themes or custom CSS, multi‑chapter from segments, optional cover.
  - DOCX: simple manuscript export with optional cover page (install the `[docx]` extra).
  - Subtitles: SRT/VTT with timestamps and optional speaker labels.
  - JSON: full transcript + segments + word‑level timings (when available).
- Advanced transcription:
  - Speaker diarization: `--speakers N` for AWS/GCP.
  - Whisper chunking: `--chunk-seconds N` for long audio; `--translate` for English translation.
  - GCP long‑running recognition: `--gcp-longrunning`.
- Batch processing: `--input-file list.txt` to process many items into a directory.
- Caching and robustness: retry/backoff for downloads; `--cache-dir` and `--no-cache` for transcript caching.
- Post‑processing: `--normalize` (whitespace/paragraphs), `--summarize N` (naive summary).
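The RSS input path above uses the feed's first `<enclosure>` URL. A minimal sketch of that lookup (illustrative only, not the package's actual downloader code):

```python
import xml.etree.ElementTree as ET
from typing import Optional

def first_enclosure_url(rss_xml: str) -> Optional[str]:
    """Return the URL of the first <enclosure> in an RSS document, or None."""
    root = ET.fromstring(rss_xml)
    # Enclosures live under <channel>/<item>; document order gives "first".
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url")
        if url:
            return url
    return None
```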
## Requirements

- Python 3.9+
- Core dependency: `requests`
- Optional extras (installed only if you use the feature):
  - Whisper: `openai-whisper`, `ffmpeg`
  - AWS: `boto3` + AWS credentials; env var `AWS_TRANSCRIBE_S3_BUCKET`
  - GCP: `google-cloud-speech` + credentials (`GOOGLE_APPLICATION_CREDENTIALS`)
  - PDF: `fpdf2`
  - EPUB/Kindle: `ebooklib` (and Calibre's `ebook-convert` for Kindle formats)
  - YouTube: `yt-dlp`
  - ID3 cover/title: `mutagen` (optional)
Install from PyPI (core only):

```bash
pip install podcast-transcriber
```

Install with extras (examples):

```bash
# Local Whisper backend (requires ffmpeg on PATH)
pip install "podcast-transcriber[whisper]"

# Export formats (PDF/EPUB/Kindle)
pip install "podcast-transcriber[export]"

# Orchestrator + ingestion + templates
pip install "podcast-transcriber[orchestrator,ingest,templates]"
```
Extras quick reference:

| Feature | Extra | Install command | Notes |
|---|---|---|---|
| Whisper (local) | `whisper` | `pip install -e .[whisper]` | Requires ffmpeg on PATH |
| AWS Transcribe | `aws` | `pip install -e .[aws]` | Needs AWS creds + `AWS_TRANSCRIBE_S3_BUCKET` |
| GCP Speech-to-Text | `gcp` | `pip install -e .[gcp]` | Needs `GOOGLE_APPLICATION_CREDENTIALS` |
| Export formats (PDF/EPUB/Kindle) | `export` | `pip install -e .[export]` | Kindle formats require Calibre `ebook-convert` |
| Developer tools | `dev` | `pip install -e .[dev]` | Includes pytest, etc. |
| Docs | `docs` | `pip install -e .[docs]` | MkDocs + Material |
## Installation

Install from source (editable) for development:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e .[dev]
```

Optional extras examples:

```bash
pip install -e .[whisper]
pip install -e .[aws]
pip install -e .[gcp]
pip install -e .[export]
```
## Formatting & Linting

- Formatter: Ruff (via `make fmt` / `make fmt-check`).
- Linter: Ruff (via `make lint` / `make lint-fix`).
- Optional: a Black config exists for local use, but CI and Make targets use Ruff.
## Docker

Build a minimal image (choose extras via build-arg). By default, the image includes useful runtime extras: `export,templates,ingest,orchestrator,env`. For Whisper (heavy), add `whisper` explicitly.

```bash
# Base features (PDF/EPUB/templates/orchestrator/ingest):
docker build -t podcast-transcriber:latest \
  --build-arg PIP_EXTRAS=export,templates,ingest,orchestrator,env .

# Include Whisper (requires ffmpeg; already installed in the image):
docker build -t podcast-transcriber:whisper \
  --build-arg PIP_EXTRAS=export,templates,ingest,orchestrator,env,whisper .
```

Run the CLI (mount an output directory):

```bash
mkdir -p ./out
docker run --rm \
  -v "$(pwd)/out:/out" \
  podcast-transcriber:latest \
  --url "https://example.com/audio.mp3" \
  --service aws \
  --format txt \
  --output /out/transcript.txt
```

Run the orchestrator (override the entrypoint with `--entrypoint`):

```bash
# config.yml should be in your current directory
docker run --rm \
  --entrypoint podcast-cli \
  -v "$(pwd)/config.yml:/config.yml:ro" \
  -v "$(pwd)/out:/out" \
  -e AWS_TRANSCRIBE_S3_BUCKET="$AWS_TRANSCRIBE_S3_BUCKET" \
  -e GOOGLE_APPLICATION_CREDENTIALS="/secrets/gcp.json" \
  podcast-transcriber:latest \
  run --config /config.yml
```
## Unicode PDF note
- Core PDF fonts (e.g., Helvetica) do not support full Unicode. To render non‑ASCII characters, embed a Unicode font via `--pdf-font-file` (CLI) or `pdf_font_file` (YAML outputs).
- Our Docker images install DejaVu fonts. Recommended path: `/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf`.
- Example (CLI): `--pdf-font-file /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf`
## End-to-End Recipes (Oxford)
This repo includes ready-to-run recipes to exercise the full pipeline via Docker. They fetch a Creative Commons podcast RSS feed, transcribe the latest episodes, and produce multiple output formats.
- Recipes:
- `examples/recipes/oxford_quick.yml`: fastest profile for PR/CI (small Whisper model, `clip_minutes: 1`, all outputs).
- `examples/recipes/oxford_cc.yml`: standard profile (balanced quality).
- `examples/recipes/oxford_premium.yml`: highest quality (slowest).
- Run with Docker (Calibre image recommended to enable Kindle formats):
- Build (optional, the script can build for you):
- `docker build -f Dockerfile.calibre -t podcast-transcriber:calibre .`
- Orchestrator E2E (pick a recipe and limit N episodes):
- `./scripts/e2e_docker.sh -c examples/recipes/oxford_quick.yml -n 2 --fresh-state --dockerfile Dockerfile.calibre --image podcast-transcriber:calibre`
- Artifacts end up in `./out/`.
- What the script does:
- Ingests the feed(s), creates a job id, trims to the latest N episodes.
- Processes via orchestrator (`podcast-cli process`) and writes outputs per `outputs:` block in the YAML.
- Uses a local cache `./.e2e-cache -> /root/.cache` to reuse Whisper model downloads.
- `--fresh-state` deletes only the orchestrator state for deterministic runs; it does not clear the Whisper cache.
### Customizing a recipe
- Feeds: under `feeds:` provide one or more entries. You can use any RSS URL, PodcastIndex id/guid, or categories filter.
- By RSS URL:
- `feeds: [ { name: MyFeed, url: https://example.com/feed.xml } ]`
- By PodcastIndex (with env creds present):
- `feeds: [ { name: ById, podcastindex_feedid: "12345" } ]`
- Category filter (case-insensitive):
- `categories: ["creative commons", "technology"]`
- Quality presets:
- `quality: quick|standard|premium` (affects Whisper model and some defaults).
- Speed tip: `clip_minutes: 1` pre-clips audio before transcribing for faster runs.
- Outputs: choose formats and per-format options in the `outputs:` array.
- Common formats: `epub, pdf, docx, md, txt, json, srt, vtt, mobi, azw3` (Kindle uses Calibre).
- EPUB:
- `epub_css_text:` or `epub_css_file:` to embed CSS.
- PDF:
- `pdf_font_file:` set a Unicode TTF (e.g., `/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf` in Docker).
- `pdf_cover_fullpage: true` for a full-page cover before the transcript.
- `pdf_first_page_cover_only: true` to start text on a new page after the cover.
- DOCX:
- `docx_cover_first: true` to place cover first.
- `docx_cover_width_inches: 6.0` to control cover width.
- Markdown:
- `md_include_cover: true` to place cover image at the top and save the image alongside the `.md` file.
- Cover & metadata:
- Orchestrator tries to fetch the episode’s `itunes:image` as cover. You can override with `cover_image: /path/to/file.jpg`.
- Common metadata can be set at the top-level (e.g., `author`, `language`), and passed into exports.
### Testing with your own RSS feed
- Duplicate a recipe (e.g., copy `examples/recipes/oxford_cc.yml` to `my_feed.yml`).
- Update:
- `feeds: [ { name: MyFeed, url: https://my/podcast.rss } ]`
- Optionally `categories: [...]` to filter entries.
- `quality:` to suit your needs.
- `clip_minutes:` for quicker tests.
- `outputs:` to the list of formats you want to verify.
- Run:
- `./scripts/e2e_docker.sh -c my_feed.yml -n 2 --fresh-state --dockerfile Dockerfile.calibre --image podcast-transcriber:calibre`
### Running without the script
- Direct orchestrator run from Docker (YAML config inside the container):
- `docker run --rm --entrypoint podcast-cli -v "$(pwd)":/workspace -w /workspace podcast-transcriber:calibre ingest --config /workspace/examples/recipes/oxford_cc.yml`
- Then process:
- `docker run --rm --entrypoint podcast-cli -v "$(pwd)":/workspace -w /workspace podcast-transcriber:calibre process --job-id <id>`
- Direct from host (after installing extras):
- `pip install -e .[orchestrator,ingest,templates,export,docx,whisper]`
- `podcast-cli ingest --config examples/recipes/oxford_cc.yml`
- `podcast-cli process --job-id <id> [--clip-minutes N]`
### Notes
- Kindle conversion (MOBI/AZW3) requires Calibre’s `ebook-convert`; use `Dockerfile.calibre` image or install Calibre locally.
- KFX is not included in distro Calibre; AZW3 is the recommended modern Kindle format.
- If you hit state “No new episodes discovered”, pass `--fresh-state` to the script (or remove state at `$PODCAST_STATE_DIR`).
## Docker notes

- Provide cloud credentials via environment variables (`AWS_*`, `GOOGLE_APPLICATION_CREDENTIALS`, SMTP vars) or mount secrets files.
- Whisper adds significant image size; only include it if needed.
- Kindle conversions (azw/azw3/kfx) require Calibre's `ebook-convert`, which is not installed in the default image (use `Dockerfile.calibre`).
## Docker Compose

Use `compose.yaml` to build and run the image locally.

```bash
# Build (choose extras via PIP_EXTRAS; add ",whisper" if needed)
PIP_EXTRAS=export,templates,ingest,orchestrator,env docker compose build

# Prepare config and output
cp examples/config.example.yml ./config.yml  # or your own config
mkdir -p out secrets

# Optional: put GCP creds in ./secrets/gcp.json and export email/cloud envs
export AWS_TRANSCRIBE_S3_BUCKET=... \
  KINDLE_TO_EMAIL=... \
  KINDLE_FROM_EMAIL=... \
  SMTP_HOST=... SMTP_PORT=587 SMTP_USER=... SMTP_PASS=...

# Run orchestrator pipeline
docker compose up orchestrator

# See output in ./out
```

Compose services:

- `transcriber`: the `podcast-transcriber` CLI (default: `--help`).
- `orchestrator`: `podcast-cli run --config /config/config.yml` with volumes mounted for `/config`, `/out`, and `/secrets`.
## Environment (.env)

Copy the example file and fill in values as needed:

```bash
cp .env.example .env
# edit .env and set SMTP_*, KINDLE_*, and optional PodcastIndex/API keys
```

The orchestrator automatically loads `.env` if `python-dotenv` is installed (`pip install -e .[env]`). Never commit a real `.env`; the repo ignores `.env` by default.
## Quickstart

Run via the Bash wrapper from source (no package install of this project required):

Note: You still need the Python dependencies available in your environment. At minimum, core runs require `requests`. For Whisper/AWS/GCP backends or exports, install the corresponding extras. See Installation above.

```bash
./Transcribe_podcast_to_text.sh --url "https://example.com/audio.mp3" --service whisper --output out.txt
```

Run via the Python module or console entrypoint (requires installing the package and its deps):

```bash
python -m podcast_transcriber --url <URL|path> --service <whisper|aws|gcp> --output out.txt

# after install
podcast-transcriber --url <URL|path> --service <whisper|aws|gcp> --output out.txt
```
## Orchestrator CLI (beta)

High‑level pipeline for "ingest → process → send to Kindle" and weekly digests.

Install extras: `pip install -e .[orchestrator,ingest,templates]` (and optionally `[scheduler,nlp]`).

Subcommands:

- `podcast-cli ingest --config config.yml` — Discover new episodes and create a job.
- `podcast-cli process --job-id <id>` — Transcribe and build an EPUB for a job.
  - Ad‑hoc semantic segmentation: add `--semantic` to this command to override YAML.
  - Speed up test runs: add `--clip-minutes N` to limit transcription to the first N minutes (pre-clips audio).
- `podcast-cli send --job-id <id>` — Email EPUBs to your Kindle address.
- `podcast-cli run --config config.yml` — Run ingest → process → send in one go.
- `podcast-cli digest --feed <name> --weekly` — Build a weekly digest EPUB.
Config (YAML) example:

```yaml
feeds:
  - name: myfeed
    url: https://example.com/podcast.rss
  - name: altfeed-by-id
    podcastindex_feedid: "123456"
  - name: altfeed-by-guid
    podcast_guid: "urn:uuid:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"

service: whisper
quality: standard   # quick|standard|premium
language: sv-SE
author: Your Name
output_dir: ./out
clip_minutes: 1     # optional: clip audio to N minutes before transcribing (faster E2E)

kindle:
  to_email: your_name@kindle.com
  from_email: sender@example.com

smtp:
  host: smtp.example.com
  port: 587
  user: smtp-user
  # password set via env only, e.g. SMTP_PASS

# NLP options (optional)
nlp:
  semantic: true    # enable semantic topic segmentation (requires [nlp] extra)
  takeaways: true   # add a simple "Key takeaways" section

# Markdown output (optional)
emit_markdown: true
markdown_template: ./path/to/ebook.md.j2  # omit to use the built-in template
```
Templating and themes:
- The built-in template defines blocks you can override: `front_matter`, `title_page`, `preface`, `content`, and `appendix`.
- Create your own Jinja2 theme that `{% extends %}` the base template at `src/podcast_transcriber/templates/ebook.md.j2`.
- An example template is provided at `examples/templates/ebook_theme_minimal.md.j2`.
Topics and takeaways in Markdown:
- When NLP is enabled (`nlp.semantic: true` and/or `podcast-cli process --semantic`), the Markdown includes a "Topics" section listing chapter titles derived from segmentation.
- When `nlp.takeaways: true`, the Markdown also includes a "Key Takeaways" section with 3–5 concise bullets. If spaCy is installed, noun chunks are used; otherwise a heuristic is applied.
Secrets policy: Store SMTP password and API keys in environment variables (e.g. SMTP_PASS, cloud provider keys). Ensure your Kindle address whitelists your sender.
Scheduling (optional):

- Install: `pip install -e .[scheduler,orchestrator,ingest]`
- Run once: `podcast-auto-run --config config.yml --once`
- Run hourly/daily: `podcast-auto-run --config config.yml --interval hourly|daily`
Topic segmentation (optional):

- Install: `pip install -e .[nlp]`
- The CLI uses a simple fallback if embeddings are unavailable; with embeddings, segments are formed at semantic-similarity dips and "key takeaways" are extracted heuristically.
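As a rough illustration of the "similarity dips" idea (a hypothetical sketch, not the package's implementation): given one embedding vector per sentence, a new segment starts wherever adjacent-sentence cosine similarity falls below a threshold.

```python
import math
from typing import List, Sequence

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def segment_by_dips(embeddings: List[Sequence[float]],
                    threshold: float = 0.5) -> List[List[int]]:
    """Group consecutive sentence indices into segments; start a new
    segment wherever adjacent-sentence similarity dips below `threshold`."""
    if not embeddings:
        return []
    segments = [[0]]
    for i in range(1, len(embeddings)):
        if cosine(embeddings[i - 1], embeddings[i]) < threshold:
            segments.append([i])   # similarity dip: topic boundary
        else:
            segments[-1].append(i)
    return segments
```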
Bilingual EPUB (premium idea):

- Set `bilingual: true` in config to attempt "Original" + "Translated" sections when using Whisper (translation is toggled internally). If translation fails, it falls back to the original only.
## CLI Overview

### Quality presets

- `quick`: Uses a small Whisper model for fastest runs; ideal for CI smoke tests.
- `standard`: Default balance of speed/quality; enables simple summarization and 10‑minute chapters.
- `premium`: Largest Whisper model and richer processing (e.g., optional diarization/topic segmentation) for highest quality.
### Usage

- Orchestrator YAML: set `quality: quick|standard|premium`. For fast iterations, also add `clip_minutes: N` to limit transcription length.
- Orchestrator CLI: `podcast-cli process --job-id ... --clip-minutes N` overrides the YAML once.
- CI: use `examples/recipes/oxford_quick.yml` (fast); locally, use `examples/recipes/oxford_cc.yml` (standard) or `examples/recipes/oxford_premium.yml`.
### Required

- `--url`: URL, local file, YouTube link, or RSS feed.
- `--service`: `whisper`, `aws`, or `gcp`.
### Input and batch

- `--input-file list.txt`: Process many items (one per line). Requires `--output` to be a directory.
- `--config config.toml`: Provide defaults (e.g., `language`, `format`, `title`). If omitted, a config is auto-discovered at `~/.config/podcast-transcriber/config.toml` (or `$XDG_CONFIG_HOME/podcast-transcriber/config.toml`).
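The auto-discovery rule above can be sketched as follows (a hypothetical helper; the package's actual lookup may differ in details):

```python
import os
from pathlib import Path
from typing import Optional

def discover_config() -> Optional[Path]:
    """Mirror the documented lookup: $XDG_CONFIG_HOME/podcast-transcriber/config.toml,
    falling back to ~/.config/podcast-transcriber/config.toml."""
    base = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    candidate = Path(base) / "podcast-transcriber" / "config.toml"
    return candidate if candidate.is_file() else None
```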
### Output and formats

- `--output`: Output path (or directory for batch); defaults to stdout for `txt`.
- `--format`: `txt`, `pdf`, `epub`, `mobi`, `azw`, `azw3`, `srt`, `vtt`, `json`, `md`.
- `--title`, `--author`: Document metadata.

### Interactive mode

- `--interactive`: Guided prompts for `--url`, `--service`, `--format`, `--output`, and optional `--language`. Great for first-time users.
### Whisper options

- `--whisper-model base|small|medium|large`
- `--chunk-seconds N`: Split long audio into chunks.
- `--translate`: Whisper translate task (to English).
- `--language`: Hint language code (e.g., `sv`, `en-US`).
- Note: BCP‑47 tags like `en-US` are normalized to primary codes (e.g., `en`). If a provided code is unsupported by Whisper, the service falls back to auto‑detect.
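For intuition, `--chunk-seconds N` style chunking amounts to slicing the audio timeline into consecutive spans before transcribing each one. A sketch (illustrative only, not the package's code):

```python
from typing import List, Tuple

def chunk_spans(duration_s: float, chunk_seconds: int) -> List[Tuple[float, float]]:
    """Split a total duration into consecutive (start, end) spans of at most
    `chunk_seconds` each; the last span holds the remainder."""
    spans = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_seconds, duration_s)
        spans.append((start, end))
        start = end
    return spans
```

Each span's transcript can then be shifted by its start offset when stitching segments back together.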
### AWS options

- `--aws-bucket`, `--aws-region`
- `--auto-language` and `--aws-language-options sv-SE,en-US`
- `--speakers N`: Enable speaker labels.
- `--aws-keep`: Keep the uploaded S3 object after the job completes.

### GCP options

- `--gcp-alt-languages en-US,nb-NO`
- `--speakers N`: Enable diarization.
- `--gcp-longrunning`: Use long-running recognition for long audio.
### PDF/EPUB options

- PDF: `--pdf-page-size A4|Letter`, `--pdf-orientation portrait|landscape`, `--pdf-margin <mm>`, `--pdf-font Arial`, `--pdf-font-size 12`, `--pdf-font-file path.ttf`, `--pdf-cover-fullpage`, `--pdf-first-page-cover-only`.
  - Unicode: Set `--pdf-font-file` to a Unicode TTF/OTF (e.g., `/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf` in our Docker images) for full character coverage.
  - Cover: `--pdf-cover-fullpage` for a full-page cover; `--pdf-first-page-cover-only` to start the transcript on the next page.
- EPUB/Kindle: `--epub-css-file style.css`, `--epub-theme minimal|reader|classic|dark` or `custom:/path.css`, `--cover-image cover.jpg`, `--auto-toc` (creates a simple TOC from segments; PDF also adds a header/footer based on title/author).
### DOCX/Markdown options (via orchestrator outputs)

- DOCX: `docx_cover_first: true` (place cover first), `docx_cover_width_inches: 6.0` (control cover width).
- Markdown: `md_include_cover: true` places the cover at the top and saves the image next to the `.md` file.
### Caching and logging

- `--cache-dir /path/to/cache`, `--no-cache`
- `--verbose`, `--quiet`
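A transcript cache typically keys entries on the input plus the options that affect the result, so a re-run with identical settings can skip transcription. One plausible scheme (an assumption for illustration; the real key derivation may differ):

```python
import hashlib
import json

def cache_key(url: str, service: str, options: dict) -> str:
    """Derive a stable cache filename from the input URL, backend,
    and result-affecting options."""
    # Canonical JSON (sorted keys) makes the key order-independent.
    payload = json.dumps({"url": url, "service": service, "options": options},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```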
### Post‑processing

- `--normalize`: Normalize whitespace/paragraphs.
- `--summarize N`: Naive summary (first N sentences).
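The `--summarize N` behavior described above ("first N sentences") can be sketched like this (the exact sentence-splitting rule is an assumption, not the package's):

```python
import re

def naive_summary(text: str, n: int) -> str:
    """Return the first n sentences, splitting on whitespace that
    follows sentence-ending punctuation."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:n])
```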
## Examples

Local Whisper to TXT:

```bash
./Transcribe_podcast_to_text.sh \
  --url "https://example.com/podcast.mp3" \
  --service whisper \
  --output transcript.txt
```

AWS with language auto‑detect restricted to Swedish or English (US):

```bash
export AWS_TRANSCRIBE_S3_BUCKET=my-bucket
./Transcribe_podcast_to_text.sh \
  --url "./examples/tone.wav" \
  --service aws \
  --auto-language \
  --aws-language-options sv-SE,en-US \
  --aws-region eu-north-1 \
  --output transcript.txt
```

GCP with alternative languages:

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/creds.json
./Transcribe_podcast_to_text.sh \
  --url "./examples/tone.wav" \
  --service gcp \
  --language sv-SE \
  --gcp-alt-languages en-US,nb-NO \
  --output transcript.txt
```

SRT/VTT with speaker labels (AWS):

```bash
./Transcribe_podcast_to_text.sh \
  --url ./episode.wav \
  --service aws \
  --speakers 2 \
  --format srt \
  --output episode.srt
```

Whisper chunked VTT for a long file:

```bash
./Transcribe_podcast_to_text.sh \
  --url ./long.mp3 \
  --service whisper \
  --chunk-seconds 600 \
  --format vtt \
  --output long.vtt
```

EPUB with theme and auto TOC:

```bash
./Transcribe_podcast_to_text.sh \
  --url ./episode.mp3 \
  --service whisper \
  --format epub \
  --epub-theme reader \
  --auto-toc \
  --output episode.epub
```

Batch processing to a directory:

```bash
cat > list.txt <<EOF
https://example.com/ep1.mp3
https://example.com/ep2.mp3
EOF

./Transcribe_podcast_to_text.sh \
  --service whisper \
  --input-file list.txt \
  --format md \
  --output ./out_dir
```

KDP pipeline (EPUB) for a single episode:

```bash
./Transcribe_podcast_to_text.sh \
  --url ./episode.mp3 \
  --service whisper \
  --kdp \
  --title "Podcast: Season 1 – Episode 1" \
  --author "Your Name" \
  --description "A transcribed version of the episode..." \
  --keywords "podcast, swedish, tech" \
  --cover-image ./cover.jpg \
  --output ./episode.epub
```

KDP book from multiple episodes (combine into one EPUB):

```bash
cat > episodes.txt <<EOF
https://example.com/ep1.mp3
https://example.com/ep2.mp3
EOF

./Transcribe_podcast_to_text.sh \
  --service whisper \
  --input-file episodes.txt \
  --combine-into ./podcast-book.epub \
  --kdp \
  --title "My Podcast – Volume 1" \
  --author "Your Name" \
  --description "Transcriptions of the best episodes of the season" \
  --keywords "podcast, swedish, society"
```

DOCX manuscript (requires the `docx` extra):

```bash
pip install -e .[docx]
./Transcribe_podcast_to_text.sh \
  --url ./episode.mp3 \
  --service whisper \
  --format docx \
  --title "Episode 1" \
  --author "Your Name" \
  --output ./episode.docx
```
## Notes and Tips

- Kindle formats (`mobi|azw|azw3|kfx`) require Calibre's `ebook-convert` on PATH.
- YouTube extraction requires `yt-dlp`; otherwise an HTTP fallback is used.
- ID3 metadata (title/cover) is read when `mutagen` is installed; RSS feeds use the first `<enclosure>` URL.
- AWS/GCP calls are not made during tests; unit tests mock external services.
## Example Plugin + Smoke Test

- Example plugin: `examples/plugin_echo/` registers an `echo` service via entry points. Install with `pip install -e examples/plugin_echo` and use `--service echo`.
- Smoke test script: `scripts/smoke.sh` automates a basic run, including plugin discovery and JSON export. Make it executable and run:

```bash
chmod +x scripts/smoke.sh
./scripts/smoke.sh
```
## JSON Export Schema

When using `--format json`, the file includes additional metadata when available from the downloader (ID3, yt-dlp, etc.). Keys:

- `title`: Document title.
- `author`: Document author (if provided).
- `text`: Full transcript.
- `segments`: List of coalesced segments with `start`, `end`, `text`, and optional `speaker`.
- `words`: Optional word-level timings when the backend provides them.
- `source`: Optional object with downloader metadata, for example:
  - `source_url`: Original URL.
  - `local_path`: Local file path used for transcription.
  - `id3_title`, `id3_artist`: From ID3 tags if present.
  - `source_title`: From yt-dlp (e.g., video title).
  - `source_uploader`: From yt-dlp (e.g., channel/uploader).
  - `cover_url`: Thumbnail URL when available.
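As a consumer-side sketch, the documented `segments` list is enough to render SRT cues yourself (hypothetical helper functions, shown only to illustrate the schema):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(transcript: dict) -> str:
    """Render the documented `segments` list as numbered SRT cues,
    prefixing the optional `speaker` label when present."""
    cues = []
    for i, seg in enumerate(transcript.get("segments", []), start=1):
        speaker = f"{seg['speaker']}: " if seg.get("speaker") else ""
        cues.append(
            f"{i}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{speaker}{seg['text']}\n"
        )
    return "\n".join(cues)
```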
## Plugins: Add Your Own Service

You can ship third-party services as plugins via Python entry points. Register the entry-point group `podcast_transcriber.services` in your package and expose either a subclass of `TranscriptionService` or a zero-argument factory that returns one.

`pyproject.toml` (in your plugin):

```toml
[project.entry-points."podcast_transcriber.services"]
myservice = "my_package.my_module:MyService"
```

Your service must implement the `TranscriptionService` interface (see `src/podcast_transcriber/services/base.py`). Once installed, it appears in the `--service` choices and in the `--interactive` selection.
Troubleshooting: see `docs/troubleshooting.md` for common issues and fixes.

## Quick Troubleshooting

- ffmpeg (Whisper): install via Homebrew (`brew install ffmpeg`) or apt (`sudo apt-get install -y ffmpeg`).
- ebook-convert (Kindle): install Calibre and ensure `ebook-convert` is on PATH (macOS: `brew install --cask calibre`).
- yt-dlp (YouTube): `pipx install yt-dlp` or `pip install yt-dlp`, and ensure it's on PATH.
- mutagen (ID3 title/cover): `pip install mutagen` to auto‑read MP3 metadata.
- Credentials (AWS/GCP): `pip install boto3` and set `AWS_TRANSCRIBE_S3_BUCKET`; `pip install google-cloud-speech` and set `GOOGLE_APPLICATION_CREDENTIALS`.
## Development

Run tests with `pytest` (external calls are mocked):

```bash
pytest -q
```

Layout:

- `src/podcast_transcriber/` – core logic and services
- `tests/` – unit tests with mocks
- `docs/` – MkDocs documentation
- `examples/` – `generate_tone.py` creates a tiny WAV demo

## CI/CD

- GitHub Actions runs `pytest` on push/PR (matrix across Python versions and optional extras).
- MkDocs builds and publishes docs to GitHub Pages (see `.github/workflows/docs.yml`).
- Test coverage: CI is currently green at 85% (local ~87%). Generate locally with `make coverage` (XML + terminal) or `make coverage-html` (HTML in `htmlcov/`). CI enforces a minimum via `--cov-fail-under`.
## Author

Developed by Johan Caripson.

## License

MIT (see LICENSE)
## Utility

- `--credits`: Print maintainer credits and exit.