Podcast Transcript

A simple command-line tool to generate transcripts for podcast episodes or other audio files containing speech.

Table of Contents

Features
Prerequisites
Installation
Configuration
Usage
Output Formats
Development
- Running Tests
- Code Style and Linting
License
Author

Features

Download and process podcast episodes or other audio content from a given URL or file path.
Automatically resamples audio to 16 kHz mono, since Groq resamples to this format anyway.
Splits large audio files into manageable chunks.
Transcribes audio locally using whisper-cpp.
Optionally transcribes audio locally using mlx-whisper.
Optionally transcribes audio using the Groq API.
Optionally transcribes audio through a remote Voxhelm service.
Outputs transcripts in multiple formats:
- DOTe JSON
- Podlove JSON
- WebVTT (subtitle format)
- Plaintext

Prerequisites

Python >=3.10
- MLX backend (via mlx-whisper) requires macOS on Apple Silicon and a Python version supported by the mlx-whisper / torch wheels.
ffmpeg installed and available in your system’s PATH.
A Groq API key, only if you use the Groq backend.
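The two hard requirements above can be checked up front. The following is a minimal sketch, not part of the package; the function name missing_prerequisites is hypothetical, and it only covers the Python version and the ffmpeg lookup on PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # taken from the prerequisites list


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python >= 3.10 is required")
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

The version and lookup function are parameters only so the check can be exercised without changing the real environment.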

Installation

Install the package:

pip install podcast-transcript  # or: pipx install podcast-transcript / uv tool install podcast-transcript

(Optional) Install MLX backend dependencies (macOS on Apple Silicon only):

pip install "podcast-transcript[mlx]"
# or
uv pip install "podcast-transcript[mlx]"

To run without installing into your environment:

uvx --from "podcast-transcript[mlx]" transcribe --backend mlx <mp3_url>

Example (pin Python + package version):

time TRANSCRIPT_LANGUAGE=de uvx --python python3.14.2 --from "podcast-transcript[mlx]==0.1.5" transcribe --backend mlx https://d2mmy4gxasde9x.cloudfront.net/cast_audio/pp_67.mp3

Configuration

Setting the Groq API Key

The Groq backend requires a Groq API key. You can set the key in one of the following ways:

Environment Variable:

Set the GROQ_API_KEY environment variable in your shell:

export GROQ_API_KEY=your_api_key_here
# or
GROQ_API_KEY=your_api_key_here transcribe --backend=groq ...

.env File:

Create a .env file in the transcript home directory (default is ~/.podcast-transcripts/) and add the following line:

GROQ_API_KEY=your_api_key_here
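To illustrate how the two options could interact, here is a small sketch of a key lookup. It is not the package's actual loader: the helper names are hypothetical, and the precedence (shell environment first, then the .env file) is an assumption, since the text above does not state which source wins.

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_groq_api_key(environ, dotenv_text=""):
    """Assumed precedence: the environment wins over the .env file."""
    return environ.get("GROQ_API_KEY") or parse_dotenv(dotenv_text).get("GROQ_API_KEY")


if __name__ == "__main__":
    env_file = Path("~/.podcast-transcripts/.env").expanduser()
    text = env_file.read_text(encoding="utf-8") if env_file.is_file() else ""
    print("key found" if find_groq_api_key(os.environ, text) else "no key configured")
```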

Configuring the Voxhelm Backend

Set these variables when using --backend voxhelm:

export VOXHELM_API_BASE=https://voxhelm.home.xn--wersdrfer-47a.de
export VOXHELM_API_KEY=your_voxhelm_token_here

VOXHELM_API_BASE can point at either the service root or the /v1 API prefix. The backend uses gpt-4o-mini-transcribe by default, or TRANSCRIPT_MODEL_NAME if you override it.
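Accepting both forms of VOXHELM_API_BASE implies some normalization step. A minimal sketch of one way to do it follows; the function name and the example host voxhelm.example are hypothetical, and the real backend may handle this differently.

```python
def voxhelm_api_root(base):
    """Return the base URL with exactly one trailing /v1 prefix."""
    base = base.rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base


if __name__ == "__main__":
    # Both spellings resolve to the same API root.
    print(voxhelm_api_root("https://voxhelm.example"))
    print(voxhelm_api_root("https://voxhelm.example/v1/"))
```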

Transcript Home

By default, the transcripts home directory is ~/.podcast-transcripts/. You can change this by setting the TRANSCRIPT_HOME environment variable:

export TRANSCRIPT_HOME=/path/to/your/transcripts_home

The transcript home directory is where you can store your .env file. The model files for the whisper-cpp backend are also stored there, in a subdirectory called whisper-cpp-models. The transcripts themselves are stored in a subdirectory called transcripts unless you specify a different directory.

Transcripts Directory

By default, transcripts are stored in ~/.podcast-transcripts/transcripts/. You can change this by setting the TRANSCRIPT_DIR environment variable:

export TRANSCRIPT_DIR=/path/to/your/transcripts
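The directory layout described in the last two sections can be summarized in a few lines. This is a sketch under the assumption that an unset TRANSCRIPT_DIR follows TRANSCRIPT_HOME; the function name transcript_paths is hypothetical and not part of the package.

```python
import os
from pathlib import Path


def transcript_paths(environ=os.environ):
    """Resolve the home directory and the locations derived from it."""
    home = Path(environ.get("TRANSCRIPT_HOME", "~/.podcast-transcripts")).expanduser()
    # TRANSCRIPT_DIR overrides only the transcripts location, not the home.
    transcripts = Path(environ.get("TRANSCRIPT_DIR", home / "transcripts")).expanduser()
    return {
        "home": home,
        "env_file": home / ".env",
        "whisper_cpp_models": home / "whisper-cpp-models",
        "transcripts": transcripts,
    }


if __name__ == "__main__":
    for name, path in transcript_paths().items():
        print(f"{name}: {path}")
```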

Other Configuration Options

You can also set the following environment variables or specify them in the .env file:

TRANSCRIPT_MODEL_NAME: The name of the model to use for the transcription (default is "ggml-large-v3.bin" for whisper-cpp, "whisper-large-v3" for Groq, "mlx-community/whisper-large-v3-mlx" for MLX, and "gpt-4o-mini-transcribe" for Voxhelm).
TRANSCRIPT_PROMPT: The prompt to use for the transcription (default is "podcast-transcript").
TRANSCRIPT_LANGUAGE: The language code for the transcription (default is en; set it to de for German, for example).
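The defaults above amount to a per-backend lookup with environment overrides. The sketch below shows that idea only; the function name is hypothetical, and the backend key "whisper-cpp" is an assumption (groq, mlx, and voxhelm are the values shown for --backend elsewhere in this document).

```python
import os

# Default model per backend, as listed in the configuration options.
DEFAULT_MODELS = {
    "whisper-cpp": "ggml-large-v3.bin",  # key name is an assumption
    "groq": "whisper-large-v3",
    "mlx": "mlx-community/whisper-large-v3-mlx",
    "voxhelm": "gpt-4o-mini-transcribe",
}


def transcription_settings(backend, environ=os.environ):
    """Combine the documented defaults with TRANSCRIPT_* overrides."""
    return {
        "model": environ.get("TRANSCRIPT_MODEL_NAME", DEFAULT_MODELS[backend]),
        "prompt": environ.get("TRANSCRIPT_PROMPT", "podcast-transcript"),
        "language": environ.get("TRANSCRIPT_LANGUAGE", "en"),
    }


if __name__ == "__main__":
    print(transcription_settings("groq"))
```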

Usage

To transcribe a podcast episode, run the transcribe command followed by the URL or local file path of the audio file:

transcribe <mp3_url>

Example:

transcribe https://d2mmy4gxasde9x.cloudfront.net/cast_audio/pp_53.mp3

Or if you want to use the Groq API:

transcribe --backend=groq https://d2mmy4gxasde9x.cloudfront.net/cast_audio/pp_53.mp3

Or if you want to use the MLX backend (requires the mlx extra; macOS on Apple Silicon only):

transcribe --backend=mlx https://d2mmy4gxasde9x.cloudfront.net/cast_audio/pp_53.mp3

Or if you want to use a remote Voxhelm service:

VOXHELM_API_BASE=https://voxhelm.home.xn--wersdrfer-47a.de \
VOXHELM_API_KEY=your_voxhelm_token_here \
transcribe --backend=voxhelm https://d2mmy4gxasde9x.cloudfront.net/cast_audio/pp_53.mp3

Detailed Steps

The transcription process involves the following steps:

Download the audio file from the provided URL or copy it from the file path if one was given.
Convert the audio to MP3 and resample it to 16 kHz mono for optimal transcription.
Split the audio into chunks if it exceeds the size limit (25 MB).
Transcribe each audio chunk using either whisper-cpp (converts mp3 to wav first), mlx-whisper, the Groq API, or Voxhelm.
Combine the transcribed chunks into a single transcript.
Generate output files in DOTe JSON, Podlove JSON, WebVTT, and plaintext formats.
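The resampling and splitting steps can be sketched as follows. This is an illustration rather than the tool's own code: the function names are hypothetical, the ffmpeg options (-ar for sample rate, -ac for channel count) are standard but the exact command the tool runs is not documented here, and reading "25 MB" as 25 * 1024 * 1024 bytes is an assumption.

```python
import math

CHUNK_LIMIT_BYTES = 25 * 1024 * 1024  # "25 MB"; binary megabytes are an assumption


def resample_command(source, target):
    """Build ffmpeg arguments for a 16 kHz mono copy of the input."""
    return ["ffmpeg", "-y", "-i", str(source), "-ar", "16000", "-ac", "1", str(target)]


def chunk_count(size_bytes, limit=CHUNK_LIMIT_BYTES):
    """Number of chunks needed so that none exceeds the size limit."""
    return max(1, math.ceil(size_bytes / limit))


if __name__ == "__main__":
    print(" ".join(resample_command("episode.m4a", "episode.mp3")))
    print(chunk_count(60 * 1024 * 1024))
```

The command is returned as an argument list so it can be passed to subprocess.run without shell quoting.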

The output files are saved in a directory named after the episode, within the transcript directory.
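How the episode name is derived is not spelled out above. One plausible reading, shown here purely as an assumption, is that it is the file name of the URL or path without its extension; the function name episode_dir is hypothetical.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def episode_dir(source, transcripts_dir):
    """Directory for one episode, named after the file stem of the source."""
    # urlparse also handles plain POSIX paths: the whole string becomes .path.
    name = PurePosixPath(urlparse(source).path).stem
    return Path(transcripts_dir) / name


if __name__ == "__main__":
    url = "https://d2mmy4gxasde9x.cloudfront.net/cast_audio/pp_53.mp3"
    print(episode_dir(url, Path("~/.podcast-transcripts/transcripts").expanduser()))
```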

Output Formats

DOTe JSON (*.dote.json): A JSON format suitable for further processing or integration with other tools.
Podlove JSON (*.podlove.json): A JSON format compatible with Podlove transcripts.
WebVTT (*.vtt): A subtitle format that can be used for captioning in media players.
Plaintext: Just the plain text of the transcription.
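To make the WebVTT and plaintext outputs concrete, here is a minimal sketch that renders timed segments in both forms. The segment shape (start seconds, end seconds, text) and the function names are assumptions for illustration; the tool's internal data model is not described here.

```python
def vtt_timestamp(seconds):
    """Format seconds as HH:MM:SS.mmm, the WebVTT cue timestamp layout."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_webvtt(segments):
    """Render (start, end, text) tuples as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


def to_plaintext(segments):
    """Join the segment texts into one plain string."""
    return " ".join(text.strip() for _, _, text in segments)


if __name__ == "__main__":
    demo = [(0, 1.25, " Hello "), (1.25, 3.0, "and welcome.")]
    print(to_webvtt(demo))
    print(to_plaintext(demo))
```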

Roadmap

Support for multitrack transcripts with speaker identification.
Add support for other transcription backends (e.g., OpenAI, Speechmatics, local Whisper via PyTorch).
Add support for other audio formats (e.g., AAC, WAV, FLAC).
Add more output formats (e.g., SRT, TTML).

Development

Install Development Version

Clone the repository:

git clone https://github.com/yourusername/podcast-transcript.git
cd podcast-transcript

Sync the project environment:

uv sync

This creates .venv and installs the project plus dev dependencies.

To ensure you’re using the repo’s pinned Python version, you can also run:

UV_PYTHON=python"$(cat .python-version)" uv sync

Common developer commands:

just lint
just typecheck
just test
just bead        # shows ready beads (issues)

Running Tests

The project uses pytest for testing. To run tests:

just test
# or: pytest

Show coverage:

coverage run -m pytest && coverage html && open htmlcov/index.html

Code Style and Linting

Install pre-commit hooks to ensure code consistency:

pre-commit install

Check the type hints:

just typecheck
# or: mypy src/

Run lint/format:

just lint

Publish a Release

Build the distribution package:

uv build

Publish the package to PyPI:

uv publish --token your_pypi_token

License

This project is licensed under the MIT License.

Author

Jochen Wersdörfer

Release history

0.1.7 (Apr 30, 2026, this version)
0.1.6 (Feb 26, 2026)
0.1.5 (Dec 18, 2025)
0.1.4 (Jan 12, 2025)
0.1.3 (Jan 12, 2025)
0.1.2 (Jan 11, 2025)
0.1.1 (Nov 22, 2024)
0.1.0 (Nov 17, 2024)