CLI tool to extract Coursera course transcripts/subtitles
Coursera Transcript Generator
A beautiful CLI tool to bulk-download transcripts and subtitles from any Coursera course you're enrolled in.
Features
- Interactive prompts: guided step-by-step experience, no need to memorize flags
- Bulk download: grabs every lecture transcript in a course at once
- Organized output: files are neatly sorted into module folders
- Progress tracking: real-time progress bar with download status
- Retry logic: automatic retries with exponential backoff on failures
- Multiple formats: supports both .txt (plain text) and .srt (subtitle) formats
- Multi-language: download transcripts in any available language
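For readers curious what "retries with exponential backoff" means in practice, here is a minimal sketch of the general technique. It is not taken from this tool's source: the function name `retry_with_backoff`, the attempt count, and the base delay are all assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,      # assumed value, not the tool's setting
    base_delay: float = 1.0,    # assumed value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts - 1:
                raise
            # The wait doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the waiting behavior swappable, which makes the helper easy to exercise without real delays.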
Installation
# Clone the repo
git clone https://github.com/your-username/coursera-transcript-generator.git
cd coursera-transcript-generator
# Install in editable mode
pip install -e .
Usage
Interactive Mode (recommended)
Run the command with no arguments and it will guide you through everything:
coursera-transcripts
You'll be prompted for:
- CAUTH cookie: your Coursera authentication token
- Course slug: the identifier from the course URL
- Options: language, format, and output directory
CLI Mode
Pass everything as flags for scripting / automation:
coursera-transcripts \
--cookie "YOUR_CAUTH_VALUE" \
--slug "machine-learning" \
--language en \
--format txt \
--output ./transcripts
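If you want to drive the command above from a script, for example to fetch several courses in a row, something like the following sketch may help. It only uses the flags shown above; the helper names `build_command` and `download_courses` are hypothetical, and it assumes `coursera-transcripts` is on your `PATH`.

```python
import subprocess


def build_command(cookie: str, slug: str, language: str = "en",
                  fmt: str = "txt", output: str = "./output") -> list[str]:
    """Assemble the argument list for one coursera-transcripts run."""
    return [
        "coursera-transcripts",
        "--cookie", cookie,
        "--slug", slug,
        "--language", language,
        "--format", fmt,
        "--output", output,
    ]


def download_courses(cookie: str, slugs: list[str]) -> None:
    """Run the CLI once per course slug, stopping at the first failure."""
    for slug in slugs:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_command(cookie, slug), check=True)
```

Passing the arguments as a list rather than one shell string avoids quoting problems if the cookie value contains special characters.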
All Options
| Flag | Short | Default | Description |
|---|---|---|---|
| --cookie | -c | (prompted) | CAUTH cookie value |
| --slug | -s | (prompted) | Course slug from URL |
| --language | -l | en | Subtitle language code |
| --format | | txt | Output format (txt or srt) |
| --output | -o | ./output | Parent output directory |
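To make the defaults in the table concrete, here is a rough sketch of an equivalent parser written with the standard library's `argparse`. This is an illustration only: the tool may well be built on a different CLI library, and `build_parser` is a hypothetical name. No short flag is given for `--format` because the table does not list one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented flags and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(prog="coursera-transcripts")
    # None stands in for "(prompted)": the real tool asks interactively.
    parser.add_argument("-c", "--cookie", default=None,
                        help="CAUTH cookie value")
    parser.add_argument("-s", "--slug", default=None,
                        help="Course slug from URL")
    parser.add_argument("-l", "--language", default="en",
                        help="Subtitle language code")
    parser.add_argument("--format", choices=["txt", "srt"], default="txt",
                        help="Output format (txt or srt)")
    parser.add_argument("-o", "--output", default="./output",
                        help="Parent output directory")
    return parser
```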
Getting Your CAUTH Cookie
- Open coursera.org and log in
- Open DevTools (F12 or Ctrl+Shift+I)
- Go to Application > Cookies > https://www.coursera.org
- Find the cookie named CAUTH
- Copy its Value
[!IMPORTANT] You must be enrolled in the course to download its transcripts.
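As background on why the cookie is needed, the sketch below shows how a CAUTH value can be attached to an HTTP request as a `Cookie` header using only the standard library. Which endpoints the tool actually calls is not documented here, so the URL is just a placeholder and `authenticated_request` is a hypothetical helper; no request is sent.

```python
import urllib.request


def authenticated_request(url: str, cauth: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the CAUTH cookie."""
    # The cookie travels in a standard Cookie header as name=value.
    return urllib.request.Request(url, headers={"Cookie": f"CAUTH={cauth}"})
```

Treat the cookie value like a password: anyone holding it can act as your logged-in session, so avoid committing it to scripts or sharing it in logs.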
Output Structure
Transcripts are organized by module:
output/
└── machine-learning/
    ├── introduction-to-ml/
    │   ├── Welcome to Machine Learning.txt
    │   ├── What is Machine Learning.txt
    │   └── Supervised Learning.txt
    ├── linear-regression/
    │   ├── Model Representation.txt
    │   └── Cost Function.txt
    └── ...
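The layout above amounts to `<output>/<course slug>/<module>/<lecture title>.<format>`. A minimal sketch of computing such a path with `pathlib` follows; the function name `transcript_path` and the separator-replacement rule are assumptions for illustration, not the tool's actual naming logic.

```python
from pathlib import Path


def transcript_path(output_dir: str, course_slug: str, module_slug: str,
                    lecture_title: str, fmt: str = "txt") -> Path:
    """Return output_dir/course/module/title.fmt for one lecture."""
    # Replace path separators so a title cannot escape its module folder.
    safe_title = lecture_title.replace("/", "-").replace("\\", "-").strip()
    return Path(output_dir) / course_slug / module_slug / f"{safe_title}.{fmt}"
```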
Finding the Course Slug
The slug is the part of the URL after /learn/:
https://www.coursera.org/learn/machine-learning
                               └── this is the slug
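If you have a full course URL on hand, the slug can also be pulled out programmatically. The sketch below uses `urllib.parse` from the standard library; `slug_from_url` is a hypothetical helper, not part of this tool.

```python
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Return the path segment that follows /learn/ in a course URL."""
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if len(parts) >= 2 and parts[0] == "learn":
        return parts[1]
    raise ValueError(f"no /learn/<slug> segment in {url!r}")
```

For the URL shown above, this returns `machine-learning`; deeper links such as a specific week page still resolve to the same slug.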
Requirements
- Python 3.10+
- A Coursera account with enrollment in the target course
License
MIT
Download files
File details
Details for the file coursera_transcripts-0.1.1.tar.gz.
File metadata
- Download URL: coursera_transcripts-0.1.1.tar.gz
- Upload date:
- Size: 11.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 41ff2ac1a7998d4eb9dc3167874adcf31296910251ca8d746c902cafb846bdd2 |
| MD5 | 128f1782cec8753523fccf6ccccaed29 |
| BLAKE2b-256 | fb1c77b904eadb2aa08a9d67035745f846fd872f8d5aca01e43f7b9f83159c92 |
File details
Details for the file coursera_transcripts-0.1.1-py3-none-any.whl.
File metadata
- Download URL: coursera_transcripts-0.1.1-py3-none-any.whl
- Upload date:
- Size: 10.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d26b567064339f8c086af9d83db076d65bf32fb3339295467b4b6593ec0113c8 |
| MD5 | 11ef4c17fd714e6663a8c2a048529f2f |
| BLAKE2b-256 | 568b7fa9d92c7ba3ceb125a133af3970bb5f8668872e0e0cd1e23df882471d0d |
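If you download one of these files by hand, you can compare it against its published SHA256 digest before installing. The following is a small sketch using the standard library's `hashlib`; the helper names `sha256_of` and `matches_digest` are made up for this example, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A mismatch usually means the download was truncated or the file is not the one that was published, so it is best not to install it.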