A modular podcast episode downloader with RSS feed parsing and progress tracking
Easy Podcast
A modular Python package for downloading podcast episodes from RSS feeds. Features progress tracking, metadata management, and duplicate detection.
Python Version Requirements
This package requires Python 3.10, 3.11, or 3.12. Python 3.13+ is not supported due to dependency limitations with the WhisperX library.
Features
- RSS Feed Parsing: Download and parse podcast RSS feeds
- Episode Management: Track downloaded episodes with JSONL metadata
- Progress Tracking: Visual progress bars for downloads
- Duplicate Detection: Automatically skip already downloaded episodes
- Type Safety: Comprehensive type hints throughout
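To make the RSS parsing feature concrete, here is a minimal sketch using only the standard library. It is not the package's own parser: the sample feed, the `parse_feed` name, and the dict keys (`id`, `title`, `audio_url`) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-episode feed used only for this illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <guid>ep-1</guid>
      <enclosure url="https://example.com/audio/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode 2</title>
      <guid>ep-2</guid>
      <enclosure url="https://example.com/audio/ep2.mp3" type="audio/mpeg" length="456"/>
    </item>
  </channel>
</rss>
"""


def parse_feed(xml_text: str) -> tuple[str, list[dict[str, str]]]:
    """Return the channel title and one dict per item that has an enclosure."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 feed: missing <channel>")

    title = channel.findtext("title", default="")
    episodes = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            # Items without an audio enclosure are not downloadable episodes.
            continue
        episodes.append(
            {
                "id": item.findtext("guid", default=""),
                "title": item.findtext("title", default=""),
                "audio_url": enclosure.get("url", ""),
            }
        )
    return title, episodes


if __name__ == "__main__":
    show_title, items = parse_feed(SAMPLE_FEED)
    print(show_title, [item["audio_url"] for item in items])
```

Real feeds vary widely (missing guids, namespaced tags), which is presumably why the package provides its own episode ID extraction.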
Installation
Standard Installation
git clone https://github.com/falahat/easy-podcast.git
cd easy-podcast
pip install -e .
Development Installation
git clone https://github.com/falahat/easy-podcast.git
cd easy-podcast
# Quote the extras so shells such as zsh do not expand the brackets
pip install -e ".[dev,notebook]"
Quick Start
Command Line Interface
# Download episodes from an RSS feed
easy_podcast "https://example.com/podcast/rss.xml"
# Specify custom data directory
easy_podcast "https://example.com/podcast/rss.xml" --data-dir ./my_podcasts
# List episodes without downloading
easy_podcast "https://example.com/podcast/rss.xml" --list-only
# Disable progress bars
easy_podcast "https://example.com/podcast/rss.xml" --no-progress
Python API
from easy_podcast.manager import PodcastManager
# Create manager from RSS URL (downloads and parses automatically)
manager = PodcastManager.from_rss_url("https://example.com/podcast/rss.xml")
if manager:
    podcast = manager.get_podcast()
    print(f"Podcast: {podcast.title}")

    # Get new episodes to download
    new_episodes = manager.get_new_episodes()
    print(f"Found {len(new_episodes)} new episodes")

    # Download episodes with progress tracking
    successful, skipped, failed = manager.download_episodes(new_episodes)
    print(f"Downloaded: {successful}, Skipped: {skipped}, Failed: {failed}")
Working with Existing Podcast Data
# Load manager from existing podcast folder
manager = PodcastManager.from_podcast_folder("data/My Podcast/")
if manager:
    # Continue downloading new episodes
    new_episodes = manager.get_new_episodes()
    manager.download_episodes(new_episodes)
Data Storage Structure
Podcast data is organized in a clear directory structure:
data/
└── [Sanitized Podcast Name]/
    ├── episodes.jsonl    # Episode metadata (one JSON object per line)
    ├── rss.xml           # Cached RSS feed
    └── downloads/        # Downloaded audio files
        ├── episode1.mp3
        ├── episode2.mp3
        └── ...
Important: Episode objects store filenames only (e.g., "727175.mp3"), not full paths. Use manager.get_episode_audio_path(episode) to get complete file paths.
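The layout above can be illustrated with a small standard-library sketch of JSONL tracking, duplicate detection, and path resolution. This is not the package's implementation: the helper names and the record fields `id` and `audio_file` are hypothetical, and in real code manager.get_episode_audio_path(episode) is the supported way to obtain a full path.

```python
import json
import tempfile
from pathlib import Path


def load_episodes(jsonl_path: Path) -> list[dict]:
    """Read one JSON object per non-empty line; a missing file means no episodes."""
    if not jsonl_path.exists():
        return []
    with jsonl_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def append_episode(jsonl_path: Path, episode: dict) -> None:
    """Append a single record, keeping earlier lines untouched."""
    with jsonl_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(episode) + "\n")


def audio_path(podcast_dir: Path, episode: dict) -> Path:
    """Join the stored filename (not a full path) onto the downloads folder."""
    return podcast_dir / "downloads" / episode["audio_file"]


def new_episodes(feed_episodes: list[dict], known: list[dict]) -> list[dict]:
    """Keep only feed entries whose id has not been recorded yet."""
    known_ids = {e["id"] for e in known}
    return [e for e in feed_episodes if e["id"] not in known_ids]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        podcast_dir = Path(tmp) / "My Podcast"
        podcast_dir.mkdir()
        jsonl = podcast_dir / "episodes.jsonl"

        append_episode(jsonl, {"id": "727175", "audio_file": "727175.mp3"})
        known = load_episodes(jsonl)
        print(audio_path(podcast_dir, known[0]))

        feed = [{"id": "727175"}, {"id": "727176"}]
        print(new_episodes(feed, known))
```

Storing bare filenames keeps the metadata valid if the data directory is moved; only the join step needs to know where the folder currently lives.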
Development
Setting up Development Environment
git clone https://github.com/falahat/easy-podcast.git
cd easy-podcast
# Create virtual environment (note the .venv name)
python -m venv .venv
# Activate virtual environment
# Windows PowerShell:
.\.venv\Scripts\Activate.ps1
# Linux/macOS:
source .venv/bin/activate
# Install in development mode
# (quote the extras so shells such as zsh do not expand the brackets)
pip install -e ".[dev,notebook]"
Running Tests
# Run all tests
pytest
# Run with coverage report
pytest --cov=easy_podcast --cov-report=html
# Run specific test file
pytest tests/test_manager.py -v
Code Quality Tools
The project uses:
- Black for code formatting
- mypy for type checking
- flake8 for linting
- pytest for testing
# Format code
black src/ tests/
# Type checking
mypy src/easy_podcast/
# Linting
flake8 src/easy_podcast/
Core Components
The package is built with a modular architecture:
- PodcastManager: Main orchestrator for the complete workflow
- Episode / Podcast: Data models with computed properties
- EpisodeTracker: JSONL-based metadata persistence
- PodcastParser: RSS feed parsing with custom episode ID extraction
- PodcastDownloader: HTTP downloads with progress tracking
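The internals of the downloader component are not shown in this README. As a rough sketch of what "downloads with progress tracking" can mean, the hypothetical helper below copies a byte stream in chunks and reports progress through a callback. The function name, parameters, and callback signature are assumptions for illustration, not the package's API.

```python
import io
from typing import BinaryIO, Callable, Optional

ProgressCallback = Callable[[int, Optional[int]], None]


def copy_with_progress(
    source: BinaryIO,
    dest: BinaryIO,
    total: Optional[int] = None,
    on_progress: Optional[ProgressCallback] = None,
    chunk_size: int = 8192,
) -> int:
    """Copy source to dest in chunks, calling on_progress(bytes_done, total)."""
    written = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            # An empty read signals the end of the stream.
            break
        dest.write(chunk)
        written += len(chunk)
        if on_progress is not None:
            on_progress(written, total)
    return written


if __name__ == "__main__":
    # In-memory stand-ins for an HTTP response body and an output file.
    payload = b"x" * 20000
    src, dst = io.BytesIO(payload), io.BytesIO()

    def report(done: int, total: Optional[int]) -> None:
        print(f"{done}/{total} bytes")

    copy_with_progress(src, dst, total=len(payload), on_progress=report)
```

Any object with a read(n) method works as the source, so the same loop could wrap the response returned by urllib.request.urlopen, with the callback driving a progress bar.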
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes with tests
- Ensure all tests pass (pytest)
- Check code quality (black src/ tests/ and mypy src/)
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.