Tools for managing podcasting 2.0 feeds

pg-podcast-toolkit

Tools for parsing and managing Podcasting 2.0 RSS feeds with automatic namespace capture and database-ready output.

Features

  • Podcasting 2.0 Support - Automatically captures all podcast:* namespace tags without parser updates
  • Database-Ready Output - Built-in to_db_record() methods for PostgreSQL schema alignment
  • Backward Compatible - Existing code continues to work; new features are opt-in
  • Deterministic IDs - MD5-based UUID generation for podcasts and episodes
  • GUID Fallback - Handles episodes with missing GUIDs gracefully
  • Comprehensive Parsing - Supports RSS 2.0, iTunes extensions, and custom namespaces
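
To make the idea of automatic namespace capture concrete, here is a minimal sketch using only the standard library. It is not the toolkit's implementation: the function name capture_namespaces and the KNOWN_PREFIXES lookup are assumptions made for illustration, and the output shape simply mirrors the `namespaces` structure documented on this page.

```python
import xml.etree.ElementTree as ET

# Assumption: a small URI-to-prefix lookup; only the Podcasting 2.0 URI is listed.
KNOWN_PREFIXES = {"https://podcastindex.org/namespace/1.0": "podcast"}


def capture_namespaces(element):
    """Group the namespaced child tags of `element` by prefix (hypothetical helper)."""
    captured = {}
    for child in element:
        if not child.tag.startswith("{"):
            continue  # plain RSS 2.0 tag such as <title>
        # ElementTree stores namespaced tags as "{uri}localname".
        uri, local = child.tag[1:].split("}", 1)
        prefix = KNOWN_PREFIXES.get(uri, uri)
        entry = {}
        text = (child.text or "").strip()
        if text:
            entry["value"] = text
        if child.attrib:
            entry["attributes"] = dict(child.attrib)
        bucket = captured.setdefault(prefix, {})
        if local not in bucket:
            bucket[local] = entry            # first occurrence: store the entry itself
        elif isinstance(bucket[local], list):
            bucket[local].append(entry)      # third or later occurrence
        else:
            bucket[local] = [bucket[local], entry]  # second occurrence: promote to list
    return captured


FEED = """<rss xmlns:podcast="https://podcastindex.org/namespace/1.0" version="2.0">
  <channel>
    <title>Example</title>
    <podcast:locked owner="email@example.com">yes</podcast:locked>
    <podcast:person role="host">Host One</podcast:person>
    <podcast:person role="guest">Guest Two</podcast:person>
  </channel>
</rss>"""

channel = ET.fromstring(FEED).find("channel")
namespaces = capture_namespaces(channel)
```

Because the loop never checks tag names against a fixed list, a tag added to the namespace later would be picked up without code changes, which is the property the feature list describes.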

Installation

pip install pg-podcast-toolkit

Quick Start

Basic Usage

from pg_podcast_toolkit import Podcast
import requests

FEED_URL = 'https://example.com/feed.xml'

# Fetch and parse a podcast feed
response = requests.get(FEED_URL, timeout=10)  # avoid hanging on a slow server
response.raise_for_status()                    # stop early on HTTP errors
podcast = Podcast(response.content, feed_url=FEED_URL)

# Access podcast metadata
print(podcast.title)
print(podcast.description)
print(podcast.itunes_image)

# Access episodes
for item in podcast.items:
    print(f"{item.title} - {item.itunes_duration}s")

Database Integration (New in v0.2.0)

# Get database-ready podcast record
podcast_record = podcast.to_db_record(
    etag='some-etag',              # Optional HTTP ETag
    last_modified='Wed, 06 Nov',   # Optional Last-Modified header
    last_fetched_at=1234567890     # Optional fetch timestamp
)

# Insert into PostgreSQL
# podcast_record matches schema: id, podcast_guid, title, feed_url,
# image_url, language, itunes_id, etag, last_modified, last_fetched_at,
# created_at, updated_at, extras (JSONB)

# Get database-ready episode records
for item in podcast.items:
    episode_record = item.to_db_record(podcast_id=podcast_record['id'])
    # episode_record matches schema: id, podcast_id, guid, title,
    # description, image_url, publish_date, duration_seconds,
    # episode_number, season_number, episode_type, explicit,
    # enclosure_url, enclosure_type, enclosure_size,
    # created_at, updated_at, extras (JSONB)
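
The records above are described as matching a PostgreSQL schema, but the insert step itself is left to the caller. One possible way to turn such a record into a parameterized statement is sketched below. The helper build_insert, the table name podcasts, and the sample record are all assumptions for illustration; the sample dict only stands in for what to_db_record() might return, and nested values are serialized to JSON text for the extras (JSONB) column.

```python
import json


def build_insert(table, record):
    """Return (sql, params) for a parameterized INSERT (hypothetical helper).

    `table` and the record keys are interpolated into the SQL text, so they
    must come from trusted code, never from feed content.
    """
    columns = list(record)
    params = [
        # dict/list values (such as extras) are sent as JSON text
        json.dumps(value) if isinstance(value, (dict, list)) else value
        for value in record.values()
    ]
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, params


# Hypothetical stand-in for a record; real records carry more columns.
sample_record = {
    "id": "00000000-0000-0000-0000-000000000000",
    "title": "Example Show",
    "feed_url": "https://example.com/feed.xml",
    "extras": {"podcast": {"locked": {"value": "yes"}}},
}

sql, params = build_insert("podcasts", sample_record)
```

The resulting pair can be handed to any DB-API driver that uses the %s placeholder style, for example cursor.execute(sql, params) with psycopg2.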

Accessing Podcasting 2.0 Namespaces (New in v0.2.0)

# All unknown namespace tags are automatically captured
print(podcast.namespaces)
# {
#   'podcast': {
#     'guid': {'value': '...'},
#     'locked': {'value': 'yes', 'attributes': {'owner': 'email@example.com'}},
#     'funding': {'value': 'Support!', 'attributes': {'url': 'https://...'}},
#     'person': [
#       {'value': 'Host Name', 'attributes': {'role': 'host', 'img': '...'}},
#       ...
#     ]
#   }
# }

# Episode-level namespaces
for item in podcast.items:
    print(item.namespaces)
    # {
    #   'podcast': {
    #     'chapters': {'attributes': {'url': '...', 'type': 'application/json'}},
    #     'transcript': {'attributes': {'url': '...', 'type': 'text/srt'}},
    #     'person': [...],
    #     ...
    #   }
    # }
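
In the structures shown above, a tag that appears once is a single dict while a repeated tag such as person is a list. Code that reads the captured data has to handle both shapes, plus tags that are absent. The sketch below shows one way to do that; tag_entries and transcript_urls are hypothetical helpers, not part of the toolkit, and the sample dict is hand-written to match the documented shape.

```python
def tag_entries(namespaces, prefix, tag):
    """Return the entries for prefix:tag as a list, whatever shape they are stored in."""
    found = namespaces.get(prefix, {}).get(tag)
    if found is None:
        return []                      # tag not present in this feed or episode
    return found if isinstance(found, list) else [found]


def transcript_urls(namespaces):
    """Collect the url attribute of every podcast:transcript entry."""
    return [
        entry["attributes"]["url"]
        for entry in tag_entries(namespaces, "podcast", "transcript")
        if "url" in entry.get("attributes", {})
    ]


# Hand-written sample in the documented shape (not real feed data).
sample = {
    "podcast": {
        "transcript": {"attributes": {"url": "https://example.com/ep1.srt", "type": "text/srt"}},
        "person": [
            {"value": "Host Name", "attributes": {"role": "host"}},
            {"value": "Guest Name", "attributes": {"role": "guest"}},
        ],
    }
}

hosts_and_guests = [p["value"] for p in tag_entries(sample, "podcast", "person")]
urls = transcript_urls(sample)
```

With a real feed, the same helpers could be called as tag_entries(item.namespaces, "podcast", "person") inside the episode loop.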

What's New in v0.2.0

  • Automatic Namespace Capture - No parser updates needed for new Podcasting 2.0 tags
  • Database-Ready Methods - Podcast.to_db_record() and Item.to_db_record()
  • Schema Alignment - Output matches PostgreSQL schema with UUID primary keys
  • GUID Fallback - Episodes without GUIDs use enclosure_url for ID generation
  • 100% Backward Compatible - All existing attributes and methods unchanged
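
The deterministic IDs and the GUID fallback can be illustrated with a short sketch. The exact hash input the toolkit uses is not documented here, so the choices below (joining the parts with "|", keying episodes on podcast ID plus GUID) are assumptions, and md5_uuid and episode_id are hypothetical names.

```python
import hashlib
import uuid


def md5_uuid(*parts):
    """Derive a stable UUID string from the MD5 digest of the given strings."""
    digest = hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()
    return str(uuid.UUID(hex=digest))  # 32 hex characters map onto one UUID


def episode_id(podcast_id, guid=None, enclosure_url=None):
    """Use the GUID when present, otherwise fall back to the enclosure URL."""
    key = guid or enclosure_url
    if not key:
        raise ValueError("episode needs a guid or an enclosure_url")
    return md5_uuid(podcast_id, key)


podcast_id = md5_uuid("https://example.com/feed.xml")
with_guid = episode_id(podcast_id, guid="ep-1")
without_guid = episode_id(podcast_id, enclosure_url="https://example.com/ep1.mp3")
```

Because the ID depends only on its inputs, re-parsing the same feed yields the same primary keys, which makes repeated imports straightforward to write as upserts.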

Supported Specifications

  • RSS 2.0
  • iTunes Podcast Extensions
  • Podcasting 2.0 Namespace (automatic capture)
  • Custom namespace extensions (automatic capture)

Development Status

This library is actively maintained and production-ready. The v0.2.0 release introduces database integration features while maintaining full backward compatibility.

License

MIT License
