feedtrail

Feed Tracking and Retrieval Abstraction Interface Layer.

feedtrail provides a resilient RSS/Atom parser built for real-world feeds, where XML is often noisy, partially malformed, or inconsistent across publishers.

What It Does

  • Parses RSS and Atom feeds with namespace support.
  • Normalizes and cleans feed/item text content.
  • Converts heterogeneous date formats to UTC ISO strings.
  • Extracts structured metadata: title, link, description, summary, author, categories, and primary image.
  • Computes a deterministic request_hash for parsed payload integrity checks.
  • Handles malformed XML defensively (entity sanitization, escaped CDATA recovery, trailing content trimming).
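To make the defensive handling in the last bullet concrete, here is a minimal standard-library sketch of entity sanitization and trailing content trimming. It is not feedtrail's implementation: `sanitize_feed_xml` and every other name below are hypothetical, the root-tag list is an assumption, and escaped CDATA recovery is not covered.

```python
import html.entities
import re
import xml.etree.ElementTree as ET

# The five entities XML predefines; any other named entity breaks a plain XML parser.
XML_ENTITIES = {"amp", "lt", "gt", "quot", "apos"}

CDATA_RE = re.compile(r"(<!\[CDATA\[.*?\]\]>)", re.DOTALL)
# An ampersand that does not start a named or numeric character reference.
BARE_AMP_RE = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")
NAMED_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")
# Assumed closing root tags for RSS 2.0, Atom, and RSS 1.0 (RDF).
ROOT_CLOSERS = ("</rss>", "</feed>", "</rdf:RDF>")


def _fix_named_entity(match):
    """Turn HTML-only entities (e.g. &nbsp;) into numeric references."""
    name = match.group(1)
    if name in XML_ENTITIES:
        return match.group(0)
    codepoint = html.entities.name2codepoint.get(name)
    if codepoint is None:
        # Unknown name: escape the ampersand so the text survives literally.
        return f"&amp;{name};"
    return f"&#{codepoint};"


def _fix_entities(chunk):
    chunk = BARE_AMP_RE.sub("&amp;", chunk)
    return NAMED_ENTITY_RE.sub(_fix_named_entity, chunk)


def sanitize_feed_xml(text):
    """Hypothetical helper: repair entities and drop content after the root element."""
    # re.split with a capturing group keeps CDATA sections at odd indexes;
    # they are left untouched because "&" is literal inside CDATA.
    parts = CDATA_RE.split(text)
    cleaned = "".join(
        part if index % 2 else _fix_entities(part)
        for index, part in enumerate(parts)
    )
    for closer in ROOT_CLOSERS:
        end = cleaned.rfind(closer)
        if end != -1:
            return cleaned[: end + len(closer)]
    return cleaned
```

After sanitizing, the text can be handed to `xml.etree.ElementTree.fromstring`. A production parser would likely need more cases than this sketch handles, such as stray control characters or an unclosed root element.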

Installation

Runtime

Requirements:

  • Python >= 3.10

Install from source:

pip install .

Development

pip install -r requirements_dev.txt

Quick Start

from feedtrail.feed_parser import FeedParser

xml_content = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>https://example.com</link>
    <description>Demo</description>
    <item>
      <title>Hello</title>
      <link>/hello</link>
      <pubDate>Wed, 20 Mar 2024 09:00:00 GMT</pubDate>
      <description><![CDATA[Post body]]></description>
    </item>
  </channel>
</rss>"""

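# Parse the document; base_url is passed alongside the XML (the item link above is relative).
parser = FeedParser()
result = parser.parse(xml_content, base_url="https://example.com")

# Feed-level metadata is under "headers"; entries are under "items".
print(result["headers"]["title"])
print(result["items"][0]["link"])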
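Since the result is a plain dictionary, downstream code can check for a failed parse before touching the items. The sketch below assumes only the keys documented in this README (`error`, `items`, `title`, `link`). `summarize_feed` is a hypothetical helper, and `sample` is a hand-built stand-in, not real parser output.

```python
from typing import Any


def summarize_feed(result: dict[str, Any]) -> list[str]:
    """Return one "title -> link" line per item, or raise if parsing failed."""
    # The README documents "error" as present only when parsing fails.
    if result.get("error"):
        raise ValueError(f"feed could not be parsed: {result['error']}")
    lines = []
    for item in result.get("items", []):
        # Fields may be missing on sparse feeds, so fall back to placeholders.
        title = item.get("title") or "(untitled)"
        link = item.get("link") or ""
        lines.append(f"{title} -> {link}")
    return lines


# Hand-built dictionary shaped like the documented result (illustrative values).
sample = {
    "headers": {"title": "Example Feed"},
    "items": [{"title": "Hello", "link": "https://example.com/hello"}],
}
print(summarize_feed(sample))
```

Raising on `error` is one choice; a caller that polls many feeds might prefer to log the failure and continue.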

Output Contract

FeedParser.parse(...) returns a dictionary with:

  • headers: feed-level metadata (title, link, description, updated, language, generator, parent_link, self_link).
  • items: list of normalized entries, each including:
    • title
    • link
    • description
    • summary
    • pub_date (ISO-like UTC string, when available)
    • author
    • image
    • categories
  • request_hash: SHA-256 hash of normalized parsed payload.
  • error: present when parsing fails (items will be empty in that case).
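One common way to get a deterministic hash like `request_hash` is to serialize the payload canonically before hashing. The sketch below illustrates that idea only. The exact fields and serialization feedtrail uses are not documented here, so `payload_hash` and its canonical JSON form are assumptions, and its output should not be expected to match the library's.

```python
import hashlib
import json


def payload_hash(headers: dict, items: list) -> str:
    """Hypothetical helper: SHA-256 over a canonical JSON form of the payload."""
    payload = {"headers": headers, "items": items}
    # sort_keys and fixed separators make the serialization independent of
    # dict insertion order and whitespace, so equal payloads hash equally.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this approach, two parses that yield the same normalized content produce the same digest, which is what makes the value usable for change detection or integrity checks.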

Development Workflow

Run tests

make test

Lint and formatting

make lint

Coverage

make coverage

Build Docker test image

make build-test-image

Run tox environments in Docker

make test-all

Project Structure

feedtrail/
  feed_parser.py          # Core RSS/Atom parser
  utils/
    date_utils.py         # Date parsing and normalization
    xml_utils.py          # XML sanitation and extraction helpers
tests/
  test_feed_parser.py
  test_utils_date_utils.py
  test_utils_xml_utils.py
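As an illustration of the date normalization that date_utils.py is described as handling, the following standard-library sketch converts the two most common feed date styles (RFC 822 for RSS, ISO 8601 for Atom) to a UTC ISO string. It is not the module's actual code: `to_utc_iso` is a hypothetical name, and treating timezone-less values as UTC is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def to_utc_iso(raw: str) -> Optional[str]:
    """Hypothetical helper: return a UTC ISO string, or None if unparseable."""
    raw = raw.strip()
    if not raw:
        return None
    try:
        # RSS pubDate style, e.g. "Wed, 20 Mar 2024 09:00:00 GMT".
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        try:
            # Atom style, e.g. "2024-03-20T09:00:00Z"; older Python versions
            # do not accept a trailing "Z", so it is rewritten as an offset.
            parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        except ValueError:
            return None
    if parsed.tzinfo is None:
        # Assumption: values without a timezone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).isoformat()


print(to_utc_iso("Wed, 20 Mar 2024 09:00:00 GMT"))
print(to_utc_iso("2024-03-20T10:00:00+01:00"))
```

Both calls print `2024-03-20T09:00:00+00:00`. Feeds in the wild use many more formats than these two, so a real implementation would need additional fallbacks.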

Author

See AUTHORS.md for contributors.

History

0.1.0 (2026-03-25)

  • First release.

