Feed Tracking and Retrieval Abstraction Interface Layer
FeedTrail
feedtrail provides a resilient RSS/Atom parser focused on production-style feeds where XML can be noisy, partially malformed, or inconsistent across publishers.
Repository: https://github.com/juanmcristobal/feedtrail
What It Does
- Parses RSS and Atom feeds with namespace support.
- Normalizes and cleans feed/item text content.
- Converts heterogeneous date formats to UTC ISO strings.
- Extracts structured metadata: title, link, description, summary, author, categories, and primary image.
- Computes a deterministic request_hash for parsed payload integrity checks.
- Handles malformed XML defensively (entity sanitization, escaped CDATA recovery, trailing content trimming).
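To make the date-normalization bullet concrete, here is a minimal sketch of converting the two most common feed date styles (RFC 822 for RSS pubDate, ISO 8601 for Atom) into a UTC ISO string. It uses only the Python standard library; the helper name to_utc_iso is hypothetical, and this is not feedtrail's own implementation, which may accept more formats and behave differently.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def to_utc_iso(raw: str) -> Optional[str]:
    """Return raw as a UTC ISO 8601 string, or None if it cannot be parsed.

    Hypothetical helper for illustration; not part of feedtrail's API.
    """
    text = raw.strip()
    if not text:
        return None
    try:
        # RFC 822 / RFC 2822 dates, the style used by RSS <pubDate>.
        parsed = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        # ISO 8601 dates, the style used by Atom <updated>.
        # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        try:
            parsed = datetime.fromisoformat(text)
        except ValueError:
            return None
    if parsed.tzinfo is None:
        # Assumption: treat dates without an offset as already being in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).isoformat()
```

Returning None for unparseable input, rather than raising, mirrors the "when available" wording used for pub_date in the output contract.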
Installation
pip install feedtrail
Quick Start
from feedtrail.feed_parser import FeedParser
xml_content = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Example Feed</title>
<link>https://example.com</link>
<description>Demo</description>
<item>
<title>Hello</title>
<link>/hello</link>
<pubDate>Wed, 20 Mar 2024 09:00:00 GMT</pubDate>
<description><![CDATA[Post body]]></description>
</item>
</channel>
</rss>"""
parser = FeedParser()
result = parser.parse(xml_content, base_url="https://example.com")
print(result["headers"]["title"])
print(result["items"][0]["link"])
Output Contract
FeedParser.parse(...) returns a dictionary with:
- headers: feed-level metadata (title, link, description, updated, language, generator, parent_link, self_link).
- items: list of normalized entries, each including:
  - title
  - link
  - description
  - summary
  - pub_date (ISO-like UTC string, when available)
  - author
  - image
  - categories
- request_hash: SHA-256 hash of the normalized parsed payload.
- error: present when parsing fails (items will be empty in that case).
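The contract above suggests checking for error before touching items. A minimal sketch of a consumer follows, assuming only the keys listed above. The function name summarize_result and the sample dictionary are hypothetical and hand-written for illustration; the sample is not real parser output.

```python
from typing import Any, Dict, List


def summarize_result(result: Dict[str, Any]) -> List[str]:
    """Return one printable line per item, or a single line describing the error."""
    # Per the contract, "error" is present only when parsing fails.
    if result.get("error"):
        return [f"parse failed: {result['error']}"]
    lines = []
    for item in result.get("items", []):
        # pub_date is documented as optional, so fall back to a placeholder.
        title = item.get("title") or "(untitled)"
        pub_date = item.get("pub_date") or "undated"
        lines.append(f"{pub_date}  {title}  {item.get('link')}")
    return lines


# Hand-written sample shaped like the documented contract (not real output).
sample = {
    "headers": {"title": "Example Feed"},
    "items": [{"title": "Hello", "link": "https://example.com/hello"}],
    "request_hash": "0" * 64,
}
for line in summarize_result(sample):
    print(line)
```

Using .get for optional fields keeps the consumer from raising KeyError when a publisher omits a field.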
Support & Connect
- ⭐ Star the repo if you found it useful
- ☕ Support me: Say thanks by buying me a coffee! https://buymeacoffee.com/juanmcristobal
- 💼 Open to work: https://www.linkedin.com/in/jmcristobal/
Author
- Juan Manuel Cristóbal Moreno (juanmcristobal@gmail.com)
See AUTHORS.md for contributors.
History
0.1.0 (2026-03-25)
- First release.
File details
Details for the file feedtrail-0.3.0.tar.gz.
File metadata
- Download URL: feedtrail-0.3.0.tar.gz
- Upload date:
- Size: 16.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 991a3e32a874769a01cc9ec81550a5f2fbc444bdebdbfbd99d2e4dc0053b4e8e |
| MD5 | b773c1a939d5c6c9ce0c1d8d77988b44 |
| BLAKE2b-256 | e0cc40aa71855d88e9d321455b130718cc955fc5253dced7f36891962c4b9ed5 |
File details
Details for the file feedtrail-0.3.0-py3-none-any.whl.
File metadata
- Download URL: feedtrail-0.3.0-py3-none-any.whl
- Upload date:
- Size: 11.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e46a51f65e4ba40664133d2b1c9b1994ecb5a4555094617f64b3cbe25aad2993 |
| MD5 | 5f6eecef3db0c0bda64e0460c5e9c333 |
| BLAKE2b-256 | 4c2edf1708ef12f91b6b12bf61cc392e93ee29947704e885fb6fd52fa88894ba |
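A downloaded file can be checked against the SHA256 digests listed for each file. The sketch below uses only the standard library; the function names sha256_of and matches_published_digest are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA-256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_digest(Path("feedtrail-0.3.0.tar.gz"), "991a3e32a874769a01cc9ec81550a5f2fbc444bdebdbfbd99d2e4dc0053b4e8e") would be expected to return True for an unmodified copy of the source distribution saved in the current directory.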