Typed pythonic RSS parser

Rss parser

About

rss-parser is a typed Python RSS parsing module built using pydantic and xmltodict.

Installation

pip install rss-parser

Or build and install from source:

git clone https://github.com/dhvcc/rss-parser.git
cd rss-parser
poetry build
pip install dist/*.whl

Usage

Quickstart

from rss_parser import Parser
from requests import get

rss_url = "https://rss.art19.com/apology-line"
response = get(rss_url)

rss = Parser.parse(response.text)

# Print out rss meta data
print("Language", rss.channel.language)
print("RSS", rss.version)

# Iteratively print feed items
for item in rss.channel.items:
    print(item.title)
    print(item.description[:50])

# Language en
# RSS 2.0
# Wondery Presents - Flipping The Bird: Elon vs Twitter
# <p>When Elon Musk posted a video of himself arrivi
# Introducing: The Apology Line
# <p>If you could call a number and say you’re sorry

Here we can see that the description still contains <p> tags. This is because it is placed in the feed as CDATA, like so:

<![CDATA[<p>If you could call ...</p>]]>
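If you only need the plain text of such a description, one option is to strip the markup after parsing. The following is a minimal sketch using only the standard library's html.parser module; the strip_html helper and the sample description string are assumptions made for illustration and are not part of rss-parser.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are skipped
        self.parts.append(data)


def strip_html(fragment: str) -> str:
    """Return the text of an HTML fragment with all tags removed."""
    extractor = _TextExtractor()
    extractor.feed(fragment)
    extractor.close()
    return "".join(extractor.parts).strip()


# Hypothetical description value, as it might come out of a CDATA section
description = "<p>If you could call a number and say you are sorry</p>"
print(strip_html(description))
```

In the quickstart loop, the same helper could be applied to item.description before slicing it, assuming the value behaves like a string.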

Overriding schema

If you want to customize the schema or provide your own, pass it to the parser with the schema keyword argument.

from rss_parser import Parser
from rss_parser.models import XMLBaseModel
from rss_parser.models.rss import RSS
from rss_parser.models.types import Tag

class CustomSchema(RSS, XMLBaseModel):
    channel: None = None # Removing previous channel field
    custom: Tag[str]

with open("tests/samples/custom.xml") as f:
    data = f.read()

rss = Parser.parse(data, schema=CustomSchema)

print("RSS", rss.version)
print("Custom", rss.custom)

# RSS 2.0
# Custom Custom tag data

xmltodict

This library uses xmltodict to parse XML data. See the xmltodict documentation for details.

The main thing you should know is that your data is converted into dictionaries.

For example, this data

<tag>content</tag>

will result in the following

{
    "tag": "content"
}

However, when a tag has attributes, the value for that tag is itself a dictionary

<tag attr="1" data-value="data">data</tag>

Turns into

{
    "tag": {
        "@attr": "1",
        "@data-value": "data",
        "#text": "data"
    }
}
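Because a tag's value can arrive either as a bare string or as a dictionary with "@" and "#text" keys, code that consumes xmltodict output usually has to handle both shapes. The sketch below shows one way to normalize them into a (content, attributes) pair. The split_tag helper is hypothetical and written for illustration only; it is not how rss-parser or xmltodict implement this, and it ignores nested child tags.

```python
from typing import Any, Dict, Tuple


def split_tag(value: Any) -> Tuple[Any, Dict[str, Any]]:
    """Split an xmltodict-style value into (content, attributes)."""
    if isinstance(value, dict):
        # Text content lives under "#text"; attribute keys start with "@"
        content = value.get("#text")
        attributes = {k[1:]: v for k, v in value.items() if k.startswith("@")}
        return content, attributes
    # A tag without attributes is represented by its bare content
    return value, {}


print(split_tag("content"))
print(split_tag({"@attr": "1", "#text": "content"}))
```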

Tag field

This is a generic field that handles tags given either as raw data or as a dictionary that includes attributes.

Although this is a complex class, it forwards most methods to its content attribute, so you won't notice a difference if you're only after the .content value.

Example

from rss_parser.models import XMLBaseModel
from rss_parser.models.types import Tag

class Model(XMLBaseModel):
    number: Tag[int]
    string: Tag[str]

m = Model(
    number=1,
    string={'@attr': '1', '#text': 'content'},
)

m.number.content == 1  # Content value is an integer, as per the generic type

m.number.content + 10 == m.number + 10  # But you're still able to use the Tag itself in common operators

m.number.bit_length() == 1  # As it's the case for methods/attributes not found in the Tag itself

(type(m.number), type(m.number.content))  # (<class 'rss_parser.models.image.Tag[int]'>, <class 'int'>): the types are NOT the same, but the interfaces are very similar most of the time

m.number.attributes == {}  # The attributes are empty by default

m.string.attributes == {'attr': '1'}  # But are populated when provided. Note that the @ symbol is trimmed from the beginning of the name; camelCase, however, is not converted

# Generic argument types are handled by pydantic - let's try to provide a string for a Tag[int] number

m = Model(number='not_a_number', string={'@customAttr': 'v', '#text': 'str tag value'})  # This results in the following traceback

# Traceback (most recent call last):
#     ...
# pydantic.error_wrappers.ValidationError: 1 validation error for Model
# number -> content
#     value is not a valid integer (type=type_error.integer)

If you wish to avoid all of the method/attribute forwarding "magic", use rss_parser.models.types.TagRaw instead.
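To give a rough idea of what this kind of forwarding involves, here is a minimal standalone sketch. The ForwardingTag class is hypothetical and is not the library's implementation of Tag; it only illustrates the general technique of delegating unknown attributes and a couple of operators to a wrapped content value.

```python
class ForwardingTag:
    """Toy wrapper that delegates to its content (illustration only)."""

    def __init__(self, content, attributes=None):
        self.content = content
        self.attributes = attributes or {}

    def __getattr__(self, name):
        # Only called when normal lookup fails, so .content and
        # .attributes are still served by the wrapper itself
        return getattr(self.content, name)

    def __add__(self, other):
        # Operators bypass __getattr__, so each one is forwarded explicitly
        return self.content + other

    def __eq__(self, other):
        return self.content == other


tag = ForwardingTag(1)
print(tag + 10)          # Uses the explicit __add__
print(tag.bit_length())  # Falls through to int.bit_length
print(tag.attributes)    # Served by the wrapper, not the content
```

A wrapper without any of this delegation would expose only content and attributes, which appears to be the trade-off TagRaw makes.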

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Install dependencies with poetry install (poetry itself can be installed with pip install poetry).

Using pre-commit is highly recommended. To install the hooks, run:

poetry run pre-commit install -t=pre-commit -t=pre-push

License

GPLv3
