A Wrapper around Reddit RSS feed

Project description

Reddit RSS Reader

This is a wrapper around the publicly/privately available Reddit RSS feeds. It can be used to fetch content from the front page, a subreddit, all comments of a subreddit, all comments of a certain post, comments of a certain Reddit user, search pages, and more. For more details about what types of RSS feed Reddit provides, refer to these links: link1 and link2.

*Note: These feeds are rate limited, hence they can only be used for testing purposes. For serious scraping, register your bot at apps to get client details and use them with PRAW.

Installation

Install via PyPI:

pip install reddit-rss-reader

Install from master branch (if you want to try the latest features):

git clone https://github.com/lalitpagaria/reddit-rss-reader
cd reddit-rss-reader
pip install --editable .

How to use

RedditRSSReader requires a feed URL; refer to link to generate one. For example, to fetch all comments on the subreddit r/wallstreetbets -

https://www.reddit.com/r/wallstreetbets/comments/.rss?sort=new
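The same URL pattern extends to other feeds (plain subreddit posts, user pages, and so on, as described in the links referenced above). A few hypothetical helper functions, purely for illustration:

```python
# Illustrative helpers for common Reddit RSS URL patterns; the full
# set of supported feeds is documented in the links referenced above.
BASE = "https://www.reddit.com"

def subreddit_comments_feed(subreddit: str, sort: str = "new") -> str:
    """Feed of all comments in a subreddit."""
    return f"{BASE}/r/{subreddit}/comments/.rss?sort={sort}"

def subreddit_feed(subreddit: str) -> str:
    """Feed of posts in a subreddit."""
    return f"{BASE}/r/{subreddit}/.rss"

def user_feed(username: str) -> str:
    """Feed of a Reddit user's activity."""
    return f"{BASE}/user/{username}/.rss"

print(subreddit_comments_feed("wallstreetbets"))
# https://www.reddit.com/r/wallstreetbets/comments/.rss?sort=new
```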

Now you can run the following example -

import pprint
from datetime import datetime, timedelta

import pytz

from reddit_rss_reader.reader import RedditRSSReader


reader = RedditRSSReader(
    url="https://www.reddit.com/r/wallstreetbets/comments/.rss?sort=new"
)

# To consider comments entered in past 5 days only
since_time = datetime.now(pytz.utc) - timedelta(days=5)

# fetch_content will fetch all contents if no parameters are passed.
# If `after` is passed then it will fetch contents after this date
# If `since_id` is passed then it will fetch contents after this id
reviews = reader.fetch_content(
    after=since_time
)

pp = pprint.PrettyPrinter(indent=4)
for review in reviews:
    pp.pprint(review.__dict__)
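If you would rather not depend on pytz, a timezone-aware cutoff can also be built with the standard library alone (a minimal sketch; fetch_content only needs an aware datetime to compare against):

```python
from datetime import datetime, timedelta, timezone

# Timezone-aware "5 days ago" without pytz; equivalent to the
# pytz-based since_time above.
since_time = datetime.now(timezone.utc) - timedelta(days=5)
```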

The reader returns RedditContent objects, which carry the following information (extracted_text and image_alt_text are extracted from the Reddit content via BeautifulSoup) -

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class RedditContent:
    title: str
    link: int
    id: str
    content: str
    extracted_text: Optional[str]
    image_alt_text: Optional[str]
    updated: datetime
    author_uri: str
    author_name: str
    category: str
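Once fetched, these objects can be filtered and sorted like any dataclass instances. A sketch using a stand-in class that mirrors a few of the fields above (in real use the objects come from reader.fetch_content()):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Stand-in with the same shape as a few RedditContent fields,
# purely to illustrate consuming the results.
@dataclass
class Item:
    title: str
    category: str
    updated: datetime
    extracted_text: Optional[str]

items = [
    Item("GME discussion", "wallstreetbets",
         datetime(2021, 5, 1, tzinfo=timezone.utc), "some text"),
    Item("Daily thread", "wallstreetbets",
         datetime(2021, 5, 3, tzinfo=timezone.utc), None),
]

# Keep only entries that have extracted text, newest first.
with_text = sorted(
    (i for i in items if i.extracted_text is not None),
    key=lambda i: i.updated,
    reverse=True,
)
print([i.title for i in with_text])  # ['GME discussion']
```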

The output is given in the UTF-8 charset; if you are scraping non-English subreddits, set the environment to use UTF-8 -

export LANG=en_US.UTF-8
export PYTHONIOENCODING=utf-8

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

reddit-rss-reader-1.3.2.tar.gz (4.4 kB view details)

Uploaded Source

Built Distribution

reddit_rss_reader-1.3.2-py3-none-any.whl (8.5 kB view details)

Uploaded Python 3

File details

Details for the file reddit-rss-reader-1.3.2.tar.gz.

File metadata

  • Download URL: reddit-rss-reader-1.3.2.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.6.13

File hashes

Hashes for reddit-rss-reader-1.3.2.tar.gz
Algorithm Hash digest
SHA256 46d3aa81f1301a73314f7e417935d737a263bd9711049428b40d59c300d6e381
MD5 7851c55cbe7f024114f657728652e48f
BLAKE2b-256 87c03f341f4c96113a3fa81e134f99f71df1d2acd0d5c0be917a64df3f6db272

See more details on using hashes here.

File details

Details for the file reddit_rss_reader-1.3.2-py3-none-any.whl.

File metadata

  • Download URL: reddit_rss_reader-1.3.2-py3-none-any.whl
  • Upload date:
  • Size: 8.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.6.13

File hashes

Hashes for reddit_rss_reader-1.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 1cf77ce452abac21c6cddc3504cc6b78c6bda56a6d603767367c3cc42c55bd7c
MD5 7a5caf73fb08e33dfb67082afaf95df5
BLAKE2b-256 5db71355b2d50e67915013c43b9900282479fb08a4e3b4a6d4646ee92bf8cab2

See more details on using hashes here.
