Python library for scraping Reddit data, powered by a .NET 10 backend


RedScrapsLib

A Python library for scraping Reddit data — posts, comments, and user activity — without needing the official API. The scraping logic is written in C# (.NET 10) and exposed to Python via pythonnet, with automatic rate-limit handling built in.


Requirements

Requirement	Details
Python	3.10+
.NET Runtime	.NET 10 — must be installed separately
Platform	Windows x64, macOS 12+ (Apple Silicon & Intel), Linux x86_64 (incl. WSL2)

Note: pip installs the Python wrapper and the compiled .NET assembly, but it cannot install the .NET runtime itself. Download and install the .NET 10 runtime from Microsoft's .NET download page before using the library.
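
A pre-flight check can confirm that the runtime is present before the library is imported. The sketch below is not part of RedScrapsLib: the helper names are hypothetical, and it assumes the `dotnet` host is on `PATH` so that the output of `dotnet --list-runtimes` can be inspected.

```python
import re
import shutil
import subprocess


def has_dotnet_runtime(listing: str, major: int = 10) -> bool:
    """Return True if the listing names Microsoft.NETCore.App <major>.x."""
    pattern = re.compile(rf"^Microsoft\.NETCore\.App {major}\.\d+", re.MULTILINE)
    return bool(pattern.search(listing))


def dotnet_runtime_installed(major: int = 10) -> bool:
    """Run `dotnet --list-runtimes` and look for the requested major version."""
    if shutil.which("dotnet") is None:
        return False  # the dotnet host is not on PATH
    result = subprocess.run(
        ["dotnet", "--list-runtimes"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and has_dotnet_runtime(result.stdout, major)


# Sample listing in the format printed by `dotnet --list-runtimes`
sample_listing = (
    "Microsoft.AspNetCore.App 8.0.1 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]\n"
    "Microsoft.NETCore.App 10.0.0 [/usr/share/dotnet/shared/Microsoft.NETCore.App]\n"
)
```

Calling `dotnet_runtime_installed()` before `import RedScrapsLib` lets a script print its own installation hint. A runtime installed outside `PATH` would be reported as missing by this check, so treat a `False` result as a hint rather than proof.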

Installation

pip install redscrapslib

Quick Start

import RedScrapsLib as rs

# Must be called once before anything else
rs.init(user_agent="MyBot/1.0")

# Fetch posts from a subreddit
posts = rs.get_home("python", limit=10)
for post in posts.Posts:
    print(post.Title, post.Author)

# Fetch comments on a specific post
comments = rs.get_comments("python", post_id="abc123", limit=50)
for comment in comments.Comments:
    print(comment.Author, comment.Body)

# Fetch a user's post submissions
submissions = rs.get_user_posts("spez", limit=25)
for post in submissions.Posts:
    print(post.Title, post.Subreddit)

# Fetch a user's comments
user_comments = rs.get_user_comments("spez", limit=25)
for comment in user_comments.Comments:
    print(comment.Body, comment.Subreddit)

# Check session statistics
print(rs.get_stats())
# {'calls': 4, 'rate_limit_hits': 0, 'total_wait_seconds': 0.0}

Rate Limiting

Reddit's unofficial API enforces a limit of roughly 100 requests per rate-limit window. RedScrapsLib handles this automatically, so no extra code is needed.

When a 429 response is received, the library:

1. Reads the Retry-After header (defaults to 60s if absent)
2. Prints a message so you know it's waiting
3. Sleeps for the required time
4. Retries the request transparently

[RedScrapsLib] Rate limited on get_home. Waiting 60s... (hit #1, 60s waited total)
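
The retry steps above can be sketched in plain Python. This is an illustration of the described behaviour, not the library's C# implementation: the `call_with_retry` name, the `(status, headers, body)` response shape, and the `max_retries` cap are all assumptions made for the example, and `Retry-After` is assumed to hold a number of seconds.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def call_with_retry(
    request: Callable[[], Response],
    name: str = "request",
    default_wait: float = 60.0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `request` until it stops returning 429, sleeping between attempts."""
    hits = 0
    waited = 0.0
    while True:
        status, headers, body = request()
        if status != 429:
            return body
        hits += 1
        if hits > max_retries:
            raise RuntimeError(f"{name}: still rate limited after {max_retries} retries")
        # Retry-After may be absent, so fall back to the default wait.
        wait = float(headers.get("Retry-After", default_wait))
        waited += wait
        print(f"[sketch] Rate limited on {name}. Waiting {wait:.0f}s... "
              f"(hit #{hits}, {waited:.0f}s waited total)")
        sleep(wait)
```

Passing `sleep` as a parameter keeps the function easy to test: a fake sleeper can record the waits without pausing the program.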

This means you can run long loops without worrying about crashes:

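import RedScrapsLib as rs

rs.init(user_agent="MyBot/1.0")

my_list = ["python", "learnpython"]  # replace with your own subreddit names

for subreddit in my_list:
    data = rs.get_home(subreddit)  # sleeps and retries automatically if rate limited
    print(subreddit, data.TotalPosts)  # replace with your own processing

print(rs.get_stats())
# Example output after a much longer run:
# {'calls': 250, 'rate_limit_hits': 3, 'total_wait_seconds': 780.0}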

Based on testing: Reddit allows ~100 requests before rate limiting, then applies ~480s penalties for sustained hammering. For bulk scraping, adding a small delay between calls avoids the heavy penalty entirely.
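
One way to add that delay is a small generator that spaces out iterations. The `throttled` helper below is a hypothetical sketch, not a RedScrapsLib function, and the interval you pass is your own choice; the source does not state which delay is sufficient.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def throttled(
    items: Iterable[T],
    min_interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than one per `min_interval` seconds."""
    last: Optional[float] = None
    for item in items:
        if last is not None:
            # Sleep only for the part of the interval that has not yet elapsed.
            remaining = min_interval - (clock() - last)
            if remaining > 0:
                sleep(remaining)
        last = clock()
        yield item
```

A loop such as `for subreddit in throttled(my_list, 2.0): data = rs.get_home(subreddit)` would then issue at most one request every two seconds, where 2.0 is an arbitrary illustrative value.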

API Reference

`init(user_agent=None, debug=False)`

Initialises the scraper. Must be called once before any other function.

Parameter	Type	Default	Description
`user_agent`	`str | None`	`None`	Custom User-Agent string sent with every request. Defaults to `"RedScrapsBot"`.
`debug`	`bool`	`False`	Prints step-by-step logs for each request when `True`

`get_home(subreddit, sort="hot", limit=100, time=None, after=None) → HomeSent`

Fetches posts from a subreddit.

Parameter	Type	Default	Description
`subreddit`	`str`	—	Subreddit name (without `r/`)
`sort`	`str`	`"hot"`	`"hot"`, `"new"`, `"top"`, `"rising"`
`limit`	`int`	`100`	Number of posts to fetch (max 100 per request)
`time`	`str | None`	`None`	Time filter for `"top"`: `"hour"`, `"day"`, `"week"`, `"month"`, `"year"`, `"all"`
`after`	`str | None`	`None`	Post ID to paginate from
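
Whether the library itself rejects values outside this table is not stated, so a caller may want to check arguments first. The sketch below encodes only the values listed above; the `listing_arg_problems` name is hypothetical, and the lower bound of 1 for `limit` is an assumption.

```python
from typing import List, Optional

SORTS = ("hot", "new", "top", "rising")
TIMES = ("hour", "day", "week", "month", "year", "all")


def listing_arg_problems(
    sort: str = "hot", limit: int = 100, time: Optional[str] = None
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sort not in SORTS:
        problems.append(f"sort must be one of {SORTS}, got {sort!r}")
    if not 1 <= limit <= 100:
        problems.append(f"limit must be between 1 and 100, got {limit}")
    if time is not None:
        if time not in TIMES:
            problems.append(f"time must be one of {TIMES}, got {time!r}")
        elif sort != "top":
            # The table documents the time filter only for sort="top".
            problems.append("time is only documented for sort='top'")
    return problems
```

Returning a list instead of raising lets the caller decide whether to abort or just log a warning.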

Returns: HomeSent

HomeSent
├── Subreddit     str
├── FirstID       str
├── LastID        str          ← use as `after` to paginate
├── TotalPosts    int
└── Posts         List[Post]
    ├── PostID    str | None
    ├── Title     str | None
    ├── Author    str | None
    ├── SelfText  str | None
    └── Link      str | None
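
To hand the results to `json` or `csv`, the post objects can be copied into plain dictionaries. This is a minimal sketch that relies only on the field names listed in the tree above; `posts_to_dicts` is a hypothetical helper, and the `SimpleNamespace` object stands in for a real post so the example runs without the library.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List

POST_FIELDS = ("PostID", "Title", "Author", "SelfText", "Link")


def posts_to_dicts(
    posts: Iterable[Any], fields: Iterable[str] = POST_FIELDS
) -> List[Dict[str, Any]]:
    """Copy the named attributes of each post into a plain dict."""
    names = tuple(fields)
    # getattr with a default keeps the helper tolerant of missing attributes.
    return [{name: getattr(post, name, None) for name in names} for post in posts]


# Stand-in for an item of rs.get_home(...).Posts
demo_post = SimpleNamespace(
    PostID="abc123", Title="Hello", Author="alice", SelfText=None, Link=None
)
rows = posts_to_dicts([demo_post])
```

With the real library, `posts_to_dicts(rs.get_home("python", limit=10).Posts)` would be the equivalent call.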

`get_comments(subreddit, post_id, sort="confidence", limit=100) → CommentSent`

Fetches comments for a specific post.

Parameter	Type	Default	Description
`subreddit`	`str`	—	Subreddit the post belongs to
`post_id`	`str`	—	Post ID (e.g. `"abc123"`)
`sort`	`str`	`"confidence"`	`"confidence"`, `"top"`, `"new"`, `"controversial"`, `"old"`
`limit`	`int`	`100`	Max number of comments to fetch

Returns: CommentSent

CommentSent
├── PostID        str | None
├── Title         str | None
├── Author        str | None
├── Selftext      str | None
├── Subreddit     str | None
├── Num_comments  int | None
├── Permalink     str | None
└── Comments      List[Comment]
    ├── CommentID str | None
    ├── Author    str | None
    ├── ParentID  str | None
    └── Body      str | None
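
Because each comment carries a ParentID, replies can be grouped under their parent. The sketch below assumes ParentID is either a bare ID or a Reddit-style fullname with a kind prefix (`t1_` for a comment, `t3_` for a post); the source does not say which form the library returns, and both helper names are hypothetical.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List, Optional


def strip_kind(fullname: Optional[str]) -> Optional[str]:
    """Drop a kind prefix such as 't1_' or 't3_' if one is present."""
    if (
        fullname
        and len(fullname) > 3
        and fullname[0] == "t"
        and fullname[1].isdigit()
        and fullname[2] == "_"
    ):
        return fullname[3:]
    return fullname


def group_replies(comments: Iterable[Any]) -> Dict[Optional[str], List[Any]]:
    """Map each parent ID (prefix removed) to the comments replying to it."""
    children: Dict[Optional[str], List[Any]] = defaultdict(list)
    for comment in comments:
        children[strip_kind(comment.ParentID)].append(comment)
    return dict(children)


# Stand-ins for items of rs.get_comments(...).Comments
demo_comments = [
    SimpleNamespace(CommentID="c1", ParentID="t3_abc123"),  # top-level reply
    SimpleNamespace(CommentID="c2", ParentID="t1_c1"),      # reply to c1
    SimpleNamespace(CommentID="c3", ParentID="c1"),         # bare-ID form
]
replies = group_replies(demo_comments)
```

Looking up `replies[post_id]` gives the top-level comments, and `replies[comment.CommentID]` gives the direct replies to a comment.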

`get_user_posts(user, sort=None, limit=None, time=None, after=None) → UserSubmittedSent`

Fetches a user's post submissions.

Parameter	Type	Default	Description
`user`	`str`	—	Reddit username (without `u/`)
`sort`	`str | None`	`None`	`"hot"`, `"new"`, `"top"`, `"controversial"`
`limit`	`int | None`	`None`	Number of posts to fetch
`time`	`str | None`	`None`	Time filter when using `"top"`
`after`	`str | None`	`None`	Post ID to paginate from

Returns: UserSubmittedSent

UserSubmittedSent
├── Username      str
├── FirstID       str
├── LastID        str          ← use as `after` to paginate
├── TotalCount    int
└── Posts         List[Post]
    ├── PostID       str | None
    ├── Title        str | None
    ├── Author       str | None
    ├── Subreddit    str | None
    ├── SelfText     str | None
    ├── Link         str | None
    ├── Upvotes      int | None
    ├── CommentCount int | None
    └── CreatedUtc   float
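
The numeric fields lend themselves to simple post-processing. The sketch below assumes CreatedUtc is a Unix timestamp in seconds, which the name suggests but the source does not confirm; `created_at` and `top_by_upvotes` are hypothetical helpers, and the `SimpleNamespace` objects stand in for real posts.

```python
from datetime import datetime, timezone
from types import SimpleNamespace
from typing import Any, Iterable, List


def created_at(item: Any) -> datetime:
    """Convert CreatedUtc (assumed epoch seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(float(item.CreatedUtc), tz=timezone.utc)


def top_by_upvotes(posts: Iterable[Any], n: int = 5) -> List[Any]:
    """Return the n highest-scoring posts, treating a missing Upvotes as 0."""
    return sorted(posts, key=lambda p: p.Upvotes or 0, reverse=True)[:n]


# Stand-ins for items of rs.get_user_posts(...).Posts
demo_posts = [
    SimpleNamespace(Title="a", Upvotes=3, CreatedUtc=0.0),
    SimpleNamespace(Title="b", Upvotes=None, CreatedUtc=86400.0),
    SimpleNamespace(Title="c", Upvotes=10, CreatedUtc=3600.0),
]
best = top_by_upvotes(demo_posts, 2)
```

Upvotes is typed `int | None` in the tree above, which is why the sort key falls back to 0.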

`get_user_comments(user, sort=None, limit=None, time=None, after=None) → UserCommentsSent`

Fetches a user's comment history.

Parameter	Type	Default	Description
`user`	`str`	—	Reddit username (without `u/`)
`sort`	`str | None`	`None`	`"hot"`, `"new"`, `"top"`, `"controversial"`
`limit`	`int | None`	`None`	Number of comments to fetch
`time`	`str | None`	`None`	Time filter when using `"top"`
`after`	`str | None`	`None`	Comment ID to paginate from

Returns: UserCommentsSent

UserCommentsSent
├── Username      str
├── FirstID       str
├── LastID        str          ← use as `after` to paginate
├── TotalCount    int
└── Comments      List[Comment]
    ├── CommentID  str | None
    ├── Author     str | None
    ├── Subreddit  str | None
    ├── Body       str | None
    ├── ParentID   str | None
    ├── PostID     str | None
    ├── PostTitle  str | None
    ├── Link       str | None
    ├── Upvotes    int | None
    └── CreatedUtc float
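
As an example of working with this structure, the following sketch counts a user's comments per subreddit. It uses only the Subreddit field from the tree above; `comments_per_subreddit` is a hypothetical helper and the `SimpleNamespace` objects stand in for real comments.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable


def comments_per_subreddit(comments: Iterable[Any]) -> Counter:
    """Count comments by Subreddit, skipping entries where it is missing."""
    return Counter(c.Subreddit for c in comments if c.Subreddit)


# Stand-ins for items of rs.get_user_comments(...).Comments
demo_comments = [
    SimpleNamespace(Subreddit="python", Body="first"),
    SimpleNamespace(Subreddit="learnpython", Body="second"),
    SimpleNamespace(Subreddit="python", Body="third"),
    SimpleNamespace(Subreddit=None, Body="fourth"),
]
counts = comments_per_subreddit(demo_comments)
```

`counts.most_common(3)` would then list the three subreddits where the user comments most often.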

`get_stats() → dict`

Returns session statistics since init() was called.

{
    'calls': int,               # total successful API calls
    'rate_limit_hits': int,     # number of 429 responses received
    'total_wait_seconds': float # total time spent waiting on rate limits
}
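
The raw counters can be turned into a couple of derived figures when tuning a scraping loop. This is a small sketch over the dictionary shape shown above; `summarise_stats` and its output keys are hypothetical names, not part of the library.

```python
from typing import Dict, Union

Stats = Dict[str, Union[int, float]]


def summarise_stats(stats: Stats) -> Dict[str, float]:
    """Derive the average wait per 429 and the 429 rate per 100 calls."""
    calls = stats["calls"]
    hits = stats["rate_limit_hits"]
    waited = stats["total_wait_seconds"]
    return {
        # Guard against division by zero when nothing has happened yet.
        "avg_wait_per_hit": waited / hits if hits else 0.0,
        "hits_per_100_calls": 100.0 * hits / calls if calls else 0.0,
    }


# The example figures from the Rate Limiting section
summary = summarise_stats(
    {"calls": 250, "rate_limit_hits": 3, "total_wait_seconds": 780.0}
)
```

For those example figures, the average wait works out to 260 seconds per rate-limit hit.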

Pagination

Responses from get_home, get_user_posts, and get_user_comments include FirstID and LastID (get_comments does not paginate). Pass LastID as the after parameter to fetch the next page:

rs.init(user_agent="MyBot/1.0")

after = None
all_posts = []

while True:
    page = rs.get_home("python", limit=100, after=after)
    all_posts.extend(page.Posts)

    if page.TotalPosts < 100:
        break  # last page

    after = page.LastID
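
The same loop can be wrapped in a reusable generator that works for posts and comments alike. This is a sketch, not a library feature: `iter_items` is a hypothetical helper, the `max_pages` cap is an added safety assumption, and `fake_fetch` stands in for a real call so the example runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator, Optional


def iter_items(
    fetch_page: Callable[[Optional[str]], Any],
    items_attr: str = "Posts",
    page_size: int = 100,
    max_pages: int = 10,
) -> Iterator[Any]:
    """Yield items page by page, following LastID until a short page appears."""
    after: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(after)
        items = list(getattr(page, items_attr))
        yield from items
        # A short page or a missing LastID means there is nothing more to fetch.
        if len(items) < page_size or not page.LastID:
            break
        after = page.LastID


def fake_fetch(after: Optional[str]) -> SimpleNamespace:
    """Stand-in for a real call: serves three items in pages of two."""
    data = ["p1", "p2", "p3"]
    start = 0 if after is None else data.index(after) + 1
    chunk = data[start:start + 2]
    return SimpleNamespace(Posts=chunk, LastID=chunk[-1] if chunk else None)


collected = list(iter_items(fake_fetch, page_size=2))
```

With the real library, `iter_items(lambda after: rs.get_home("python", limit=100, after=after))` would walk a subreddit, and passing `items_attr="Comments"` would adapt it to `get_user_comments`.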

Architecture

Python (RedScrapsLib)
    │
    │  pythonnet
    ▼
C# .NET 10 Assembly (RedScrap.dll)
    ├── Scraper          — HttpClient, request logic
    ├── URLs             — URL builders for each endpoint
    ├── Receive (JSON)   — deserialisation models
    ├── Map              — raw → clean data mapping
    └── Sent             — clean data models returned to Python
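
For readers unfamiliar with pythonnet, the bridge in the diagram typically amounts to starting the .NET runtime and referencing an assembly. The sketch below shows that general pattern using pythonnet's `load` and `clr.AddReference`; it is not RedScrapsLib's actual loader, `load_assembly` is a hypothetical name, and the DLL location inside the installed package is not documented here.

```python
from pathlib import Path


def load_assembly(dll_path: str) -> None:
    """Start CoreCLR through pythonnet and reference a .NET assembly."""
    path = Path(dll_path)
    if not path.is_file():
        raise FileNotFoundError(f"assembly not found: {path}")

    # Imported lazily so this module can be imported without pythonnet installed.
    from pythonnet import load

    load("coreclr")  # select modern .NET rather than Mono or .NET Framework
    import clr  # provided by pythonnet once a runtime is available

    # An absolute path lets pythonnet load the file directly.
    clr.AddReference(str(path.resolve()))
```

After a successful call, the assembly's public namespaces can be imported like ordinary Python modules, which is how the C# types reach Python code.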

Release history

Version	Released
0.1.5 (this version)	May 11, 2026
0.1.4	May 4, 2026
0.1.3	May 4, 2026
0.1.2	May 3, 2026
0.1.1	Apr 30, 2026
0.1.0	Apr 29, 2026

Download files


Source Distributions

No source distribution files available for this release.

Built Distributions


redscrapslib-0.1.5-py3-none-manylinux2014_x86_64.whl (25.0 kB)

Uploaded May 11, 2026 Python 3

redscrapslib-0.1.5-py3-none-macosx_12_0_universal2.whl (25.0 kB)

Uploaded May 11, 2026 Python 3, macOS 12.0+ universal2 (ARM64, x86-64)

redscrapslib-0.1.5-cp313-cp313-win_amd64.whl (25.0 kB)

Uploaded May 11, 2026 CPython 3.13, Windows x86-64

File details

Details for the file redscrapslib-0.1.5-py3-none-manylinux2014_x86_64.whl.

File metadata

Download URL: redscrapslib-0.1.5-py3-none-manylinux2014_x86_64.whl
Upload date: May 11, 2026
Size: 25.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.6

File hashes

Hashes for redscrapslib-0.1.5-py3-none-manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`62b312a675000e161b165d2270598f2043323c938a177deb7ada1981033597d5`
MD5	`85987257b3ae296a349cfff712a5a6c5`
BLAKE2b-256	`ec7fcb23bb0ff78d2afb884b154d05e5c438435884175e24814c99a45f1ffb90`


File details

Details for the file redscrapslib-0.1.5-py3-none-macosx_12_0_universal2.whl.

File metadata

Download URL: redscrapslib-0.1.5-py3-none-macosx_12_0_universal2.whl
Upload date: May 11, 2026
Size: 25.0 kB
Tags: Python 3, macOS 12.0+ universal2 (ARM64, x86-64)
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.6

File hashes

Hashes for redscrapslib-0.1.5-py3-none-macosx_12_0_universal2.whl
Algorithm	Hash digest
SHA256	`8248166a298ba9ae3d0907d0d6fcda9eaa4e8f692e51767c8a1aea9859f2537b`
MD5	`3886cd6405bfc0bc7688ae53d3f47414`
BLAKE2b-256	`52966600e7b0ac5804ee8318a0a483125d7c8d7c0e4d6f720efdd94743c2173e`


File details

Details for the file redscrapslib-0.1.5-cp313-cp313-win_amd64.whl.

File metadata

Download URL: redscrapslib-0.1.5-cp313-cp313-win_amd64.whl
Upload date: May 11, 2026
Size: 25.0 kB
Tags: CPython 3.13, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.6

File hashes

Hashes for redscrapslib-0.1.5-cp313-cp313-win_amd64.whl
Algorithm	Hash digest
SHA256	`ecd7d1d64455da0d0ee5f38cea3fe1e06c08a778d0962f5e90fadaf8d3264c55`
MD5	`afab70789f4166faccf30953802844e4`
BLAKE2b-256	`50ba0ab124bc0ae83e7cceb6ff286d1e5cc0cc79864529d86355b946c45aa27c`

