Reddit Extraction and Data Dumper: a modern, async-ready library for extracting Reddit data without API keys.

🔴 REDD

Reddit Extraction and Data Dumper

A modern, async-ready Python library for extracting Reddit data, with no API keys required.


✨ Features

  • No API keys: uses Reddit's public .json endpoints
  • Sync & Async: choose Redd or AsyncRedd depending on your stack
  • Typed models: frozen dataclasses, not raw dicts
  • Hexagonal architecture: swap HTTP adapters freely
  • Auto-pagination: fetch hundreds of posts with a single call
  • User-Agent rotation: built-in rotation to reduce ban risk
  • Proxy support: pass a proxy URL and you're set
  • Throttling: configurable random sleep between paginated requests
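
To illustrate what "public .json endpoints" means: Reddit serves a JSON view of a listing page when `.json` is appended to its path. The sketch below builds such a URL with the standard library only. The helper name `listing_url` and its parameters are hypothetical, chosen for illustration; this is not a description of how REDD builds its requests internally.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.reddit.com"


def listing_url(subreddit: str, category: str = "hot", **params: object) -> str:
    """Build the public JSON URL for a subreddit listing (hypothetical helper)."""
    # Drop unset parameters so they do not appear as "key=None" in the query.
    query = urlencode({key: value for key, value in params.items() if value is not None})
    url = f"{BASE_URL}/r/{subreddit}/{category}.json"
    return f"{url}?{query}" if query else url


print(listing_url("Python", "top", limit=10, t="week"))
```

No request is sent here; the function only shows the URL shape that a keyless client would fetch.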

📦 Installation

pip install redd

For async support (uses httpx):

pip install redd[async]

Or with uv:

uv add redd

🚀 Quick Start

Sync

from redd import Redd, Category, TimeFilter

with Redd() as r:
    # Search Reddit
    results = r.search("Python programming", limit=5)
    for item in results:
        print(f"  {item.title} → {item.url}")

    # Fetch top posts from a subreddit
    posts = r.get_subreddit_posts(
        "Python",
        limit=10,
        category=Category.TOP,
        time_filter=TimeFilter.WEEK,
    )
    for post in posts:
        print(f"  [{post.score:>5}] {post.title}")

    # Get full post details with comments
    detail = r.get_post("/r/Python/comments/abc123/example_post/")
    print(f"  {detail.title} - {len(detail.comments)} comments")

    # Scrape user activity
    items = r.get_user("spez", limit=10)
    for item in items:
        print(f"  [{item.kind}] {item.title or item.body[:80]}")

Async

import asyncio
from redd import AsyncRedd

async def main():
    async with AsyncRedd() as r:
        results = await r.search("machine learning", limit=5)
        for item in results:
            print(item.title)

asyncio.run(main())
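
One reason to choose the async client is to run several requests concurrently. The sketch below shows that pattern with `asyncio.gather`. To stay runnable without network access it uses `fake_search`, a hypothetical stand-in for an awaitable such as `AsyncRedd.search`; the helper `search_many` is likewise an illustration and not part of the library.

```python
import asyncio


async def fake_search(query: str, limit: int = 5) -> list[str]:
    """Hypothetical stand-in for an awaitable search call; performs no I/O."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [f"{query} result {i}" for i in range(limit)]


async def search_many(queries: list[str]) -> dict[str, list[str]]:
    """Run one search per query concurrently and map each query to its results."""
    # gather() preserves argument order, so zip() pairs queries with results.
    results = await asyncio.gather(*(fake_search(q, limit=2) for q in queries))
    return dict(zip(queries, results))


results = asyncio.run(search_many(["python", "rust"]))
for query, titles in results.items():
    print(query, titles)
```

With a real client, the same `gather` call would sit inside the `async with` block so that all requests share one HTTP session.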

📖 API Reference

Clients

Class      Description
Redd       Synchronous client (uses requests)
AsyncRedd  Asynchronous client (uses httpx)

Both support context managers and share the exact same API surface:

Methods

Method                                                             Description
search(query, *, limit, sort, after, before)                       Search all of Reddit
search_subreddit(subreddit, query, *, limit, sort, after, before)  Search within a subreddit
get_post(permalink)                                                Get full post details + comment tree
get_user(username, *, limit)                                       Get a user's recent activity
get_subreddit_posts(subreddit, *, limit, category, time_filter)    Fetch subreddit listings
get_user_posts(username, *, limit, category, time_filter)          Fetch a user's submitted posts
download_image(image_url, *, output_dir)                           Download an image
close()                                                            Release HTTP resources
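
The `limit` and `after` parameters reflect how Reddit listings are paged: each JSON page carries its items under `data.children` and a cursor under `data.after`. The sketch below shows one way cursor-following pagination can work. `paginate`, `fake_fetch`, and `FAKE_PAGES` are hypothetical names, and the fake pages replace real HTTP calls; how REDD implements auto-pagination is not documented here.

```python
from typing import Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict], limit: int) -> Iterator[dict]:
    """Yield up to `limit` items, following each page's `after` cursor."""
    after: Optional[str] = None
    yielded = 0
    while yielded < limit:
        data = fetch_page(after)["data"]
        for child in data["children"]:
            yield child["data"]
            yielded += 1
            if yielded >= limit:
                return
        after = data.get("after")
        # Stop when the listing is exhausted (no cursor or an empty page).
        if after is None or not data["children"]:
            return


# Stand-in for HTTP: three fake pages of two posts each, keyed by cursor.
FAKE_PAGES = {
    None: {"data": {"children": [{"data": {"id": "a"}}, {"data": {"id": "b"}}], "after": "t3_b"}},
    "t3_b": {"data": {"children": [{"data": {"id": "c"}}, {"data": {"id": "d"}}], "after": "t3_d"}},
    "t3_d": {"data": {"children": [{"data": {"id": "e"}}, {"data": {"id": "f"}}], "after": None}},
}


def fake_fetch(after: Optional[str]) -> dict:
    return FAKE_PAGES[after]


ids = [post["id"] for post in paginate(fake_fetch, limit=5)]
print(ids)
```

Because `paginate` is a generator, a caller that stops iterating early also stops further page fetches.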

Models

Model          Fields
SearchResult   title, url, description, subreddit
PostDetail     title, author, body, score, url, subreddit, created_utc, num_comments, comments
Comment        author, body, score, replies
SubredditPost  title, author, permalink, score, num_comments, created_utc, subreddit, url, image_url, thumbnail_url
UserItem       kind, subreddit, url, created_utc, title, body
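
As an illustration of what a frozen dataclass gives you over a raw dict, here is a sketch of a `Comment`-like model with nested replies. The field names come from the table above; the field types, the tuple default for `replies`, and the `count_comments` helper are assumptions made for this example and may differ from the library's actual definitions.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Iterable


@dataclass(frozen=True)
class Comment:
    """Illustrative model; types are assumed, not taken from the library."""

    author: str
    body: str
    score: int
    replies: tuple["Comment", ...] = ()


def count_comments(comments: Iterable[Comment]) -> int:
    """Count every comment in a tree, including nested replies (hypothetical helper)."""
    return sum(1 + count_comments(c.replies) for c in comments)


reply = Comment(author="bob", body="Agreed.", score=3)
root = Comment(author="alice", body="Nice post!", score=10, replies=(reply,))
other = Comment(author="carol", body="Thanks for sharing.", score=1)

print(count_comments([root, other]))

try:
    reply.score = 99  # frozen instances reject attribute assignment
except FrozenInstanceError:
    print("Comment is immutable")
```

Frozen instances are also hashable when all their fields are, which makes them usable as set members or dict keys.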

Enums

Enum          Values
Category      HOT, TOP, NEW, RISING
UserCategory  HOT, TOP, NEW
TimeFilter    HOUR, DAY, WEEK, MONTH, YEAR, ALL
SortOrder     RELEVANCE, HOT, TOP, NEW, COMMENTS
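
Enumerations like these typically map to the lowercase strings Reddit accepts in query parameters (`sort` and `t`). The sketch below redefines two of them locally to show that mapping. The member values and the `search_params` helper are assumptions for illustration; the library's own enum values are not shown in this document.

```python
from enum import Enum


class TimeFilter(str, Enum):
    """Local illustration; values assumed to match Reddit's `t` parameter."""

    HOUR = "hour"
    DAY = "day"
    WEEK = "week"
    MONTH = "month"
    YEAR = "year"
    ALL = "all"


class SortOrder(str, Enum):
    """Local illustration; values assumed to match Reddit's `sort` parameter."""

    RELEVANCE = "relevance"
    HOT = "hot"
    TOP = "top"
    NEW = "new"
    COMMENTS = "comments"


def search_params(
    query: str,
    sort: SortOrder = SortOrder.RELEVANCE,
    time_filter: TimeFilter = TimeFilter.ALL,
) -> dict[str, str]:
    """Translate typed arguments into plain query parameters (hypothetical helper)."""
    return {"q": query, "sort": sort.value, "t": time_filter.value}


print(search_params("python", SortOrder.TOP, TimeFilter.WEEK))
```

Passing an enum member instead of a bare string lets a type checker or an early `ValueError` catch typos such as "weak" before any request is made.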

Configuration

r = Redd(
    proxy="http://user:pass@host:port",  # optional proxy
    timeout=15.0,                        # request timeout (seconds)
    rotate_user_agent=True,              # rotate UA per request
    throttle=(1.0, 3.0),                 # random sleep range between pages
)
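
To make the `throttle` and `rotate_user_agent` options concrete, the sketch below shows one plausible behavior between paginated requests: pick a random User-Agent per request and sleep for a random duration within the configured range. The names `USER_AGENTS`, `pick_user_agent`, and `throttle_sleep` are hypothetical, the agent strings are placeholders, and the sleep function is injected so the example finishes instantly.

```python
import random
import time
from typing import Callable, Sequence

# Placeholder strings for illustration, not real browser User-Agents.
USER_AGENTS = ("example-agent/1.0", "example-agent/2.0", "example-agent/3.0")


def pick_user_agent(pool: Sequence[str] = USER_AGENTS) -> str:
    """Choose a User-Agent at random for the next request."""
    return random.choice(pool)


def throttle_sleep(
    bounds: tuple[float, float],
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random duration within `bounds` and return the delay used."""
    low, high = bounds
    delay = random.uniform(low, high)
    sleep(delay)
    return delay


# Record the delays instead of sleeping so the demo runs instantly.
recorded: list[float] = []
for page in range(3):
    headers = {"User-Agent": pick_user_agent()}
    throttle_sleep((1.0, 3.0), sleep=recorded.append)

print(recorded)
```

Randomizing the delay, rather than sleeping a fixed interval, makes the request timing less regular.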

🏗️ Architecture

REDD follows hexagonal architecture (ports & adapters):

src/redd/
├── __init__.py           # Public API
├── _client.py            # Sync client (Redd)
├── _async_client.py      # Async client (AsyncRedd)
├── _parsing.py           # JSON → domain models (I/O-free)
├── _exceptions.py        # Error hierarchy
│
├── domain/               # Pure domain layer
│   ├── models.py         # Frozen dataclasses
│   └── enums.py          # Type-safe enumerations
│
├── ports/                # Abstract interfaces
│   └── http.py           # HttpPort & AsyncHttpPort protocols
│
└── adapters/             # Concrete implementations
    ├── http_sync.py      # requests-based adapter
    └── http_async.py     # httpx-based adapter
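
The ports-and-adapters idea can be sketched in a few lines: the client depends only on an abstract port, so any object with the right method can be plugged in, including an in-memory fake for tests. The protocol name `HttpPort` appears in the tree above, but the method `get_json`, the `FakeHttpAdapter` class, and `TinyClient` are hypothetical and only illustrate the pattern; the library's real protocol signatures are not shown in this document.

```python
from typing import Any, Protocol


class HttpPort(Protocol):
    """Port: the only HTTP capability the client relies on (assumed signature)."""

    def get_json(self, url: str) -> Any: ...


class FakeHttpAdapter:
    """Adapter that serves canned responses instead of making requests."""

    def __init__(self, responses: dict[str, Any]) -> None:
        self._responses = responses
        self.calls: list[str] = []

    def get_json(self, url: str) -> Any:
        self.calls.append(url)
        return self._responses[url]


class TinyClient:
    """Minimal client that talks to Reddit only through the port."""

    def __init__(self, http: HttpPort) -> None:
        self._http = http

    def post_titles(self, subreddit: str) -> list[str]:
        payload = self._http.get_json(f"https://www.reddit.com/r/{subreddit}/hot.json")
        return [child["data"]["title"] for child in payload["data"]["children"]]


canned = {
    "https://www.reddit.com/r/Python/hot.json": {
        "data": {"children": [{"data": {"title": "Hello"}}, {"data": {"title": "World"}}]}
    }
}
adapter = FakeHttpAdapter(canned)
client = TinyClient(adapter)
print(client.post_titles("Python"))
```

Because `HttpPort` is a structural protocol, `FakeHttpAdapter` satisfies it without inheriting from it, which is what makes adapters swappable.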

⚠️ Disclaimer

Use responsibly. Reddit may rate-limit or ban IPs that make excessive requests. Consider using rotating proxies for large-scale scraping.

📄 License

MIT © Elias Biondo