Reddit Extraction and Data Dumper — a modern, async-ready library for extracting Reddit data without API keys.

REDD

Reddit Extraction and Data Dumper

A modern, async-ready Python library for extracting Reddit data. No API keys required.

Features
Installation
Quick Start
API Reference
Architecture
Examples
Contributing
Disclaimer
License

1. Features

No API keys — uses Reddit's public .json endpoints.
Sync and async — choose Redd or AsyncRedd depending on your stack.
Typed models — frozen dataclasses instead of raw dictionaries.
Hexagonal architecture — swap HTTP adapters without touching business logic.
Auto-pagination — fetch hundreds of posts with a single call.
User-Agent rotation — built-in rotation to reduce ban risk.
Proxy support — pass a proxy URL and scrape at scale.
Throttling — configurable random sleep between paginated requests.
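
The "no API keys" feature rests on Reddit serving a listing as JSON when `.json` is appended to its URL. The sketch below shows how such a URL could be assembled with the standard library. `build_listing_url` and its parameters are hypothetical names used for illustration only; they are not part of REDD's API, and REDD's own URL construction may differ.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://www.reddit.com"


def build_listing_url(
    subreddit: str,
    category: str = "hot",
    *,
    limit: int = 25,
    time_filter: Optional[str] = None,
    after: Optional[str] = None,
) -> str:
    """Build a public .json listing URL (hypothetical helper, not REDD's API)."""
    params = {"limit": limit}
    if time_filter is not None:
        params["t"] = time_filter  # Reddit's query parameter for the time window
    if after is not None:
        params["after"] = after    # cursor pointing at the next page
    return f"{BASE_URL}/r/{subreddit}/{category}.json?{urlencode(params)}"


print(build_listing_url("Python", "top", limit=10, time_filter="week"))
# https://www.reddit.com/r/Python/top.json?limit=10&t=week
```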

2. Installation

With uv (recommended):

uv add redd

With pip:

pip install redd

For async support (requires httpx):

uv add redd httpx

3. Quick Start

3.1. Synchronous usage

from redd import Redd, Category, TimeFilter

with Redd() as r:
    # Search Reddit
    results = r.search("Python programming", limit=5)
    for item in results:
        print(f"  {item.title}")

    # Fetch top posts from a subreddit
    posts = r.get_subreddit_posts(
        "Python",
        limit=10,
        category=Category.TOP,
        time_filter=TimeFilter.WEEK,
    )
    for post in posts:
        print(f"  [{post.score:>5}] {post.title}")

    # Get full post details with comments
    detail = r.get_post("/r/Python/comments/abc123/example_post/")
    print(f"  {detail.title} -- {len(detail.comments)} comments")

    # Scrape user activity
    items = r.get_user("spez", limit=10)
    for item in items:
        print(f"  [{item.kind}] {item.title or item.body[:80]}")

3.2. Asynchronous usage

import asyncio
from redd import AsyncRedd

async def main():
    async with AsyncRedd() as r:
        results = await r.search("machine learning", limit=5)
        for item in results:
            print(item.title)

asyncio.run(main())

3.3. Configuration

from redd import Redd

r = Redd(
    proxy="http://user:pass@host:port",  # Optional proxy URL
    timeout=15.0,                        # Request timeout in seconds
    rotate_user_agent=True,              # Rotate the User-Agent on each request
    throttle=(1.0, 3.0),                 # Random sleep range (seconds) between pages
)
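
One plausible reading of `throttle=(1.0, 3.0)` is that the client sleeps for a random duration drawn from that range between paginated requests. The sketch below illustrates that idea under this assumption; `throttle_sleep` is a hypothetical helper, and REDD's actual implementation is not shown in this document.

```python
import random
import time
from typing import Callable, Tuple


def throttle_sleep(
    throttle: Tuple[float, float],
    *,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random duration within the (low, high) range and return it.

    Hypothetical helper; the `sleep` parameter exists so tests can skip the wait.
    """
    low, high = throttle
    delay = random.uniform(low, high)
    sleep(delay)
    return delay


# Stub out the real sleep so the example returns immediately.
delay = throttle_sleep((1.0, 3.0), sleep=lambda seconds: None)
print(1.0 <= delay <= 3.0)  # True
```

Randomizing the delay, rather than sleeping a fixed interval, makes the request pattern less regular.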

4. API Reference

4.1. Clients

Class	Description
`Redd`	Synchronous client (`requests`)
`AsyncRedd`	Asynchronous client (`httpx`)

Both clients work as context managers (`with` for `Redd`, `async with` for `AsyncRedd`) and expose the same methods; on `AsyncRedd` each call is awaited.

4.2. Methods

Method	Description
`search(query, *, limit, sort, after, before)`	Search all of Reddit
`search_subreddit(subreddit, query, *, limit, sort, after, before)`	Search within a subreddit
`get_post(permalink)`	Get full post details and comment tree
`get_user(username, *, limit)`	Get a user's recent activity
`get_subreddit_posts(subreddit, *, limit, category, time_filter)`	Fetch subreddit listings
`get_user_posts(username, *, limit, category, time_filter)`	Fetch a user's submitted posts
`download_image(image_url, *, output_dir)`	Download an image
`close()`	Release HTTP resources
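
The `limit` and `after` parameters above correspond to Reddit's cursor-based listings: each JSON page carries its items under `data.children` and a cursor for the next page under `data.after`. The sketch below shows how auto-pagination over such pages could work. `paginate` and `fake_fetch` are hypothetical names, and the fake pages stand in for real HTTP responses; this is not REDD's internal code.

```python
from typing import Callable, List, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict], limit: int) -> List[dict]:
    """Follow the `after` cursor until `limit` items are collected or pages run out."""
    items: List[dict] = []
    after: Optional[str] = None
    while len(items) < limit:
        data = fetch_page(after)["data"]
        items.extend(child["data"] for child in data["children"])
        after = data.get("after")
        if after is None or not data["children"]:
            break  # last page reached
    return items[:limit]


def fake_fetch(after: Optional[str]) -> dict:
    """Return canned pages shaped like a Reddit listing (test double)."""
    pages = {
        None: {"data": {"children": [{"data": {"id": "a"}}, {"data": {"id": "b"}}],
                        "after": "t3_b"}},
        "t3_b": {"data": {"children": [{"data": {"id": "c"}}], "after": None}},
    }
    return pages[after]


posts = paginate(fake_fetch, limit=3)
print([post["id"] for post in posts])  # ['a', 'b', 'c']
```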

4.3. Models

All models are frozen dataclasses.

Model	Fields
`SearchResult`	`title`, `url`, `description`, `subreddit`
`PostDetail`	`title`, `author`, `body`, `score`, `url`, `subreddit`, `created_utc`, `num_comments`, `comments`
`Comment`	`author`, `body`, `score`, `replies`
`SubredditPost`	`title`, `author`, `permalink`, `score`, `num_comments`, `created_utc`, `subreddit`, `url`, `image_url`, `thumbnail_url`
`UserItem`	`kind`, `subreddit`, `url`, `created_utc`, `title`, `body`
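
To illustrate what a frozen dataclass means in practice, the sketch below defines a stand-in `Comment` with the four fields listed above. The field types, the default for `replies`, and the `count_comments` helper are assumptions made for this example; REDD's real definitions may differ.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Comment:
    """Stand-in model; field names follow the table, types are assumed."""
    author: str
    body: str
    score: int
    replies: Tuple["Comment", ...] = ()


def count_comments(comments: Iterable[Comment]) -> int:
    """Count comments in a tree, including nested replies (hypothetical helper)."""
    return sum(1 + count_comments(comment.replies) for comment in comments)


reply = Comment(author="bob", body="Agreed", score=3)
top = Comment(author="alice", body="Nice post", score=10, replies=(reply,))

try:
    top.score = 99  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    print("models are read-only")

print(count_comments([top]))  # 2
```

Because instances cannot be mutated after creation, they can be shared between threads or tasks without defensive copying.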

4.4. Enums

Enum	Values
`Category`	`HOT`, `TOP`, `NEW`, `RISING`
`UserCategory`	`HOT`, `TOP`, `NEW`
`TimeFilter`	`HOUR`, `DAY`, `WEEK`, `MONTH`, `YEAR`, `ALL`
`SortOrder`	`RELEVANCE`, `HOT`, `TOP`, `NEW`, `COMMENTS`

4.5. Exceptions

Exception	Description
`ReddError`	Base exception for all REDD errors
`HttpError`	HTTP request failed after retries
`ParseError`	Reddit's JSON could not be parsed into domain models
`NotFoundError`	Requested resource does not exist
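
Since `ReddError` is described as the base exception, callers can handle a specific failure first and let everything else surface as the base type. The sketch below uses local stand-in classes that mirror the table; the subclass relationships and the `fetch_post_or_none` helper are assumptions for illustration, not code taken from REDD.

```python
class ReddError(Exception):
    """Stand-in for REDD's base exception."""


class HttpError(ReddError):
    """Stand-in: HTTP request failed after retries."""


class NotFoundError(ReddError):
    """Stand-in: requested resource does not exist."""


def fetch_post_or_none(fetch, permalink):
    """Return the post, or None if it does not exist; other errors propagate."""
    try:
        return fetch(permalink)
    except NotFoundError:
        return None


def missing(permalink):
    raise NotFoundError(permalink)


def flaky(permalink):
    raise HttpError("request failed after retries")


print(fetch_post_or_none(missing, "/r/Python/comments/abc123/example_post/"))  # None

try:
    fetch_post_or_none(flaky, "/r/Python/comments/abc123/example_post/")
except ReddError as exc:  # catches any REDD-specific failure
    print(type(exc).__name__)  # HttpError
```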

5. Architecture

REDD follows hexagonal architecture (ports and adapters), separating business logic from I/O concerns:

graph LR
    subgraph Public API
        A["Redd (sync)"]
        B["AsyncRedd (async)"]
    end

    subgraph Core
        C["Parsing Layer"]
        D["Domain Models"]
        E["Enums"]
    end

    subgraph Ports
        F["HttpPort"]
        G["AsyncHttpPort"]
    end

    subgraph Adapters
        H["RequestsHttpAdapter"]
        I["HttpxAsyncAdapter"]
    end

    A --> C
    B --> C
    C --> D
    C --> E
    A --> F
    B --> G
    F -.implements.-> H
    G -.implements.-> I
    H --> J["reddit.com"]
    I --> J

Directory layout

src/redd/
├── __init__.py           # Public API surface
├── _client.py            # Sync client (Redd)
├── _async_client.py      # Async client (AsyncRedd)
├── _parsing.py           # JSON to domain model parsing (I/O-free)
├── _exceptions.py        # Error hierarchy
│
├── domain/               # Pure domain layer
│   ├── models.py         # Frozen dataclasses
│   └── enums.py          # Type-safe enumerations
│
├── ports/                # Abstract interfaces
│   └── http.py           # HttpPort and AsyncHttpPort protocols
│
└── adapters/             # Concrete implementations
    ├── http_sync.py      # requests-based adapter
    └── http_async.py     # httpx-based adapter

The parsing module has no I/O dependencies. Clients interact with the HTTP layer exclusively through protocol-based ports, making it straightforward to swap adapters, mock dependencies in tests, or add new transports.
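
The benefit of protocol-based ports is easiest to see in a test. The sketch below defines a port as a `typing.Protocol` and substitutes an in-memory adapter for the real HTTP layer. The `get_json` method, `FakeHttpAdapter`, and `TinyClient` are hypothetical; only the name `HttpPort` comes from REDD, and its real method signatures are not documented here.

```python
from typing import Any, Dict, List, Protocol


class HttpPort(Protocol):
    """Port: anything that can fetch JSON for a URL (assumed method name)."""

    def get_json(self, url: str) -> Any: ...


class FakeHttpAdapter:
    """In-memory adapter for tests; satisfies HttpPort structurally."""

    def __init__(self, responses: Dict[str, Any]) -> None:
        self._responses = responses
        self.requested: List[str] = []

    def get_json(self, url: str) -> Any:
        self.requested.append(url)
        return self._responses[url]


class TinyClient:
    """Toy client that depends only on the port, never on a concrete adapter."""

    def __init__(self, http: HttpPort) -> None:
        self._http = http

    def hot_titles(self, subreddit: str) -> List[str]:
        payload = self._http.get_json(f"https://www.reddit.com/r/{subreddit}/hot.json")
        return [child["data"]["title"] for child in payload["data"]["children"]]


fake = FakeHttpAdapter({
    "https://www.reddit.com/r/Python/hot.json": {
        "data": {"children": [{"data": {"title": "Hello"}}]}
    }
})
print(TinyClient(fake).hot_titles("Python"))  # ['Hello']
```

No network access is involved, so a test built this way runs quickly and deterministically.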

6. Examples

See the examples/ directory for runnable scripts.

Fetch hot posts from a subreddit (subreddit_hot_posts.py):

from redd import Category, Redd

with Redd() as r:
    posts = r.get_subreddit_posts("brdev", limit=10, category=Category.HOT)

    for i, post in enumerate(posts, 1):
        print(f"{i:>2}. [{post.score:>5}] {post.title}")
        print(f"     by u/{post.author} — {post.num_comments} comments")
        print(f"     {post.url}")
        print()

Sample output:

 1. [   91] Qual o plano B de vocês caso a área piore muito?
     by u/Spiritual_Pangolin18 — 185 comments
     https://www.reddit.com/r/brdev/comments/1rnytuh/...

 2. [   83] Fuçando minhas coisas, encontrei um código de 600 linhas em Portugol
     by u/Dramatic-Revenue-802 — 7 comments
     https://www.reddit.com/r/brdev/comments/1ro269a/...

7. Contributing

Contributions are welcome. Please read CONTRIBUTING.md for guidelines on setting up the project, running tests, and submitting changes.

8. Disclaimer

Use responsibly. Reddit may rate-limit or ban IPs that make excessive requests. Consider using rotating proxies for large-scale scraping.

9. License

MIT. See LICENSE for details.

Release history

0.2.0 (this version): Mar 10, 2026
0.1.2: Mar 9, 2026
0.1.0: Mar 9, 2026

Download files

Source Distribution

redd-0.2.0.tar.gz (48.4 kB), uploaded Mar 10, 2026, Source

Built Distribution

redd-0.2.0-py3-none-any.whl (21.9 kB), uploaded Mar 10, 2026, Python 3

File details

Details for the file redd-0.2.0.tar.gz.

File metadata

Download URL: redd-0.2.0.tar.gz
Upload date: Mar 10, 2026
Size: 48.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for redd-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`e25f3cd0cfa2405b5cacdacdff116e1e8204a217afad228c0a6f3e311f656154`
MD5	`c1b48d5d505af8be39017297909fb680`
BLAKE2b-256	`bd16eab8755597bfbccc097e5d2de164e4c2b34620724376b4bf3ebb2220fe3c`

File details

Details for the file redd-0.2.0-py3-none-any.whl.

File metadata

Download URL: redd-0.2.0-py3-none-any.whl
Upload date: Mar 10, 2026
Size: 21.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for redd-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f1b8d0aeaa43890d58c0ea05c50bd0e27b93eb6da977259f4faf8ff95887ef3d`
MD5	`4e57e5b6f9f099f3db21db3f5b0cad09`
BLAKE2b-256	`8b97f8c3533e2d88c1609f8302a326bd7f2f34879fb7cd3755cd5e5789c6ea94`
