Reddit Flair Analyzer

A professional tool for analyzing which post flairs have the highest chance of going viral in a given subreddit. This package helps content creators, marketers, and researchers understand flair performance so they can optimize their content strategy.

🚀 Features

  • Comprehensive Scraping: Collect thousands of posts from any public subreddit
  • Advanced Analytics: Calculate viral rates, engagement metrics, and performance statistics
  • Beautiful Visualizations: Generate publication-ready charts and interactive dashboards
  • Flexible Export: Save results in multiple formats (CSV, JSON, Excel)
  • Detailed Logging: Track progress with configurable logging levels
  • Command-line Interface: Run analyses without writing code
  • Optimized Performance: Multi-threaded scraping for faster data collection

🔧 Installation

# Install from PyPI
pip install reddit-flair-analyzer

# Or install from the repository
git clone https://github.com/themanojdesai/reddit-flair-analyzer.git
cd reddit-flair-analyzer
pip install -e .

The package is available on PyPI: https://pypi.org/project/reddit-flair-analyzer/

📖 Quick Start

Using the CLI

The CLI provides a user-friendly way to analyze Reddit flairs without writing code. You'll need to provide your Reddit API credentials (either as arguments or environment variables).

# Basic analysis with credentials
reddit-analyze --subreddit AskReddit --client-id YOUR_CLIENT_ID --client-secret YOUR_CLIENT_SECRET

# Using environment variables for credentials
export REDDIT_CLIENT_ID="your_client_id"
export REDDIT_CLIENT_SECRET="your_client_secret"
reddit-analyze --subreddit datascience

# Specify number of posts and time frame
reddit-analyze --subreddit wallstreetbets --posts 500 --timeframe month

# Set viral threshold (percentile)
reddit-analyze --subreddit science --threshold 95  # Top 5% are considered viral

# Export results to different formats
reddit-analyze --subreddit movies --export excel
reddit-analyze --subreddit gaming --export json

# Save results to custom directory
reddit-analyze --subreddit programming --output ./my_analysis

# Disable auto-opening dashboard in browser
reddit-analyze --subreddit python --auto-open false

# Get verbose output for debugging
reddit-analyze --subreddit learnpython --verbose --log-file

# Show version information
reddit-analyze --version

# Get help with all available options
reddit-analyze --help

CLI Options

Option                 Description                                   Default
--subreddit, -s        Subreddit to analyze                          (required)
--client-id, -c        Reddit API client ID                          (Environment variable)
--client-secret, -cs   Reddit API client secret                      (Environment variable)
--user-agent, -u       Reddit API user agent                         "Reddit Flair Analyzer CLI v1.0"
--posts, -p            Maximum posts to retrieve                     500
--timeframe, -t        Time filter (all, day, week, month, year)     all
--threshold, -th       Viral threshold percentile (50-99)            90
--output, -o           Output directory for results                  ./results
--export, -e           Export format (csv, excel, json)              csv
--interactive, -i      Generate interactive visualizations           True
--auto-open, -a        Open dashboard in browser                     True
--verbose, -v          Enable verbose output                         False
--log-file             Enable logging to file                        False
--version              Show version information
--help                 Show help message
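
The CLI accepts credentials either as arguments or through the REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET environment variables. The sketch below illustrates one way such a lookup could work, with explicit values taking precedence over the environment. The function name resolve_credentials and the precedence order are assumptions made for illustration; they are not taken from the package's source.

```python
import os


def resolve_credentials(client_id=None, client_secret=None, env=None):
    """Return (client_id, client_secret), preferring explicit values.

    Hypothetical helper for illustration; not part of reddit-flair-analyzer.
    """
    # Allow a custom mapping to be passed in so the lookup is easy to test.
    env = os.environ if env is None else env

    resolved_id = client_id or env.get("REDDIT_CLIENT_ID")
    resolved_secret = client_secret or env.get("REDDIT_CLIENT_SECRET")

    # Collect the names of any credentials that are still unset.
    missing = [
        name
        for name, value in (
            ("REDDIT_CLIENT_ID", resolved_id),
            ("REDDIT_CLIENT_SECRET", resolved_secret),
        )
        if not value
    ]
    if missing:
        raise ValueError("Missing Reddit credentials: " + ", ".join(missing))
    return resolved_id, resolved_secret
```

Failing early with the names of the missing variables tends to be easier to debug than an authentication error returned later by the Reddit API.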

Using the Python API

The Python API gives you programmatic access to all analysis features with more flexibility:

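import os

from redditflairanalyzer import RedditAnalyzer

# Initialize the analyzer with your Reddit API credentials.
# Values are read from the environment, falling back to placeholders.
analyzer = RedditAnalyzer(
    client_id=os.environ.get("REDDIT_CLIENT_ID", "YOUR_CLIENT_ID"),
    client_secret=os.environ.get("REDDIT_CLIENT_SECRET", "YOUR_CLIENT_SECRET"),
    user_agent="MyApp/1.0 (by /u/YourUsername)"
)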

# Analyze a subreddit
results = analyzer.analyze_subreddit(
    subreddit="datascience",    # Subreddit name (without r/ prefix)
    post_limit=1000,            # Number of posts to analyze
    time_filter="year",         # Time filter (all, day, week, month, year)
    viral_threshold=90          # Percentile to consider viral (top 10%)
)

# Generate visualizations
visualization_files = analyzer.visualize(
    results,
    output_dir="./results/datascience",
    plot_types=["bar", "heatmap", "scatter", "dashboard"],  # Specific plots to generate
    interactive=True            # Create interactive HTML visualizations
)

# Export results to different formats
analyzer.export(results, format="excel", filename="datascience_analysis.xlsx")
analyzer.export(results, format="csv", filename="datascience_analysis.csv")
analyzer.export(results, format="json", filename="datascience_analysis.json")

# Access analysis results directly
flair_stats = results["flair_stats"]
posts_df = results["posts_df"]
viral_threshold = results["viral_threshold"]
metrics = results["metrics"]

# Print top 5 flairs by viral rate
top_flairs = flair_stats.head(5)
print("\nTop flairs by viral rate:")
for i, (_, row) in enumerate(top_flairs.iterrows(), 1):
    print(f"{i}. {row['flair']}: {row['viral_rate']:.1%} viral rate ({row['total_posts']} posts)")

# Find optimal posting time
if "created_utc" in posts_df.columns:
    posts_df["hour"] = posts_df["created_utc"].dt.hour
    hour_stats = posts_df.groupby("hour")["score"].mean()
    best_hour = hour_stats.idxmax()
    print(f"\nBest hour to post: {best_hour}:00 UTC (Avg score: {hour_stats.max():.1f})")

Main API Components

RedditAnalyzer Class

The main entry point for the package:

import logging

analyzer = RedditAnalyzer(
    client_id,              # Reddit API client ID
    client_secret,          # Reddit API client secret
    user_agent,             # Reddit API user agent
    log_level=logging.INFO  # Optional logging level
)

analyze_subreddit() Method

Analyzes a subreddit to find which flairs have the highest viral potential:

results = analyzer.analyze_subreddit(
    subreddit,              # Name of the subreddit
    post_limit=1000,        # Maximum posts to retrieve
    time_filter="all",      # Time filter (all, day, week, month, year)
    viral_threshold=90      # Percentile to consider viral
)
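
To make the viral_threshold percentile concrete, here is a minimal sketch of how a percentile cutoff can be turned into a per-flair viral rate. It uses only the standard library and is not the package's implementation. The helper name viral_rates, the use of post score as the virality measure, and the "score >= cutoff" rule are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def viral_rates(posts, viral_threshold=90):
    """Return (cutoff, stats) for an iterable of (flair, score) pairs.

    Hypothetical helper for illustration; not part of reddit-flair-analyzer.
    Needs at least two posts and a viral_threshold between 1 and 99.
    """
    posts = list(posts)
    scores = [score for _, score in posts]

    # quantiles(n=100) returns the 1st..99th percentile cut points,
    # so index (p - 1) holds the p-th percentile.
    cutoff = quantiles(scores, n=100, method="inclusive")[viral_threshold - 1]

    totals = defaultdict(int)
    virals = defaultdict(int)
    for flair, score in posts:
        totals[flair] += 1
        if score >= cutoff:  # assumption: posts at or above the cutoff are viral
            virals[flair] += 1

    stats = {
        flair: {
            "total_posts": totals[flair],
            "viral_rate": virals[flair] / totals[flair],
        }
        for flair in totals
    }
    return cutoff, stats


sample_posts = [
    ("News", 10), ("News", 20),
    ("Meme", 30), ("Meme", 1000),
    ("Question", 5),
]
cutoff, stats = viral_rates(sample_posts, viral_threshold=80)
print(cutoff, stats["Meme"]["viral_rate"])
```

With a threshold of 90, roughly the top 10% of posts by score count as viral, and a flair's viral rate is the share of its posts that clear that cutoff.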

visualize() Method

Generates visualizations from analysis results:

files = analyzer.visualize(
    results,                # Results from analyze_subreddit()
    output_dir="./results", # Directory to save visualizations
    plot_types=None,        # Types of plots (None = all)
    interactive=True        # Whether to create interactive visualizations
)

Available plot types: 'bar', 'heatmap', 'scatter', 'bubble', 'time', 'distribution', 'dashboard'

export() Method

Exports analysis results to file:

path = analyzer.export(
    results,                # Results from analyze_subreddit()
    format="csv",           # Export format (csv, excel, json)
    filename=None           # Custom filename (None = auto-generated)
)
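
For readers who want to see what a format switch with an auto-generated filename might look like, the following is a rough standard-library sketch. It is not the package's export code. The helper export_rows, the default filename flair_stats.<format>, and the list-of-dicts input are assumptions; Excel output is left out because it needs a third-party library.

```python
import csv
import json
from pathlib import Path


def export_rows(rows, format="csv", filename=None, output_dir="."):
    """Write a list of dict rows to CSV or JSON and return the written path.

    Hypothetical helper for illustration; not part of reddit-flair-analyzer.
    """
    if format not in ("csv", "json"):
        raise ValueError(f"Unsupported format: {format!r}")

    # Fall back to an auto-generated name when no filename is given.
    path = Path(output_dir) / (filename or f"flair_stats.{format}")
    path.parent.mkdir(parents=True, exist_ok=True)

    if format == "csv":
        # Use the keys of the first row as the CSV header.
        fieldnames = list(rows[0].keys()) if rows else []
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    else:
        with path.open("w", encoding="utf-8") as handle:
            json.dump(rows, handle, indent=2)
    return path
```

Returning the path mirrors the shape of the export() call above, where the result is assigned to path.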

📝 Documentation

For detailed documentation, see the GitHub repository: https://github.com/themanojdesai/reddit-flair-analyzer

🔒 Authentication

You need to create a Reddit application to get API credentials:

  1. Visit https://www.reddit.com/prefs/apps
  2. Click "create app" at the bottom
  3. Fill in the name, select "script", and enter "http://localhost:8080" as the redirect URI
  4. Note your client ID and client secret

⭐ Star This Repository

If you find this tool useful, please consider giving it a star on GitHub! It helps others discover the project and motivates further development.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📧 Contact

Questions? Issues? Please open an issue on the GitHub repository.

Made with ❤️ by Manoj Desai
