Reddit Flair Analyzer
A professional tool for analyzing which post flairs are most likely to go viral in a given subreddit. This package helps content creators, marketers, and researchers compare flair performance and optimize their content strategy.
🚀 Features
- Comprehensive Scraping: Collect thousands of posts from any public subreddit
- Advanced Analytics: Calculate viral rates, engagement metrics, and performance statistics
- Beautiful Visualizations: Generate publication-ready charts and interactive dashboards
- Flexible Export: Save results in multiple formats (CSV, JSON, Excel)
- Detailed Logging: Track progress with configurable logging levels
- Command-line Interface: Run analyses without writing code
- Optimized Performance: Multi-threaded scraping for faster data collection
🔧 Installation
# Install from PyPI
pip install reddit-flair-analyzer
# Or install from the repository
git clone https://github.com/themanojdesai/reddit-flair-analyzer.git
cd reddit-flair-analyzer
pip install -e .
The package is available on PyPI: https://pypi.org/project/reddit-flair-analyzer/
📖 Quick Start
Using the CLI
The CLI provides a user-friendly way to analyze Reddit flairs without writing code. You'll need to provide your Reddit API credentials (either as arguments or environment variables).
# Basic analysis with credentials
reddit-analyze --subreddit AskReddit --client-id YOUR_CLIENT_ID --client-secret YOUR_CLIENT_SECRET
# Using environment variables for credentials
export REDDIT_CLIENT_ID="your_client_id"
export REDDIT_CLIENT_SECRET="your_client_secret"
reddit-analyze --subreddit datascience
# Specify number of posts and time frame
reddit-analyze --subreddit wallstreetbets --posts 500 --timeframe month
# Set viral threshold (percentile)
reddit-analyze --subreddit science --threshold 95 # Top 5% are considered viral
# Export results to different formats
reddit-analyze --subreddit movies --export excel
reddit-analyze --subreddit gaming --export json
# Save results to custom directory
reddit-analyze --subreddit programming --output ./my_analysis
# Disable auto-opening dashboard in browser
reddit-analyze --subreddit python --auto-open false
# Get verbose output for debugging
reddit-analyze --subreddit learnpython --verbose --log-file
# Show version information
reddit-analyze --version
# Get help with all available options
reddit-analyze --help
CLI Options
| Option | Description | Default |
|---|---|---|
| `--subreddit`, `-s` | Subreddit to analyze (required) | |
| `--client-id`, `-c` | Reddit API client ID | (Environment variable) |
| `--client-secret`, `-cs` | Reddit API client secret | (Environment variable) |
| `--user-agent`, `-u` | Reddit API user agent | "Reddit Flair Analyzer CLI v1.0" |
| `--posts`, `-p` | Maximum posts to retrieve | 500 |
| `--timeframe`, `-t` | Time filter (all, day, week, month, year) | all |
| `--threshold`, `-th` | Viral threshold percentile (50-99) | 90 |
| `--output`, `-o` | Output directory for results | ./results |
| `--export`, `-e` | Export format (csv, excel, json) | csv |
| `--interactive`, `-i` | Generate interactive visualizations | True |
| `--auto-open`, `-a` | Open dashboard in browser | True |
| `--verbose`, `-v` | Enable verbose output | False |
| `--log-file` | Enable logging to file | False |
| `--version` | Show version information | |
| `--help` | Show help message | |
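For batch runs, the options above can be assembled programmatically. The sketch below builds `reddit-analyze` argument lists for several subreddits using only flags listed in the table. The helper names `build_command` and `run_batch` and the one-directory-per-subreddit layout are illustrative assumptions, not part of the package.

```python
import subprocess
from pathlib import Path


def build_command(subreddit, posts=500, timeframe="all", threshold=90,
                  export="csv", output_root="./results"):
    """Return the argv list for one reddit-analyze run (hypothetical helper)."""
    if not 50 <= threshold <= 99:
        raise ValueError("threshold must be a percentile between 50 and 99")
    return [
        "reddit-analyze",
        "--subreddit", subreddit,
        "--posts", str(posts),
        "--timeframe", timeframe,
        "--threshold", str(threshold),
        "--export", export,
        # One output directory per subreddit (an assumed layout).
        "--output", str(Path(output_root) / subreddit),
        # Avoid opening a browser tab for every run.
        "--auto-open", "false",
    ]


def run_batch(subreddits, **options):
    """Run the CLI once per subreddit; credentials come from the environment."""
    for name in subreddits:
        subprocess.run(build_command(name, **options), check=True)
```

Calling `run_batch(["datascience", "python"], posts=200, timeframe="month")` would then invoke the CLI twice, provided `REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET` are exported.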
Using the Python API
The Python API gives you programmatic access to all analysis features with more flexibility:
from redditflairanalyzer import RedditAnalyzer
import os
# Initialize the analyzer with your Reddit API credentials
analyzer = RedditAnalyzer(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    user_agent="MyApp/1.0 (by /u/YourUsername)"
)
# Analyze a subreddit
results = analyzer.analyze_subreddit(
    subreddit="datascience",  # Subreddit name (without r/ prefix)
    post_limit=1000,          # Number of posts to analyze
    time_filter="year",       # Time filter (all, day, week, month, year)
    viral_threshold=90        # Percentile to consider viral (top 10%)
)
# Generate visualizations
visualization_files = analyzer.visualize(
    results,
    output_dir="./results/datascience",
    plot_types=["bar", "heatmap", "scatter", "dashboard"],  # Specific plots to generate
    interactive=True  # Create interactive HTML visualizations
)
# Export results to different formats
analyzer.export(results, format="excel", filename="datascience_analysis.xlsx")
analyzer.export(results, format="csv", filename="datascience_analysis.csv")
analyzer.export(results, format="json", filename="datascience_analysis.json")
# Access analysis results directly
flair_stats = results["flair_stats"]
posts_df = results["posts_df"]
viral_threshold = results["viral_threshold"]
metrics = results["metrics"]
# Print top 5 flairs by viral rate
top_flairs = flair_stats.head(5)
print("\nTop flairs by viral rate:")
for i, (_, row) in enumerate(top_flairs.iterrows(), 1):
    print(f"{i}. {row['flair']}: {row['viral_rate']:.1%} viral rate ({row['total_posts']} posts)")
# Find optimal posting time
if "created_utc" in posts_df.columns:
    posts_df["hour"] = posts_df["created_utc"].dt.hour
    hour_stats = posts_df.groupby("hour")["score"].mean()
    best_hour = hour_stats.idxmax()
    print(f"\nBest hour to post: {best_hour}:00 UTC (Avg score: {hour_stats.max():.1f})")
Main API Components
RedditAnalyzer Class
The main entry point for the package:
analyzer = RedditAnalyzer(
    client_id,              # Reddit API client ID
    client_secret,          # Reddit API client secret
    user_agent,             # Reddit API user agent
    log_level=logging.INFO  # Optional logging level
)
analyze_subreddit() Method
Analyzes a subreddit to find which flairs have the highest viral potential:
results = analyzer.analyze_subreddit(
    subreddit,           # Name of the subreddit
    post_limit=1000,     # Maximum posts to retrieve
    time_filter="all",   # Time filter (all, day, week, month, year)
    viral_threshold=90   # Percentile to consider viral
)
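To make the `viral_threshold` parameter concrete: one plausible reading is that a post counts as viral when its score is at or above the given percentile of all scores in the sample, and a flair's viral rate is the share of its posts that clear that cutoff. The sketch below implements that reading with the standard library only. It is an assumption about the metric, not the package's actual implementation, and the names `percentile` and `viral_rates` are hypothetical.

```python
from collections import defaultdict


def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def viral_rates(posts, viral_threshold=90):
    """Map each flair to (viral_rate, total_posts).

    `posts` is an iterable of (flair, score) pairs; a post is treated as
    viral when its score is >= the chosen percentile of all scores.
    """
    posts = list(posts)
    cutoff = percentile([score for _, score in posts], viral_threshold)
    totals, virals = defaultdict(int), defaultdict(int)
    for flair, score in posts:
        totals[flair] += 1
        if score >= cutoff:
            virals[flair] += 1
    return {flair: (virals[flair] / totals[flair], totals[flair]) for flair in totals}


# Made-up sample data, for illustration only.
sample = [("Discussion", 12), ("Discussion", 950), ("Meme", 40),
          ("Meme", 35), ("News", 5), ("News", 1200)]
for flair, (rate, total) in sorted(viral_rates(sample, 70).items(),
                                   key=lambda item: item[1][0], reverse=True):
    print(f"{flair}: {rate:.1%} viral rate ({total} posts)")
```

Because the cutoff is computed over the whole sample, a flair with few posts can show an extreme viral rate, so the post count is worth reading alongside the rate.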
visualize() Method
Generates visualizations from analysis results:
files = analyzer.visualize(
    results,                 # Results from analyze_subreddit()
    output_dir="./results",  # Directory to save visualizations
    plot_types=None,         # Types of plots (None = all)
    interactive=True         # Whether to create interactive visualizations
)
Available plot types: 'bar', 'heatmap', 'scatter', 'bubble', 'time', 'distribution', 'dashboard'
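How the package reacts to an unrecognized plot type is not documented here, so it may be worth checking the list before calling `visualize()`. The sketch below is a client-side check against the plot types listed above; `AVAILABLE_PLOT_TYPES` and `check_plot_types` are hypothetical names, not part of the package.

```python
AVAILABLE_PLOT_TYPES = ("bar", "heatmap", "scatter", "bubble", "time",
                        "distribution", "dashboard")


def check_plot_types(plot_types=None):
    """Return a validated list of plot types; None selects all of them."""
    if plot_types is None:
        return list(AVAILABLE_PLOT_TYPES)
    unknown = sorted(set(plot_types) - set(AVAILABLE_PLOT_TYPES))
    if unknown:
        raise ValueError(f"Unknown plot types: {', '.join(unknown)}")
    return list(plot_types)
```

The validated list can then be passed as the `plot_types` argument.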
export() Method
Exports analysis results to file:
path = analyzer.export(
    results,        # Results from analyze_subreddit()
    format="csv",   # Export format (csv, excel, json)
    filename=None   # Custom filename (None = auto-generated)
)
📝 Documentation
For detailed documentation, see:
🔒 Authentication
You need to create a Reddit application to get API credentials:
- Visit https://www.reddit.com/prefs/apps
- Click "create app" at the bottom
- Fill in the name, select "script", and enter "http://localhost:8080" as the redirect URI
- Note your client ID and client secret
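Once the app exists, the credentials can be kept out of source code. The sketch below reads them from the `REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET` environment variables used in the CLI examples and returns keyword arguments matching the `RedditAnalyzer` constructor. The helper name `load_reddit_credentials` is hypothetical, and whether the Python API reads these variables on its own is not stated here, so the sketch passes them explicitly.

```python
import os


def load_reddit_credentials(env=None, user_agent="MyApp/1.0 (by /u/YourUsername)"):
    """Build RedditAnalyzer keyword arguments from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET")
               if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        "user_agent": user_agent,
    }
```

With the package installed, the result can be unpacked into the constructor: `analyzer = RedditAnalyzer(**load_reddit_credentials())`.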
⭐ Star This Repository
If you find this tool useful, please consider giving it a star on GitHub! It helps others discover the project and motivates further development.
📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
📧 Contact
Questions? Issues? Please open an issue on the GitHub repository.
Connect with the author:
Made with ❤️ by Manoj Desai
File details
Details for the file reddit_flair_analyzer-1.0.1.tar.gz.
File metadata
- Download URL: reddit_flair_analyzer-1.0.1.tar.gz
- Upload date:
- Size: 41.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fc7629a2fb84de7694fc73b5441dabbb8415dc465acf0c80ff69e60e6f2d9ebe |
| MD5 | 424f86a90bf29f2057aead9467c5fcfd |
| BLAKE2b-256 | c41949969b30a4e6f04ca53a83199ce5206d9b509d397c98bea7e3932bb5eda1 |
File details
Details for the file reddit_flair_analyzer-1.0.1-py3-none-any.whl.
File metadata
- Download URL: reddit_flair_analyzer-1.0.1-py3-none-any.whl
- Upload date:
- Size: 35.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cd1472d1c01a5f87ea800b4aee2a6ebadc90bf58cdec3b19f84078a9a62b7d63 |
| MD5 | bd2c61bc22acbf022a50f650165bbc48 |
| BLAKE2b-256 | 1b397c0e3dde9b1cb0912397be5006920867c74f1e65df6e0772ca8b8eaabd54 |