Tool for extracting reddit comments

Reddit Comments Analyzer

Python 3.4, 3.5, 3.6. License: MIT.

A general-purpose package for extracting, analyzing, and manipulating Reddit comments.

Install Instructions

From a local checkout of the source:

pip install .

Or, for the released version:

pip install reddit-extract

How to Use

Required:

  • Reddit Client ID
  • Reddit Client Secret
  • Reddit User Agent
  • Subreddit
  • List of Thread IDs (for bulk extraction from multiple Reddit threads)
  • Dictionary of CSV headers (should match the capture groups of your regex search pattern; otherwise the headers will not line up with the data retrieved. A typical use is a fixed copy/paste form that users fill in when replying to a thread, as in "Megathreads")
  • Regex Pattern (for CSV)
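Because the headers have to line up with the regex pattern, it can help to compare the two before starting an extraction. The sketch below uses only the standard library and is not part of reddit_extract; the helper name check_headers_match_pattern and the two-field form are hypothetical, and it assumes each header corresponds to exactly one capture group.

```python
import re


def check_headers_match_pattern(pattern, headers):
    """Return the number of capture groups in pattern.

    Raises ValueError when that number differs from the number of headers,
    which is the situation where CSV columns would not line up with the data.
    """
    group_count = re.compile(pattern).groups
    if group_count != len(headers):
        raise ValueError(
            "pattern has %d capture groups but %d headers were given"
            % (group_count, len(headers))
        )
    return group_count


# Hypothetical two-field form, for illustration only.
example_pattern = r'Form: (.*)\n*Entity: (.*)'
example_headers = {'Form': 'Form', 'Entity': 'Entity'}
check_headers_match_pattern(example_pattern, example_headers)
```

A check like this only counts groups; it cannot tell whether the header names are in the same order as the fields in the form.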

Extracting all comments for a list of threads to CSV with defined headers and a search pattern:

import reddit_extract
reddit_extract.extract_comments_csv_bulk(<Reddit ClientID>, <Reddit ClientSecret>, <Reddit User Agent>, <Subreddit>, <Thread IDs>, <Dictionary Headers>, <Regex Pattern>)

Extracting all comments for a list of threads to txt:

import reddit_extract
reddit_extract.extract_comments_txt_bulk(<Reddit ClientID>, <Reddit ClientSecret>, <Reddit User Agent>, <Subreddit>, <Thread IDs>)
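The credential arguments are plain strings, so one way to keep them out of source control is to read them from environment variables. This is a minimal sketch and not part of reddit_extract: the variable names REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, and REDDIT_USER_AGENT and the helper load_reddit_credentials are assumptions made for illustration.

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
CREDENTIAL_VARS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def load_reddit_credentials(env=None):
    """Return (client_id, client_secret, user_agent) read from a mapping.

    Defaults to os.environ; raises KeyError naming every missing variable.
    """
    env = os.environ if env is None else env
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in CREDENTIAL_VARS)
```

The returned tuple can be unpacked into the first three arguments of either bulk function, for example client_id, client_secret, user_agent = load_reddit_credentials().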

Example:

import reddit_extract
threads = ['aw79c5', 'b7x7n1', 'am5uk7', 'bji681', 'abv2gl', '9klf8e']
search_pattern = r'Form: (.*)\n*Entity: (.*)\n*Pending: (.*)\n*Approved: (.*)\n*Standardized wait: (.*)\n*STATE: (.*)'
headers = {'Form': 'Form', 'Entity': 'Entity', 'Pending': 'Pending', 'Approved': 'Approved', 'Standardized Wait': 'Standardized Wait', 'STATE': 'STATE'}
reddit_extract.extract_comments_csv_bulk(<client_id>, <client_secret>, <user_agent>, 'nfa', threads, headers, search_pattern)
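To see what the search pattern in this example captures, it can be tried against a single comment body with the standard re and csv modules. This sketch does not call reddit_extract and does not claim to mirror its internals; the helper comment_to_row and the sample comment text are made up for illustration, and it assumes the header order follows the order of the capture groups.

```python
import csv
import io
import re

search_pattern = r'Form: (.*)\n*Entity: (.*)\n*Pending: (.*)\n*Approved: (.*)\n*Standardized wait: (.*)\n*STATE: (.*)'
fieldnames = ['Form', 'Entity', 'Pending', 'Approved', 'Standardized Wait', 'STATE']


def comment_to_row(body, pattern=search_pattern, names=fieldnames):
    """Map each header to its captured value, or return None if the form is absent."""
    match = re.search(pattern, body)
    if match is None:
        return None
    return dict(zip(names, (value.strip() for value in match.groups())))


# Made-up comment body that follows the copy/paste form.
sample_comment = (
    "Form: 4\n"
    "Entity: Trust\n"
    "Pending: 01/02/2019\n"
    "Approved: 09/03/2019\n"
    "Standardized wait: 244 days\n"
    "STATE: TX\n"
)

row = comment_to_row(sample_comment)

# Write the header line and the single row to an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
```

Comments that do not follow the form yield None here, so they can be skipped before writing.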

Tests

python setup.py test

Download files

Source Distribution

reddit_extract-0.2.0.tar.gz (6.2 kB)
