Tool for extracting Reddit comments
Reddit Comments Analyzer
General-purpose package for Reddit comment analysis and data manipulation
Install Instructions
pip install .
Or, for the released version:
pip install reddit-extract
How to Use
Required:
- Reddit Client ID
- Reddit Client Secret
- Reddit User Agent
- Subreddit
- List of Thread IDs for bulk extract of multiple Reddit threads
- Dictionary of headers (must match the capture groups of your regex search pattern, otherwise the headers will not match the data retrieved; useful, for example, when users reply to a Reddit thread with a defined copy/paste form, aka "Megathreads")
- Regex Pattern (for CSV)
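Because the headers have to line up with the regex, it can help to compare the two before running an extract. The following is a minimal sketch using only the standard library; `check_headers_match_pattern` is a hypothetical helper, not part of reddit_extract, and it assumes that each capture group in the pattern fills exactly one header column.

```python
import re


def check_headers_match_pattern(pattern, headers):
    """Return True when the regex has one capture group per header.

    Hypothetical helper: assumes one capture group maps to one CSV column.
    """
    group_count = re.compile(pattern).groups  # number of capture groups
    return group_count == len(headers)


# Shortened pattern and headers, for illustration only.
pattern = r'Form: (.*)\n*Entity: (.*)'
headers = {'Form': 'Form', 'Entity': 'Entity'}

print(check_headers_match_pattern(pattern, headers))
```

If the check returns False, either the pattern or the dictionary probably needs adjusting before the headers and extracted values will agree.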
Extracting all comments for a list of threads to CSV with defined headers and a search pattern:
import reddit_extract
reddit_extract.extract_comments_csv_bulk(<Reddit ClientID>, <Reddit ClientSecret>, <Reddit User Agent>, <Subreddit>, <Thread IDs>, <Dictionary Headers>, <Regex Pattern>)
Extracting all comments for a list of threads to txt:
import reddit_extract
reddit_extract.extract_comments_txt_bulk(<Reddit ClientID>, <Reddit ClientSecret>, <Reddit User Agent>, <Subreddit>, <Thread IDs>)
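The client ID and client secret are credentials, so you may prefer not to hard-code them in a script. One option is to read them from environment variables, sketched below. The helper `load_reddit_credentials` and the variable names `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, and `REDDIT_USER_AGENT` are assumptions made for this example; reddit_extract itself does not define them.

```python
import os


def load_reddit_credentials(env=os.environ):
    """Read the three Reddit API values from environment variables.

    The variable names are hypothetical; use whatever fits your setup.
    """
    names = ('REDDIT_CLIENT_ID', 'REDDIT_CLIENT_SECRET', 'REDDIT_USER_AGENT')
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return tuple(env[name] for name in names)


# Usage (needs real credentials and network access, so it is left commented out):
# import reddit_extract
# client_id, client_secret, user_agent = load_reddit_credentials()
# reddit_extract.extract_comments_txt_bulk(
#     client_id, client_secret, user_agent, 'nfa', ['aw79c5'])
```

The helper raises early when a value is missing, so a misconfigured environment is reported before any request is made.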
Example:
import reddit_extract
threads = ['aw79c5', 'b7x7n1', 'am5uk7', 'bji681', 'abv2gl', '9klf8e']
search_pattern = r'Form: (.*)\n*Entity: (.*)\n*Pending: (.*)\n*Approved: (.*)\n*Standardized wait: (.*)\n*STATE: (.*)'
headers = {'Form': 'Form', 'Entity': 'Entity', 'Pending': 'Pending', 'Approved': 'Approved', 'Standardized Wait': 'Standardized Wait', 'STATE': 'STATE'}
reddit_extract.extract_comments_csv_bulk('<client_id>', '<client_secret>', '<user_agent>', 'nfa', threads, headers, search_pattern)
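To see what the search pattern above captures, it can be tried against a single comment with the standard library before running a bulk extract. This is only a sketch of the idea: the sample comment is made up, `comment_to_row` is a hypothetical helper, and the way reddit_extract applies the pattern internally may differ.

```python
import csv
import io
import re

search_pattern = r'Form: (.*)\n*Entity: (.*)\n*Pending: (.*)\n*Approved: (.*)\n*Standardized wait: (.*)\n*STATE: (.*)'
columns = ['Form', 'Entity', 'Pending', 'Approved', 'Standardized Wait', 'STATE']

# Made-up comment body that follows the megathread form.
sample_comment = (
    'Form: Form 4\n'
    'Entity: Trust\n'
    'Pending: 01/02/2019\n'
    'Approved: 08/03/2019\n'
    'Standardized wait: 213\n'
    'STATE: TX'
)


def comment_to_row(body, pattern):
    """Return the captured groups as a list, or None if the form is absent."""
    match = re.search(pattern, body)
    return list(match.groups()) if match else None


row = comment_to_row(sample_comment, search_pattern)

# Write the header and the row to an in-memory CSV for inspection.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(columns)
if row is not None:
    writer.writerow(row)
print(buffer.getvalue())
```

Comments that do not follow the form return None here, which is a quick way to check how strict a pattern is against real replies.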
Tests
python setup.py test