
reddit-to-sqlite

Save data from Reddit to SQLite. Dogsheep-based.

Inserts records of posts and comments into a SQLite database. Can be run repeatedly and safely; each run refreshes already-saved results (see Reload, below). Creates posts and comments tables, plus an items view that presents both in a unified form.
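Once loaded, the database can be queried with any SQLite client. A minimal sketch of how a unified view over two tables works — the column names and exact view definition here are illustrative assumptions, not the tool's actual schema:

```python
import sqlite3

# Tiny in-memory stand-in for reddit.db; the real column set may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id TEXT PRIMARY KEY, author TEXT, title TEXT, score INTEGER);
CREATE TABLE comments (id TEXT PRIMARY KEY, author TEXT, body TEXT, score INTEGER);
-- A view unifying both tables, similar in spirit to the items view.
CREATE VIEW items AS
    SELECT id, author, title AS text, score, 'post' AS kind FROM posts
    UNION ALL
    SELECT id, author, body AS text, score, 'comment' AS kind FROM comments;
""")
conn.execute("INSERT INTO posts VALUES ('p1', 'alice', 'Hello', 10)")
conn.execute("INSERT INTO comments VALUES ('c1', 'bob', 'Hi back', 3)")

# One query returns posts and comments together.
rows = conn.execute("SELECT kind, author, score FROM items ORDER BY score DESC").fetchall()
print(rows)
```

The same pattern lets downstream tools (Datasette, ad hoc SQL) treat posts and comments as a single stream of items.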

Usage

reddit-to-sqlite r/python
reddit-to-sqlite u/catherinedevlin 
reddit-to-sqlite --help 

By default, writes to a local reddit.db database (change with --db).

Authorizing

reddit-to-sqlite looks for a file of authorization info (location set by --auth, defaulting to ~/.config/reddit-to-sqlite.json); if the file is not found, it will prompt for the information and then save it there. You will need a Reddit username and password, and you will need to register your app with Reddit to get a client_id and client_secret. (More instructions)
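Reading such a JSON auth file can be sketched as follows — note that the key names shown (client_id, client_secret) are assumptions for illustration, not confirmed field names from the tool:

```python
import json
from pathlib import Path

# Default location, matching the --auth default described above.
AUTH_PATH = Path("~/.config/reddit-to-sqlite.json").expanduser()

def load_auth(path=AUTH_PATH):
    """Return stored credentials as a dict, or None if the file doesn't exist yet."""
    if not path.exists():
        return None
    with open(path) as f:
        return json.load(f)
```

If `load_auth` returns None, the tool would fall back to prompting the user and writing the answers to the same path.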

Limits

Whether run against a user or a subreddit, reddit-to-sqlite cannot guarantee retrieving every post or comment, because:

  • Reddit's API only supplies the last 1000 items for each API call, and does not support pagination;
  • Deeply nested comments in long comment chains under a single post may not be retrieved (see PRAW's replace_more).

Reload

reddit-to-sqlite can be run repeatedly for a given user or subreddit, replacing previously saved results each time. To avoid excessive API use, however, it works backward through time and stops once it reaches the timestamp of the last saved post, plus an overlap period (7 days by default). That way, recent changes (scores, new comments) are recorded, but older ones are not unless --post_reload is increased. If posts keep attracting comments of interest long after they are posted, increase --post_reload.
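The stopping rule described above can be sketched like this — the function and field names are illustrative, not the tool's actual internals:

```python
from datetime import datetime, timedelta

def collect_until_overlap(items, last_saved_utc, reload_days=7):
    """Walk items newest-first, stopping once timestamps fall before the
    last saved timestamp minus an overlap window (default 7 days), so
    recent score/comment changes get refreshed without re-fetching
    the entire history."""
    cutoff = last_saved_utc - timedelta(days=reload_days)
    collected = []
    for item in items:  # assumed to arrive newest-to-oldest
        if item["created_utc"] < cutoff:
            break
        collected.append(item)
    return collected
```

Raising the overlap window (the analogue of --post_reload) simply pushes the cutoff further into the past, trading more API calls for fresher old records.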

When loading an individual user's comments, reddit-to-sqlite by default stops just one day after reaching the most recent comment already recorded in the database. If you are interested in comment scores, you may want a longer --comment_reload, since scores can keep changing for more than a day after a comment is posted.

Notes

  • author is saved in case-sensitive form, so case-insensitive searching with LIKE may be helpful.
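SQLite's LIKE operator is case-insensitive for ASCII by default, which is why it helps here; a quick illustration (the posts table and author column follow the description above, but the stored username is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (author TEXT)")
conn.execute("INSERT INTO posts VALUES ('CatherineDevlin')")

# = compares case-sensitively; LIKE ignores ASCII case.
exact = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE author = 'catherinedevlin'").fetchone()[0]
fuzzy = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE author LIKE 'catherinedevlin'").fetchone()[0]
print(exact, fuzzy)
```

So a lowercase search term with LIKE still finds authors saved in their original mixed case.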
