reddit-to-sqlite
Save data from Reddit to SQLite. Dogsheep-based.
Inserts records of posts and comments into a SQLite database. Can be run repeatedly
safely; will refresh already-saved results (see Reload, below).
Creates posts and comments tables, plus an items view that presents both in a single,
unified listing.
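To illustrate the idea of a unified view, here is a minimal sketch using Python's built-in sqlite3 module. The column names (id, author, title, body, score) and the view definition are assumptions made for this example, not the schema reddit-to-sqlite actually creates; inspect your own reddit.db for the real columns.

```python
import sqlite3

# Hypothetical, simplified schema: the real tables very likely have
# more (and differently named) columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id TEXT PRIMARY KEY, author TEXT, title TEXT, score INTEGER);
    CREATE TABLE comments (id TEXT PRIMARY KEY, author TEXT, body TEXT, score INTEGER);
    -- One row per post or comment, tagged with the table it came from.
    CREATE VIEW items AS
        SELECT id, author, title AS text, score, 'post' AS kind FROM posts
        UNION ALL
        SELECT id, author, body AS text, score, 'comment' AS kind FROM comments;
    """
)
conn.execute("INSERT INTO posts VALUES ('p1', 'alice', 'Hello', 10)")
conn.execute("INSERT INTO comments VALUES ('c1', 'bob', 'Nice post', 3)")

# Query posts and comments together through the view.
rows = conn.execute(
    "SELECT kind, author, text FROM items ORDER BY score DESC"
).fetchall()
print(rows)
```

Against a real database, the same kind of query can be run by connecting to reddit.db instead of ":memory:" and skipping the CREATE and INSERT statements.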
Usage
reddit-to-sqlite r/python
reddit-to-sqlite u/catherinedevlin
reddit-to-sqlite --help
By default, writes to a local reddit.db database (change with --db).
Authorizing
reddit-to-sqlite will look for a file of authorization info (location determined
by --auth, defaults to ~/.config/reddit-to-sqlite.json) and, if not found, will
query the user and then save the info there. You will need a Reddit username and
password, and you will need to register your app with Reddit to get a client_id
and client_secret. (More instructions)
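The look-up-or-ask behaviour described above can be sketched roughly as follows. This is an illustration only: the function name load_or_create_auth is hypothetical, and the JSON key names are assumed from the credentials listed in the text, so the file written by reddit-to-sqlite itself may be laid out differently.

```python
import json
from pathlib import Path

# Assumed key names, taken from the credentials the text says are needed.
REQUIRED_KEYS = ("username", "password", "client_id", "client_secret")


def load_or_create_auth(path, ask=input):
    """Return saved credentials, or ask for them and save them to `path`."""
    path = Path(path).expanduser()
    if path.exists():
        # Credentials were saved on an earlier run: reuse them.
        return json.loads(path.read_text())
    # Nothing saved yet: query the user once per field, then persist.
    auth = {key: ask(f"{key}: ") for key in REQUIRED_KEYS}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(auth, indent=2))
    return auth
```

Calling load_or_create_auth("~/.config/reddit-to-sqlite.json") would prompt on the first run and read the saved file on later runs. The ask parameter exists only so the prompt can be replaced in tests.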
Limits
Whether used for users or for subreddits, reddit-to-sqlite can't guarantee getting all posts or comments, because:
- Reddit's API only supplies the last 1000 items for each API call, and does not support pagination;
- Comments under a single post won't be retrieved if they are deeply nested in long comment chains (see replace_more).
Reload
reddit-to-sqlite can be run repeatedly for a given user or subreddit, replacing previously saved
results each time. However, to avoid excessive API use, it works backward through time and
stops after it reaches the timestamp of the last saved post, plus an overlap period (default
7 days). That way, recent changes (scores, new comments) will be recorded, but older ones
won't. If posts keep getting comments of interest long after they are posted, you can
increase --post_reload to widen the overlap.
When loading an individual user's comments, by default reddit-to-sqlite stops just 1 day after
reaching the most recent comment that is already recorded in the database. However, if you're
interested in comment scores, you may want to set a longer --comment_reload, since scores
may keep changing for longer than a single day after the comment is posted.
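The stopping rule can be pictured with a small sketch. This is not the tool's actual code: the function names, the (timestamp, id) tuples, and the sample dates are all made up for illustration. The only details taken from the text are the backward walk through time and the default overlaps of 7 days for posts and 1 day for a user's comments.

```python
from datetime import datetime, timedelta, timezone


def reload_cutoff(latest_saved, reload_days):
    """Oldest timestamp still worth re-fetching: newest saved item minus the overlap."""
    return latest_saved - timedelta(days=reload_days)


def items_to_fetch(newest_first, latest_saved, reload_days):
    """Walk items from newest to oldest and stop once older than the cutoff."""
    cutoff = reload_cutoff(latest_saved, reload_days)
    for created, item_id in newest_first:
        if created < cutoff:
            break  # everything further back is assumed to be saved already
        yield item_id


# Hypothetical data: the newest item already in the database is from 20 May.
latest = datetime(2021, 5, 20, tzinfo=timezone.utc)
listing = [
    (datetime(2021, 5, 22, tzinfo=timezone.utc), "new"),
    (datetime(2021, 5, 16, tzinfo=timezone.utc), "recent"),
    (datetime(2021, 5, 1, tzinfo=timezone.utc), "old"),
]

posts = list(items_to_fetch(listing, latest, reload_days=7))     # post-style overlap
comments = list(items_to_fetch(listing, latest, reload_days=1))  # comment-style overlap
print(posts, comments)
```

With the 7-day overlap the item from 16 May is fetched again so its score can be refreshed; with the 1-day overlap it is skipped. A larger reload value trades more API calls for fresher data on older items.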
Notes
author is saved in case-sensitive form, so case-insensitive searching with LIKE
may be helpful.
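As a small illustration of the difference, the sketch below compares = with LIKE in SQLite. The one-column-plus-id comments table here is a stand-in built in memory for the example; only the author column name comes from the note above.

```python
import sqlite3

# Stand-in table for the example; a real reddit.db has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id TEXT PRIMARY KEY, author TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [("c1", "CatherineDevlin"), ("c2", "catherinedevlin"), ("c3", "someone_else")],
)

# '=' compares case-sensitively, so only the exact spelling matches.
exact = conn.execute(
    "SELECT id FROM comments WHERE author = ? ORDER BY id", ("catherinedevlin",)
).fetchall()

# LIKE ignores ASCII case by default in SQLite, so both spellings match.
like = conn.execute(
    "SELECT id FROM comments WHERE author LIKE ? ORDER BY id", ("catherinedevlin",)
).fetchall()

print(exact, like)
```

Keep in mind that _ and % are wildcards in a LIKE pattern, so a username containing an underscore may match more rows than intended unless the pattern is escaped.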