wumb-to-sqlite

Scrape WUMB playlists to SQLite.

WUMB is a public radio station based at UMass Boston. It's awesome and you should support it if you like great music with no ads. This is a personal project, however, and not associated with WUMB or UMass Boston in any way.

The station puts its daily playlist online here: http://wumb.org/cgi-bin/playlist1.pl. I often want to look up a song I heard in the car, or remember something that played last week. I'm also just curious about the music mix. So this is a tool to scratch that itch.

Installation

Install this tool using pip:

pip install wumb-to-sqlite

Or install globally with pipx:

pipx install wumb-to-sqlite

Usage

Scrape today's playlist:

wumb-to-sqlite playlist wumb.db

That will use (or create) a SQLite database called wumb.db and a table called playlist. Change the table name by passing a --table option.

Scrape a specific date, with a custom table name:

wumb-to-sqlite playlist wumb.db --table songs --date 2020-09-01

That will get songs from Sept. 1, 2020, and use a table called songs.
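Once a database exists, it can be handy to see what the scraper wrote. The sketch below uses only Python's built-in sqlite3 module. The column names are not documented here, so it looks them up rather than assuming them. describe_table is a hypothetical helper name, not part of wumb-to-sqlite.

```python
import sqlite3


def describe_table(db_path, table="playlist"):
    """Return (column_names, row_count) for one table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        # Quote the table name as an identifier, doubling any embedded quotes.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info yields one row per column; index 1 is the name.
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]
        if not columns:
            raise ValueError(f"no table named {table!r} in {db_path}")
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        return columns, count
    finally:
        conn.close()
```

For example, describe_table("wumb.db", "songs") would report the columns and row count of the songs table created by the command above.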

Scrape all daily playlists from Oct. 1 to Oct. 11, 2020:

wumb-to-sqlite playlist wumb.db --since 2020-10-01 --until 2020-10-11 --delay 1

That will pull down playlists for each day between Oct. 1 and 11, inclusive. It adds a one-second delay (which is the default) between days, as a courtesy to WUMB's servers.
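To make "inclusive" and the delay concrete, here is a minimal sketch of how a date range like this could be walked. It is not the tool's actual code: each_day, scrape_range and the scrape_day callback are hypothetical names used only for illustration.

```python
import time
from datetime import date, timedelta


def each_day(since, until):
    """Yield every date from since to until, including both endpoints."""
    day = since
    while day <= until:
        yield day
        day += timedelta(days=1)


def scrape_range(since, until, scrape_day, delay=1.0, sleep=time.sleep):
    """Call scrape_day(day) for each date, pausing between requests."""
    for i, day in enumerate(each_day(since, until)):
        if i:  # no need to wait before the first request
            sleep(delay)
        scrape_day(day)
```

With since set to date(2020, 10, 1) and until set to date(2020, 10, 11), each_day yields 11 dates, so scrape_range pauses 10 times.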

Downloaded pages are cached locally, so subsequent runs don't keep re-fetching the same data. By default, the cache is located at $HOME/.wumb-to-sqlite/.
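The caching idea can be sketched roughly as follows. This is an illustration, not the tool's implementation: only the default directory comes from the text above, while the one-file-per-date naming, the cached_fetch name and the fetch callback are assumptions.

```python
from pathlib import Path

# Default cache location described above.
CACHE_DIR = Path.home() / ".wumb-to-sqlite"


def cached_fetch(day, fetch, cache_dir=CACHE_DIR):
    """Return the page for day (an ISO date string), fetching only on a miss."""
    # Assumption: one file per date; the real cache layout may differ.
    path = Path(cache_dir) / f"{day}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = fetch(day)  # hypothetical callable that downloads the page
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return html
```

The second call for the same date reads the saved file and never calls fetch, which is what keeps repeated runs from hitting the station's server again.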

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd wumb-to-sqlite
python -m venv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies, including the test dependencies:

pip install -e '.[test]'

To run the tests:

pytest

Please note that scraping tests should be run against the included HTML file tests/wumb-2020-10-10.html, not against the live site. Again, this is a small public radio station. Please be nice.
