Board games data scraping and processing from BoardGameGeek and more!

Project description

🎲 Board Game Scraper 🕸

Scraping data about board games from the web. View the data live at Recommend.Games! Install via

pip install board-game-scraper

Sources

Run scrapers

Requires Python 3. Make sure Pipenv is installed and create the virtual environment:

python3 -m pip install --upgrade pipenv
pipenv install --dev
pipenv shell

Run a spider like so:

JOBDIR="jobs/${SPIDER}/$(date --utc +'%Y-%m-%dT%H-%M-%S')"
scrapy crawl "${SPIDER}" \
    --output 'feeds/%(name)s/%(time)s/%(class)s.csv' \
    --set "JOBDIR=${JOBDIR}"

where $SPIDER is one of the IDs above.
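If you would rather launch a spider from Python than from the shell, the same command line can be assembled programmatically. The following is only a sketch of the snippet above: the function name `build_crawl_command` and the spider ID `example_spider` are placeholders invented for illustration and are not part of this package.

```python
import shlex
from datetime import datetime, timezone


def build_crawl_command(spider, now=None):
    """Return the `scrapy crawl` argument list for one spider.

    `now` is injectable so the timestamp can be fixed; it defaults to the
    current UTC time, mirroring `date --utc` in the shell snippet.
    """
    now = now or datetime.now(timezone.utc)
    job_dir = f"jobs/{spider}/{now.strftime('%Y-%m-%dT%H-%M-%S')}"
    return [
        "scrapy", "crawl", spider,
        # Scrapy itself expands %(name)s, %(time)s and %(class)s in the feed path.
        "--output", "feeds/%(name)s/%(time)s/%(class)s.csv",
        "--set", f"JOBDIR={job_dir}",
    ]


if __name__ == "__main__":
    # "example_spider" is a placeholder; substitute a real spider ID.
    print(shlex.join(build_crawl_command("example_spider")))
```

The returned list can be passed to `subprocess.run` from inside the project directory, which avoids shell quoting issues with the `%(...)s` placeholders.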

Run all the spiders with the run_scrapers.sh script. Get a list of the running scrapers' PIDs with the processes.sh script. You can close all the running scrapers via

./processes.sh stop

and resume them later.
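Scrapy resumes a paused crawl when it is started again with the same JOBDIR. How run_scrapers.sh chooses that directory is not described here, so the following is only a sketch under the assumption that job directories follow the jobs/<spider>/<timestamp> layout from the command above. The helper name `latest_job_dir` is hypothetical.

```python
from pathlib import Path


def latest_job_dir(spider, root="jobs"):
    """Return the most recent job directory for `spider`, or None.

    Assumes directory names are timestamps like 2024-01-02T03-04-05,
    which sort chronologically when compared as plain strings.
    """
    spider_dir = Path(root) / spider
    if not spider_dir.is_dir():
        return None
    candidates = sorted(
        (p for p in spider_dir.iterdir() if p.is_dir()),
        key=lambda p: p.name,
    )
    return candidates[-1] if candidates else None
```

Passing the returned path as `--set "JOBDIR=..."` to `scrapy crawl` would then continue the earlier crawl instead of starting a fresh one.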

Tests

You can run scrapy check to perform contract tests for all spiders, or scrapy check $SPIDER to test one particular spider. If the tests fail, the website has most likely changed and the spider needs updating.
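To find out which spiders need attention, the per-spider check can be looped over from Python. This is a sketch, not part of the package: the name `failing_spiders` is hypothetical, and it assumes that scrapy check exits with a non-zero status when a contract fails. The `run` parameter exists only so the subprocess call can be swapped out.

```python
import subprocess


def failing_spiders(spiders, run=subprocess.run):
    """Run `scrapy check <spider>` for each ID and return those that fail.

    Assumption: a non-zero exit status signals a failed contract check.
    """
    failed = []
    for spider in spiders:
        result = run(["scrapy", "check", spider])
        if result.returncode != 0:
            failed.append(spider)
    return failed
```

Call it from the project directory inside the Pipenv shell so that the scrapy executable and the project settings are found.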

Board game datasets

If you are interested in using any of the datasets produced by this scraper, take a look at the BoardGameGeek guild. A subset of the data can also be found on Kaggle.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

board-game-scraper-2.15.1.tar.gz (54.1 kB)

Uploaded Source

Built Distribution

board_game_scraper-2.15.1-py2.py3-none-any.whl (66.2 kB)

Uploaded Python 2, Python 3

File details

Details for the file board-game-scraper-2.15.1.tar.gz.

File metadata

  • Download URL: board-game-scraper-2.15.1.tar.gz
  • Upload date:
  • Size: 54.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.11

File hashes

Hashes for board-game-scraper-2.15.1.tar.gz
Algorithm Hash digest
SHA256 7b225b56feb19d0a917f49a615bad37b6543c09b5b3e7a179e34222503564370
MD5 bbfcccba17d3df7a2f7f8547f9c92e64
BLAKE2b-256 49d3bcd7ef72edeaa594a288dec880f76d4a691132a1fe9caee81a983100aac4

See more details on using hashes here.
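A downloaded file can be checked against the published digest before installing it. The sketch below uses only the standard library; the SHA256 value is the one listed above for board-game-scraper-2.15.1.tar.gz, while the function names and the local file path are assumptions made for illustration.

```python
import hashlib

# SHA256 digest listed above for board-game-scraper-2.15.1.tar.gz
EXPECTED_SHA256 = "7b225b56feb19d0a917f49a615bad37b6543c09b5b3e7a179e34222503564370"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_hash("board-game-scraper-2.15.1.tar.gz")` should return True for an intact download saved under that name in the current directory.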

File details

Details for the file board_game_scraper-2.15.1-py2.py3-none-any.whl.

File metadata

  • Download URL: board_game_scraper-2.15.1-py2.py3-none-any.whl
  • Upload date:
  • Size: 66.2 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.11

File hashes

Hashes for board_game_scraper-2.15.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 2d110a57b8a31d722967f6262e45ab59419a912d8f8dab5aa120de5971a5d2bb
MD5 f97abcd26168ebf1de83ee2e78419a4f
BLAKE2b-256 cbff20ee2d94e45bcb9ce6799dd72869dc8a24453c8ee7693ac5ea935cee799b
