
Hybrid Python/Node.js web scraper for Major League Baseball (MLB) data.



vigorish

vigorish is a hybrid Python/Node.js application that scrapes MLB data from mlb.com, brooksbaseball.net and baseball-reference.com.

My goal is to capture as much data as possible, ranging from PitchFX measurements at the most granular level to play-by-play data (play descriptions, substitutions, manager challenges, etc.) and individual player pitching/batting stats at the highest level.

Requirements

  • Python 3.6+
  • Node.js 10+ (Tested with Node.js 11-13)
  • Xvfb
  • AWS account (optional but recommended, used to store scraped data in S3)
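
The locally checkable requirements above can be verified with a short script before installing. This is a minimal sketch using only the Python standard library, not part of vigorish itself. The helper names (`parse_node_major`, `check_requirements`, `read_node_version`) are hypothetical, and it assumes that `node` and `Xvfb` are the executable names on your PATH. The optional AWS account cannot be checked this way and is left out.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions taken from the requirements list above.
MIN_PYTHON = (3, 6)
MIN_NODE_MAJOR = 10


def parse_node_major(version_text):
    """Return the major version from text such as 'v12.16.1', or None if unparseable."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def check_requirements(python_version, node_version_text, xvfb_path):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.6+ is required")
    node_major = parse_node_major(node_version_text) if node_version_text else None
    if node_major is None:
        problems.append("Node.js was not found on PATH")
    elif node_major < MIN_NODE_MAJOR:
        problems.append("Node.js 10+ is required")
    if not xvfb_path:
        problems.append("Xvfb was not found on PATH")
    return problems


def read_node_version():
    """Run `node --version` if node is on PATH; return its output or None."""
    node = shutil.which("node")  # assumption: the executable is named "node"
    if node is None:
        return None
    # stdout=PIPE with universal_newlines=True keeps this compatible with Python 3.6.
    result = subprocess.run([node, "--version"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout


if __name__ == "__main__":
    issues = check_requirements(sys.version_info, read_node_version(),
                                shutil.which("Xvfb"))
    for issue in issues:
        print("MISSING: " + issue)
    if not issues:
        print("All locally checkable prerequisites were found.")
```

The check logic is kept separate from the environment lookups so it can be exercised with fixed inputs, for example `check_requirements((3, 8, 0), "v12.16.1", "/usr/bin/Xvfb")`.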

Project Documentation

For a step-by-step install guide and instructions for configuring/using vigorish, please visit the link below:

Vigorish: Hybrid Python/Node.Js Web Scraper

Credits

vigorish relies on the projects listed below, either directly or as dev dependencies. It would not have been possible for me to create vigorish without them. Thanks to all of the creators and maintainers for making these projects available (projects are listed alphabetically):


Download files


Source Distribution

vigorish-0.7.0.tar.gz (2.7 MB)

Uploaded Source

Built Distribution

vigorish-0.7.0-py3-none-any.whl (2.8 MB)

Uploaded Python 3
