Hybrid Python/Node.js web scraper for Major League Baseball (MLB) data.
vigorish
vigorish is a hybrid Python/Node.js application that scrapes MLB data from mlb.com, brooksbaseball.net and baseball-reference.com.
My goal is to capture as much data as possible, ranging from PitchFX measurements at the most granular level to play-by-play data (play descriptions, substitutions, manager challenges, etc.) and individual player pitch/bat stats at the highest level.
Requirements
- Python 3.6+
- Node.js 10+ (tested with Node.js 11-13)
- Xvfb
- AWS account (optional but recommended, used to store scraped data in S3)
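As a quick sanity check before installing, the local requirements above can be probed with a short script. This is a minimal sketch and not part of vigorish: the function name `check_requirements` is hypothetical, and it assumes the Node.js and Xvfb executables are named `node` and `Xvfb` on your `PATH`. It only confirms that the executables exist; it does not verify the Node.js version.

```python
import shutil
import sys


def check_requirements(min_python=(3, 6), executables=("node", "Xvfb")):
    """Return a dict mapping each requirement name to True (met) or False."""
    # Compare the running interpreter against the minimum Python version.
    results = {"python": sys.version_info >= min_python}
    for name in executables:
        # shutil.which returns None when the executable is not on PATH.
        results[name] = shutil.which(name) is not None
    return results


if __name__ == "__main__":
    for requirement, ok in check_requirements().items():
        print(f"{requirement}: {'found' if ok else 'MISSING'}")
```

If `node` is reported as found, running `node --version` manually confirms whether it meets the Node.js 10+ requirement.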
Project Documentation
For a step-by-step install guide and instructions for configuring/using vigorish, please visit the link below:
Vigorish: Hybrid Python/Node.Js Web Scraper
Credits
vigorish relies on the projects listed below, either directly or as a dev dependency. It would not have been possible for me to create vigorish without these projects. Thanks to all of the creators/maintainers for making them available (projects are listed alphabetically):
Download files
File details
Details for the file vigorish-0.7.0.tar.gz.
File metadata
- Download URL: vigorish-0.7.0.tar.gz
- Upload date:
- Size: 2.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 39d405f810e7dabbf451af0832f0d95c0bd49c606618491b610988eee7c34cd8 |
| MD5 | 407b53314f558a2ed097c60b972ef088 |
| BLAKE2b-256 | e30582491ed935fdd412331e2e20ee5ca5ce4929c15e353c3f36e03b8189bb27 |
File details
Details for the file vigorish-0.7.0-py3-none-any.whl.
File metadata
- Download URL: vigorish-0.7.0-py3-none-any.whl
- Upload date:
- Size: 2.8 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2cdd0139ffb52b45f5f67a610c1fa922a0c3e2d4c569e1cd901fff71c59f07a5 |
| MD5 | 05752b637244b00f8177393febc87fff |
| BLAKE2b-256 | c96c4c77fc210d476893e0ecd3d2e2d86ba4c4b7d2178f9fce3feae7891d6b37 |
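A downloaded file can be checked against the SHA256 digests listed above before installing it. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the script assumes the file has already been saved to the local path you pass in.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_digest("vigorish-0.7.0.tar.gz", "39d405f810e7dabbf451af0832f0d95c0bd49c606618491b610988eee7c34cd8")` should return `True` for an intact copy of the source distribution, assuming it was saved under that file name in the current directory.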