
A Python package for scraping & analyzing sports statistics


chickenstats




About

chickenstats is a Python package for scraping & analyzing sports data. With just a few lines of code:

  • Scrape & manipulate data from various NHL endpoints, leveraging chickenstats.chicken_nhl, which includes a proprietary xG model for shot quality metrics
  • Augment play-by-play data & generate custom aggregations from raw CSV files downloaded from Evolving-Hockey (subscription required) with chickenstats.evolving_hockey

For more in-depth explanations, tutorials, & detailed reference materials, consult the Documentation.


Compatibility

chickenstats requires Python 3.10 or greater & runs on the latest stable versions of Linux, macOS, & Windows operating systems.
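
The Python requirement can be checked before installing by comparing the running interpreter against the 3.10 minimum stated above. This is a minimal sketch using only the standard library; the helper name `meets_minimum` is made up for illustration and is not part of chickenstats.

```python
import sys

# Minimum version taken from the compatibility note above
MINIMUM_PYTHON = (3, 10)


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the given (or running) Python version is at least `minimum`."""
    # Default to the running interpreter; only (major, minor) are compared
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for chickenstats"
    print(f"Python {current}: {status}")
```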


Installation

Very simple - install from PyPI with pip. Best practice is to develop in an isolated virtual environment (conda or otherwise), but who's a chicken to judge?

pip install chickenstats

To confirm the installation & check the installed version (1.7.9.5 as of this release):

pip show chickenstats
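
The same check can be done from inside Python, which is handy in notebooks or scripts. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of chickenstats.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("chickenstats")
    print(found if found else "chickenstats is not installed in this environment")
```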

Usage

chickenstats is structured as two underlying modules, each used with different data sources:

  • chickenstats.chicken_nhl
  • chickenstats.evolving_hockey

The package is under active development - features will be added or modified over time.

chicken_nhl

The chickenstats.chicken_nhl module scrapes & manipulates data directly from various NHL endpoints, with outputs including schedule & game results, rosters, & play-by-play data.

The example below downloads the schedule for the Nashville Predators, keeps the games whose game_state is "OFF", extracts the game IDs, then scrapes play-by-play data for the first ten games that pass the filter.

from chickenstats.chicken_nhl import Season, Scraper

# Create a Season object for the 2023 season
season = Season(2023)

# Download the Nashville schedule & keep games whose game_state is "OFF"
# (filtering on game_state alone does not separate pre-season from regular season games)
nsh_schedule = season.schedule('NSH')
nsh_schedule_reg = nsh_schedule.loc[nsh_schedule.game_state == "OFF"].reset_index(drop=True)

# Extract the first ten game IDs
game_ids = nsh_schedule_reg.game_id.tolist()[:10]

# Create a scraper object using the game IDs
scraper = Scraper(game_ids)

# Scrape play-by-play data
play_by_play = scraper.play_by_play
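
The filter-and-slice step in the example above can be tried offline, without hitting any NHL endpoint. The sketch below runs the same two pandas operations on a small hand-made stand-in for the schedule. The rows and game IDs are invented for illustration, only the `game_id` and `game_state` column names come from the example, and "OFF" is assumed here to mark finished games. The helper name `first_finished_game_ids` is hypothetical.

```python
import pandas as pd


def first_finished_game_ids(schedule, n=10):
    """Return the first n game IDs whose game_state equals "OFF"."""
    finished = schedule.loc[schedule.game_state == "OFF"].reset_index(drop=True)
    return finished.game_id.tolist()[:n]


# Invented stand-in data; these are not real NHL game IDs or results.
# "FUT" is a placeholder for any state other than "OFF".
toy_schedule = pd.DataFrame(
    {
        "game_id": [1001, 1002, 1003, 1004],
        "game_state": ["OFF", "OFF", "FUT", "OFF"],
    }
)

game_ids = first_finished_game_ids(toy_schedule, n=2)
print(game_ids)  # prints [1001, 1002]
```

With a real schedule, the resulting list is what gets passed to Scraper in the example above.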

evolving_hockey

The chickenstats.evolving_hockey module manipulates raw CSV files downloaded from Evolving-Hockey. Starting from their original shifts & play-by-play data, users can add information to each event & aggregate the result into individual & on-ice statistics, including high-danger shooting events, xG & adjusted xG, faceoffs, & changes.

import pandas as pd
from chickenstats.evolving_hockey import prep_pbp, prep_stats, prep_lines

# The prep_pbp function takes the raw event and shifts dataframes
raw_shifts = pd.read_csv('./raw_shifts.csv')
raw_pbp = pd.read_csv('./raw_pbp.csv')

play_by_play = prep_pbp(raw_pbp, raw_shifts)

# You can use the play_by_play dataframe in various aggregations
# These are individual game statistics, including on-ice & usage,
# accounting for teammates & opposition on-ice
individual_game = prep_stats(play_by_play, level='game', teammates=True, opposition=True)

# These are game statistics for forward-line combinations, accounting for opponents on-ice
forward_lines = prep_lines(play_by_play, level='game', position='f', opposition=True)
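
Because prep_pbp needs both raw files, it can help to confirm they exist before loading them. This is a minimal standard-library sketch, assuming the two file names used in the example above; `find_missing` is a hypothetical helper, not part of chickenstats.

```python
from pathlib import Path


def find_missing(paths):
    """Return the paths (as Path objects) that do not point to an existing file."""
    return [Path(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # File names taken from the example above; adjust them to your downloads
    required = ["./raw_shifts.csv", "./raw_pbp.csv"]
    missing = find_missing(required)
    if missing:
        print("Missing input files:", ", ".join(str(p) for p in missing))
    else:
        print("Both raw files found; ready for prep_pbp")
```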

Acknowledgements

chickenstats wouldn't be possible without the support & efforts of countless others. I am obviously extremely grateful, even if there are too many of you to thank individually. However, this chicken will do his best.

First & foremost is my wife - the lovely Mrs. Chicken has been patient, understanding, & supportive throughout the countless hours of development, sometimes to her detriment.

Sincere apologies to the friends & family who have put up with me since my entry into Python, programming, & data analysis in January 2021. Thank you for being excited for me & with me throughout all of this, especially when you've had to fake it...

Thank you to the hockey analytics community on (the artist formerly known as) Twitter. You're producing & reacting to cutting-edge statistical analyses, while providing a supportive, welcoming environment for newcomers. Thank y'all for everything that you do. This is by no means exhaustive, but there are a few people worth calling out specifically:

I'm also grateful to the thriving community of Python educators & open-source contributors on Twitter. Thank y'all for your knowledge & practical advice. Matt Harrison (@mharrison) deserves a special mention for his books on Pandas and XGBoost, both of which are available at his online store. Again, not exhaustive, but others worth thanking individually:

Finally, this library depends on a host of other open-source packages. chickenstats is possible because of the efforts of thousands of individuals, represented below:


