

Project description

This project lets you scrape play-by-play and shift data from the National Hockey League (NHL) API and website for all regular season and playoff games since the 2010-2011 season (further testing is needed to confirm that it works for earlier seasons).

Prerequisites

You need Python installed for this, version 3.6.0 or later.

If you don’t have Python installed on your machine, I’d recommend installing it through the Anaconda distribution (https://www.continuum.io/downloads). Anaconda comes with many libraries pre-installed, so it is easier to get started.
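To confirm that the interpreter you are about to use meets the 3.6.0 minimum, a small check along these lines may help. It is only a sketch: the helper name `python_is_supported` is made up for illustration and is not part of this project.

```python
import sys

# Minimum version stated in the prerequisites.
MIN_VERSION = (3, 6, 0)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given version tuple meets the minimum version.

    Hypothetical helper, not part of scrape_functions.
    """
    # Compare only (major, minor, micro); ignore release level and serial.
    return tuple(version_info[:3]) >= tuple(minimum)


if python_is_supported():
    print("Python version is new enough")
else:
    print("Python 3.6.0 or later is required")
```

Running `python --version` from the command line gives the same information without any code.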

How to Use

First, download this repository onto your computer.

Then open the command line or terminal and navigate to the folder that contains the code. Type “python” and press Enter to open the interactive Python console.

Next, import the file that contains the functions for scraping the data. That file is called scrape_functions.py, so type the following and press Enter:

    import scrape_functions

There are three functions for scraping data (after any of them finishes running, the scraped data can be found in the folder that contains your code):

1. scrape_seasons:

This function scrapes one or more full seasons. It takes two arguments:

  1. ‘seasons’ - List of seasons you want to scrape. (Note: A given season is referred to by the first of the two years it spans, so you would refer to the 2016-2017 season as 2016.)

  2. ‘if_scrape_shifts’ - Boolean indicating whether or not you want to scrape the shifts too.

    # Scrapes 2015 & 2016 season with shifts
    scrape_functions.scrape_seasons([2015, 2016], True)
    
    # Scrapes 2016 season without shifts
    scrape_functions.scrape_seasons([2016], False)
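If your season labels are written in the "2016-2017" style, the convention above (use the first of the two years) can be sketched as a small conversion step. The helper `season_start_year` is hypothetical and not part of scrape_functions.

```python
def season_start_year(season_label):
    """Convert a label such as '2016-2017' to its starting year, 2016.

    Hypothetical helper, not part of scrape_functions.
    """
    start, end = (int(part) for part in season_label.split("-"))
    # A season spans two consecutive calendar years.
    if end != start + 1:
        raise ValueError("not a valid season label: " + repr(season_label))
    return start


seasons = [season_start_year(label) for label in ["2015-2016", "2016-2017"]]
print(seasons)  # [2015, 2016]

# The resulting list is in the form scrape_seasons expects, e.g.:
#     scrape_functions.scrape_seasons(seasons, True)
```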

2. scrape_games:

This function is used to scrape any collection of games you want. It takes two arguments:

  1. ‘games’ - List of games you want to scrape. A game is identified by the game ID used by the NHL (ex: 2016020001). The IDs for the games on a given day can be found here: https://statsapi.web.nhl.com/api/v1/schedule?startDate=2016-10-12&endDate=2016-10-12 (adjust the start and end dates in the URL to find the game you are looking for).

  2. ‘if_scrape_shifts’ - Boolean indicating whether or not you want to scrape the shifts too.

    # Scrapes first game of 2014, 2015, and 2016 seasons with shifts
    scrape_functions.scrape_games([2014020001, 2015020001, 2016020001], True)
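Rather than editing the schedule URL by hand, you could build it from the two dates. The sketch below only assembles the URL string shown above and does not fetch it; the function name `schedule_url` is an assumption made for illustration and is not part of scrape_functions.

```python
from urllib.parse import urlencode

# Schedule endpoint taken from the URL given in the 'games' argument description.
SCHEDULE_ENDPOINT = "https://statsapi.web.nhl.com/api/v1/schedule"


def schedule_url(start_date, end_date):
    """Build the schedule URL for dates written as yyyy-mm-dd.

    Hypothetical helper, not part of scrape_functions.
    """
    query = urlencode({"startDate": start_date, "endDate": end_date})
    return SCHEDULE_ENDPOINT + "?" + query


print(schedule_url("2016-10-12", "2016-10-12"))
```

Opening the printed URL in a browser is one way to look up the game IDs to pass to scrape_games.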

3. scrape_date_range:

This function is used to scrape all games in a given date range. All dates must be written in the format yyyy-mm-dd (ex: ‘2016-10-20’). It takes three arguments:

  1. ‘from_date’ - Start date of the interval you want to scrape

  2. ‘to_date’ - End date of the interval you want to scrape

  3. ‘if_scrape_shifts’ - Boolean indicating whether or not you want to scrape the shifts too.

    # Scrapes games between 2016-10-10 and 2016-10-20 without shifts
    scrape_functions.scrape_date_range('2016-10-10', '2016-10-20', False)
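Because the dates must follow the yyyy-mm-dd format, it can be useful to check them before starting a long scrape. The following is a minimal sketch using only the standard library; `parse_scrape_date` and `check_date_range` are hypothetical names, and the rule that the start date must not fall after the end date is an assumption rather than documented behavior of scrape_date_range.

```python
from datetime import datetime


def parse_scrape_date(text):
    """Parse a yyyy-mm-dd string, raising ValueError on any other format.

    Hypothetical helper, not part of scrape_functions.
    """
    parsed = datetime.strptime(text, "%Y-%m-%d").date()
    # strptime also accepts unpadded values such as '2016-1-5', so compare
    # against the zero-padded ISO form to enforce yyyy-mm-dd strictly.
    if parsed.isoformat() != text:
        raise ValueError("expected yyyy-mm-dd, got " + repr(text))
    return parsed


def check_date_range(from_date, to_date):
    """Return the two date strings unchanged if they form a valid range."""
    start = parse_scrape_date(from_date)
    end = parse_scrape_date(to_date)
    # Assumption: an interval that ends before it starts is a mistake.
    if start > end:
        raise ValueError("from_date must not be after to_date")
    return from_date, to_date


from_date, to_date = check_date_range("2016-10-10", "2016-10-20")
# The checked strings can then be passed on, e.g.:
#     scrape_functions.scrape_date_range(from_date, to_date, False)
```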
