Functions to scrape ice hockey data and statistics from swehockey

Project description

swehockey_scraper

This package collects data by web scraping from stats.swehockey.se, the website where the Swedish Ice Hockey Federation publishes match statistics.

This package is intended for personal use only.

I try to keep this package up to date. Any change to the structure of the website likely means that the functions need to be updated as well. If you find something that no longer works, please get in touch so that we can fix it.

Getting started

The package can be installed with pip:
pip install swehockey_scraper

In Python, import the module with:
import swehockey.swehockey_scraper as swe

To see a description of the functions in the package, run:
help(swe)

The functions are meant to be used together: the output of one function is the input to the next.

Data structure

The swehockey site identifies data with two keys: season_id and game_id.

Season ID

Each combination of season and league has its own schedule id, referred to here as the season id. It is the last part of the schedule URL. For example, in https://stats.swehockey.se/ScheduleAndResults/Schedule/6108 the season id is 6108.
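
As a small illustration, the season id can be read from a schedule URL with the standard library. This is only a sketch: `season_id_from_url` is a hypothetical helper and not part of swehockey_scraper, and it assumes the id is always the last path segment, as in the example URL above.

```python
from urllib.parse import urlparse


def season_id_from_url(url: str) -> int:
    """Return the numeric id at the end of a schedule URL (hypothetical helper)."""
    # The path looks like /ScheduleAndResults/Schedule/6108; take the last segment.
    last_segment = urlparse(url).path.rstrip("/").split("/")[-1]
    if not last_segment.isdigit():
        raise ValueError(f"no numeric season id at the end of {url!r}")
    return int(last_segment)


season_id = season_id_from_url(
    "https://stats.swehockey.se/ScheduleAndResults/Schedule/6108"
)
print(season_id)  # 6108
```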

Game ID

Each game has a URL of the form https://stats.swehockey.se/Game/Events/252961. The game id is the last part of the URL, here 252961.
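
The mapping between a game id and its URL can be sketched in both directions. The helpers `game_id_from_url` and `game_url` are hypothetical names used for illustration and are not part of the package; the pattern assumes the URL ends with the id, as in the example above, so URLs with a query string would be rejected.

```python
import re

# Matches the trailing "/Game/Events/<digits>" part of a game URL.
GAME_URL_PATTERN = re.compile(r"/Game/Events/(\d+)/?$")


def game_id_from_url(url: str) -> int:
    """Extract the game id from a game events URL (hypothetical helper)."""
    match = GAME_URL_PATTERN.search(url)
    if match is None:
        raise ValueError(f"not a game events URL: {url!r}")
    return int(match.group(1))


def game_url(game_id: int) -> str:
    """Build the game events URL for a game id (hypothetical helper)."""
    return f"https://stats.swehockey.se/Game/Events/{int(game_id)}"


print(game_id_from_url("https://stats.swehockey.se/Game/Events/252961"))  # 252961
print(game_url(252961))  # https://stats.swehockey.se/Game/Events/252961
```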

Functions

getGames(season_id)

Input is a list of season ids. Returns a dataframe containing all games for the given seasons, together with their results.

cleanGames(df_games)

Input is a dataframe with the structure returned by getGames(). This step cleans up the data and adds columns needed for further processing.

getTeamData(df_games_clean)

Input is a dataframe with the structure returned by cleanGames(). This step builds a dataframe at team level. It calculates season-specific metrics for each team, head-to-head comparisons and table positions.
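
The three functions described so far can be chained as follows. This is a minimal sketch based only on the descriptions above: `build_team_table` is a hypothetical wrapper, the module is passed in as an argument so the wrapper itself has no import-time dependency, and the exact return types should be checked with help(swe).

```python
def build_team_table(swe, season_ids):
    """Chain getGames -> cleanGames -> getTeamData (hypothetical wrapper).

    `swe` is the imported swehockey_scraper module; `season_ids` is a list
    of season ids such as [6108].
    """
    df_games = swe.getGames(season_ids)         # all games with results
    df_games_clean = swe.cleanGames(df_games)   # cleaned data with extra columns
    df_teams = swe.getTeamData(df_games_clean)  # team-level metrics
    return df_games_clean, df_teams


# Usage (requires the package to be installed and network access):
# import swehockey.swehockey_scraper as swe
# df_games_clean, df_teams = build_team_table(swe, [6108])
```

Returning the cleaned games alongside the team table keeps the intermediate result available for later steps.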

getGameData(df_games_clean)

Input is a list of game ids (for example, extracted from the output of getGames).
This function extracts game-specific data such as penalties, goals and shot statistics.
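
One way to prepare the list of game ids is sketched below. The column name `game_id` is an assumption and should be checked against the actual output of getGames(); `unique_game_ids` is a hypothetical helper, not part of the package. It works with any object that supports column lookup by name, such as a pandas dataframe.

```python
def unique_game_ids(df_games, column="game_id"):
    """Return the game ids from `df_games[column]` as ints, without duplicates.

    The default column name "game_id" is an assumption; the original order
    of the ids is preserved.
    """
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(int(game_id) for game_id in df_games[column]))


# Usage (requires the package to be installed and network access):
# import swehockey.swehockey_scraper as swe
# df_games = swe.getGames([6108])
# game_data = swe.getGameData(unique_game_ids(df_games))
```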

Example Notebook

See this notebook for examples of how to use the package, and in what order you can run the functions.


Built Distribution

swehockey_scraper-1.5-py3-none-any.whl (6.1 kB)


Hashes for swehockey_scraper-1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 1a4f9b72b226aee9972c184070a4eb90407496ab2a8a0ccfd68dca46e6a36fc6
MD5 c40a3ff43d348e40da6c6ecd814579a5
BLAKE2b-256 efb14ded83a3559080881da7abf78de8a79753b267298059faa1e25ce0bb79f3
