Functions to scrape ice hockey data and statistics from swehockey
swehockey_scraper
This package collects data by web scraping from stats.swehockey.se, the website where the Swedish Ice Hockey Federation publishes match statistics.
This package is intended for personal use only.
I try to keep the package up to date. Any change to the structure of the website likely means that the functions need to be updated as well. If you find something that no longer works, please get in touch so that we can fix it.
Getting started
The package can be installed with pip:
pip install swehockey_scraper
In Python, import the module with:
import swehockey.swehockey_scraper as swe
To see a description of the functions in the package, run:
help(swe)
The functions are designed to be used together: the output of one function is the input to the next.
Data structure
Data on stats.swehockey.se is identified by two keys: season_id and game_id.
Season ID
Each combination of season and league has its own schedule id, which is used as the season id. It is the last part of the schedule URL. For example, in https://stats.swehockey.se/ScheduleAndResults/Schedule/6108 the season id is 6108.
Game ID
Each game has a URL of the form https://stats.swehockey.se/Game/Events/252961. The game id is the last part of the URL, here 252961.
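Since both ids are simply the last path segment of a URL, they can be read off programmatically. The following is a minimal sketch; `id_from_url` is a hypothetical helper and not part of swehockey_scraper. It returns the id as an int, which is an assumption: convert it if the package functions expect another type.

```python
from urllib.parse import urlparse


def id_from_url(url: str) -> int:
    """Return the trailing numeric id of a stats.swehockey.se URL.

    Hypothetical helper, not part of swehockey_scraper.
    """
    # Take the last non-empty path segment, e.g. "6108" or "252961".
    path = urlparse(url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    if not last_segment.isdigit():
        raise ValueError(f"no numeric id at the end of {url!r}")
    return int(last_segment)


season_id = id_from_url("https://stats.swehockey.se/ScheduleAndResults/Schedule/6108")
game_id = id_from_url("https://stats.swehockey.se/Game/Events/252961")
print(season_id, game_id)  # 6108 252961
```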
Functions
getGames(season_id)
Input is a list of season ids. Returns a dataframe containing all games for the given seasons, together with their results.
cleanGames(df_games)
Input is a dataframe with the structure returned by getGames(). This step cleans the data and adds columns used in further processing.
getTeamData(df_games_clean)
Input is a dataframe with the structure returned by cleanGames(). This step builds a dataframe at team level. It calculates season-specific metrics for each team, head-to-head comparisons and table positions.
getGameData(df_games_clean)
Input is a list of game_ids (these can, for example, be extracted from the output of getGames()).
This function extracts game-specific data such as penalties, goals and shot statistics.
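The first three functions can be chained as described above. The following is a sketch only: `run_pipeline` is a hypothetical wrapper, not part of the package, and it assumes nothing beyond the function names and the input/output order documented here. The scraper module is passed in as an argument so the wrapper itself does not need network access to be defined.

```python
def run_pipeline(scraper, season_ids):
    """Chain getGames -> cleanGames -> getTeamData.

    `scraper` is any object exposing those three functions, for example
    the module imported as `swe`. Hypothetical wrapper for illustration.
    """
    df_games = scraper.getGames(season_ids)          # all games with results
    df_games_clean = scraper.cleanGames(df_games)    # cleaned, with extra columns
    df_teams = scraper.getTeamData(df_games_clean)   # team-level metrics
    return df_games_clean, df_teams
```

With the package installed and imported as swe, this could be called as run_pipeline(swe, [6108]), using the season id from the example URL above.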
Example Notebook
See this notebook for examples of how to use the package, and in what order you can run the functions.