Train and analyse xG models on soccer event stream data
Soccer xG
A Python package for training and analyzing expected goals (xG) models in soccer.
About
This repository contains the code and models for our series on the analysis of xG models:
- How data availability affects the ability to learn good xG models
- Illustrating the interplay between features and models in xG
- How data quality affects xG
In particular, it contains code for experimenting with an exhaustive set of features and machine learning pipelines for predicting xG values from soccer event stream data. Since we rely on the SPADL language as input format, soccer_xg currently supports event streams provided by Opta, Wyscout, and StatsBomb.
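As a rough illustration of the kind of feature an xG pipeline can derive from event stream data, the sketch below computes two classic shot descriptors: the distance to goal and the angle subtended by the goalposts. It assumes SPADL's 105 m by 68 m pitch, shots oriented so that the attacked goal is centred at (105, 34), and a 7.32 m wide goal. The function name `shot_features` is hypothetical and is not part of soccer_xg.

```python
import math

# Assumed pitch geometry (SPADL coordinates are expressed on a 105 m x 68 m pitch).
PITCH_LENGTH = 105.0
PITCH_WIDTH = 68.0
GOAL_WIDTH = 7.32  # standard goal width in metres


def shot_features(x, y):
    """Return (distance, angle) for a shot at (x, y) towards the goal at x = 105.

    Hypothetical helper for illustration; not part of the soccer_xg API.
    """
    goal_y = PITCH_WIDTH / 2
    dx = PITCH_LENGTH - x
    dy = goal_y - y
    distance = math.hypot(dx, dy)
    # Angle between the lines from the shot location to the two goalposts.
    to_far_post = math.atan2(goal_y + GOAL_WIDTH / 2 - y, dx)
    to_near_post = math.atan2(goal_y - GOAL_WIDTH / 2 - y, dx)
    angle = abs(to_far_post - to_near_post)
    return distance, angle


# Example: a shot from the penalty spot, 11 m in front of the goal centre.
distance, angle = shot_features(94.0, 34.0)
print(f"distance={distance:.2f} m, angle={angle:.3f} rad")
```

Features like these would then be fed to a classifier that outputs the probability of a goal; the pipelines shipped with the package use their own feature definitions.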
Getting started
The recommended way to install soccer_xg is to simply use pip:
$ pip install soccer_xg
Subsequently, a basic xG model can be trained and applied with the code below:
from itertools import product
from soccer_xg import XGModel, DataApi
# load the data
provider = 'wyscout_opensource'
leagues = ['ENG', 'ESP', 'ITA', 'GER', 'FRA']
seasons = ['1718']
api = DataApi([f"data/{provider}/spadl-{provider}-{l}-{s}.h5"
               for (l, s) in product(leagues, seasons)])
# load the default pipeline
model = XGModel()
# train the model
model.train(api, training_seasons=[('ESP', '1718'), ('ITA', '1718'), ('GER', '1718')])
# validate the model
model.validate(api, validation_seasons=[('ENG', '1718')])
# predict xG values
model.estimate(api, game_ids=[2500098])
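This README does not state which metrics the validation step reports. Probabilistic predictions such as xG values are commonly scored with the Brier score and the log loss, so the sketch below shows how those two metrics could be computed by hand. The function names and the toy data are assumptions made for illustration, not part of soccer_xg.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the outcomes under the predictions."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Toy data, invented for illustration: 1 = goal, 0 = no goal.
goals = [1, 0, 0, 1]
xg_values = [0.8, 0.1, 0.3, 0.6]

brier = brier_score(goals, xg_values)
loss = log_loss(goals, xg_values)
print(f"Brier score: {brier:.3f}, log loss: {loss:.3f}")
```

Lower is better for both metrics; a model that always predicts the observed outcome with certainty scores 0 on each.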
Although this default pipeline is suitable for computing xG, it is by no means the best possible model.
The notebook 4-creating-custom-xg-pipelines illustrates how you can train your own xG models. Alternatively, you can use one of the four pipelines from our blog post series. These can be loaded with:
XGModel.load_model('openplay_logreg_basic')
XGModel.load_model('openplay_xgboost_basic')
XGModel.load_model('openplay_logreg_advanced')
XGModel.load_model('openplay_xgboost_advanced')
Note that these models only estimate xG for shots from open play. To compute xG values for all shot types, you will have to combine them with a pipeline for penalties and a pipeline for free kicks.
from soccer_xg import xg
openplay_model = xg.XGModel.load_model('openplay_xgboost_advanced') # custom pipeline for open play shots
penalty_model = xg.PenaltyXGModel() # default pipeline for penalties
freekick_model = xg.FreekickXGModel() # default pipeline for free kicks
# combine the three pipelines into a single model
model = xg.XGModel()
model.model = [openplay_model, penalty_model, freekick_model]
model.train(api, training_seasons=...)
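Conceptually, combining pipelines amounts to routing each shot to the model that handles its shot type. The sketch below illustrates that idea in plain Python; it does not reproduce how soccer_xg implements it. The classes `ConversionRateModel` and `ShotTypeRouter`, the `shot_type` and `goal` keys, and the toy data are all hypothetical.

```python
from collections import defaultdict


class ConversionRateModel:
    """Baseline that predicts the training conversion rate for every shot."""

    def __init__(self):
        self.rate = 0.0

    def train(self, shots):
        # Fraction of training shots that resulted in a goal.
        self.rate = sum(s["goal"] for s in shots) / len(shots) if shots else 0.0

    def estimate(self, shot):
        return self.rate


class ShotTypeRouter:
    """Send each shot to the sub-model registered for its shot type."""

    def __init__(self, models):
        self.models = models  # maps shot type -> model

    def train(self, shots):
        by_type = defaultdict(list)
        for shot in shots:
            by_type[shot["shot_type"]].append(shot)
        for shot_type, model in self.models.items():
            model.train(by_type[shot_type])

    def estimate(self, shots):
        return [self.models[s["shot_type"]].estimate(s) for s in shots]


# Toy data, invented for illustration only.
shots = (
    [{"shot_type": "open_play", "goal": g} for g in (0, 0, 1, 0)]
    + [{"shot_type": "penalty", "goal": g} for g in (1, 1, 0, 1)]
    + [{"shot_type": "freekick", "goal": g} for g in (0, 0)]
)

router = ShotTypeRouter({
    "open_play": ConversionRateModel(),
    "penalty": ConversionRateModel(),
    "freekick": ConversionRateModel(),
})
router.train(shots)
predictions = router.estimate([shots[0], shots[4], shots[8]])
print(predictions)
```

In practice each sub-model would be a full pipeline rather than a constant-rate baseline, but the routing logic stays the same.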
For developers
Create venv and install deps
make init
Install the git pre-commit hook
make precommit_install
Run linters, autoformat, tests etc.
make pretty lint test
Bump new version
make bump_major
make bump_minor
make bump_patch
License
Copyright (c) DTAI - KU Leuven – All rights reserved.
Licensed under the Apache License, Version 2.0
Written by Pieter Robberechts, 2020