Train and analyse xG models on soccer event stream data

Soccer xG

A Python package for training and analyzing expected goals (xG) models in soccer.




About

This repository contains the code and models for our blog post series on the analysis of xG models.

In particular, it contains code for experimenting with an extensive set of features and machine learning pipelines for predicting xG values from soccer event stream data. Because soccer_xg uses the SPADL language as its input format, it currently supports event streams provided by Opta, Wyscout, and StatsBomb.

Getting started

The recommended way to install soccer_xg is with pip:

$ pip install soccer_xg

Subsequently, a basic xG model can be trained and applied with the code below:

from itertools import product
from soccer_xg import XGModel, DataApi

# load the data: one SPADL HDF5 file per (league, season) pair
provider = 'wyscout_opensource'
leagues = ['ENG', 'ESP', 'ITA', 'GER', 'FRA']
seasons = ['1718']
api = DataApi([
    f"data/{provider}/spadl-{provider}-{league}-{season}.h5"
    for (league, season) in product(leagues, seasons)
])
# load the default pipeline
model = XGModel()
# train the model
model.train(api, training_seasons=[('ESP', '1718'), ('ITA', '1718'), ('GER', '1718')])
# validate the model
model.validate(api, validation_seasons=[('ENG', '1718')])
# predict xG values
model.estimate(api, game_ids=[2500098])
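To make the data layout and the season arguments more concrete, the sketch below expands the file paths built above and derives a league-based holdout split using only the standard library. The `holdout` variable and the "train on every other league" rule are assumptions made for illustration; the example above trains on three leagues only, so this is a different split, not the one used by the authors.

```python
from itertools import product

provider = "wyscout_opensource"
leagues = ["ENG", "ESP", "ITA", "GER", "FRA"]
seasons = ["1718"]

# One path per (league, season) pair, in the same order as in the example above.
paths = [
    f"data/{provider}/spadl-{provider}-{league}-{season}.h5"
    for league, season in product(leagues, seasons)
]

# Hypothetical split: hold out one league for validation, train on all others.
holdout = "ENG"
all_pairs = list(product(leagues, seasons))
training_seasons = [(league, season) for league, season in all_pairs if league != holdout]
validation_seasons = [(league, season) for league, season in all_pairs if league == holdout]

for path in paths:
    print(path)
print("train on:", training_seasons)
print("validate on:", validation_seasons)
```

The resulting `training_seasons` and `validation_seasons` lists have the same shape as the (league, season) tuples passed to `model.train` and `model.validate` above.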

Although this default pipeline is suitable for computing xG, it is by no means the best possible model. The notebook 4-creating-custom-xg-pipelines illustrates how to train your own xG models. Alternatively, you can use one of the four pipelines from our blog post series, which can be loaded with:

XGModel.load_model('openplay_logreg_basic')
XGModel.load_model('openplay_xgboost_basic')
XGModel.load_model('openplay_logreg_advanced')
XGModel.load_model('openplay_xgboost_advanced')

Note that these models are meant to estimate xG for shots from open play only. To compute xG values for all shot types, you have to combine them with pipelines for penalties and free kicks:

from soccer_xg import xg

openplay_model = xg.XGModel.load_model('openplay_xgboost_advanced') # custom pipeline for open play shots
penalty_model = xg.PenaltyXGModel() # default pipeline for penalties
freekick_model = xg.FreekickXGModel() # default pipeline for free kicks

# combine the three pipelines into a single model covering all shot types
model = xg.XGModel()
model.model = [openplay_model, penalty_model, freekick_model]
model.train(api, training_seasons=...)
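The idea behind combining pipelines can be pictured as routing each shot to the estimator for its shot type. The sketch below is a conceptual illustration only and does not reflect how soccer_xg implements this internally: `combine_by_shot_type`, the `"type"` key, and the constant values returned by the placeholder estimators are all hypothetical and made up for the example.

```python
from typing import Callable, Dict, List

Shot = Dict[str, object]
Estimator = Callable[[Shot], float]


def combine_by_shot_type(estimators: Dict[str, Estimator]) -> Callable[[List[Shot]], List[float]]:
    """Return a function that sends each shot to the estimator registered for its type."""

    def estimate(shots: List[Shot]) -> List[float]:
        values = []
        for shot in shots:
            shot_type = shot["type"]
            if shot_type not in estimators:
                raise KeyError(f"no xG estimator registered for shot type {shot_type!r}")
            values.append(estimators[shot_type](shot))
        return values

    return estimate


# Placeholder estimators with made-up constants (not fitted values).
estimators: Dict[str, Estimator] = {
    "open_play": lambda shot: 0.10,
    "penalty": lambda shot: 0.75,
    "free_kick": lambda shot: 0.05,
}

estimate = combine_by_shot_type(estimators)
shots = [{"type": "open_play"}, {"type": "penalty"}, {"type": "free_kick"}]
print(estimate(shots))
```

Failing loudly on an unregistered shot type is a deliberate choice in this sketch: silently falling back to the open-play estimator would hide exactly the gap that the note above warns about.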

For developers

Create venv and install deps

make init

Install git precommit hook

make precommit_install

Run linters, autoformat, tests etc.

make pretty lint test

Bump the version

make bump_major
make bump_minor
make bump_patch

License

Copyright (c) DTAI - KU Leuven – All rights reserved.
Licensed under the Apache License, Version 2.0
Written by Pieter Robberechts, 2020
