Bayesian models for football leagues
bpl
bpl is a Python 3 library for fitting Bayesian versions of the Dixon & Coles (1997) model to football match data.
It uses Stan to fit the models.
Installation
pip install bpl
Usage
bpl provides a class BPLModel that can be used to forecast the outcome of football matches.
Data should be provided to the model as a pandas DataFrame with the columns home_team, away_team, home_goals and away_goals.
You can also optionally provide a set of numerical covariates for each team (e.g. their FIFA ratings), and these will be used in the fit.
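Before loading a training file, it can help to confirm that its header carries the four match columns listed above. The following is a minimal sketch using only the standard library; the helper name missing_columns and the sample rows are illustrative and are not part of bpl.

```python
import csv
import io

# Column names taken from the data description above.
REQUIRED_COLUMNS = ("home_team", "away_team", "home_goals", "away_goals")


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [name for name in required if name not in header]


# Hypothetical training data: one row per match that has been played.
example_csv = (
    "home_team,away_team,home_goals,away_goals\n"
    "Team 1,Team 2,3,0\n"
    "Team 2,Team 1,1,1\n"
)

print(missing_columns(example_csv))
```

An empty list means the file has every column the model expects; any names returned are the ones to add or rename before calling pd.read_csv and passing the result to BPLModel.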
Example usage:
import bpl
import numpy as np
import pandas as pd
df_train = pd.read_csv("<path-to-training-data>")
df_X = pd.read_csv("<path-to-team-level-covariates>")
forecaster = bpl.BPLModel(data=df_train, X=df_X)
forecaster.fit(seed=42)
# calculate the probability that team 1 beats team 2 3-0 at home:
forecaster.score_probability("Team 1", "Team 2", 3, 0)
# calculate the probabilities of a home win, an away win and a draw:
forecaster.overall_probabilities("Team 1", "Team 2")
# compute home win, away win and draw probabilities for a collection of matches:
df_test = pd.read_csv("<path-to-test-data>") # must have columns "home_team" and "away_team"
forecaster.predict_future_matches(df_test)
# add a new, previously unseen team to the model by sampling from the prior
X_3 = np.array([0.1, -0.5, 3.0]) # the covariates for the new team
forecaster.add_new_team("Team 3", X=X_3, seed=43)
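For intuition about how scoreline probabilities relate to home win, draw and away win probabilities, the sketch below sums a scoreline distribution over a grid of results. It is not bpl's implementation (how the library computes these quantities internally is not described here); the function names, the independent-Poisson stand-in distribution and the goal rates are all assumptions made for illustration.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def outcome_probabilities(score_prob, max_goals=15):
    """Sum score_prob(home_goals, away_goals) over all scorelines up to max_goals."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = score_prob(h, a)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return {"home_win": home_win, "draw": draw, "away_win": away_win}


def toy_score_prob(h, a):
    """Stand-in scoreline distribution: independent Poisson goals, made-up rates."""
    return poisson_pmf(h, 1.6) * poisson_pmf(a, 1.1)


probs = outcome_probabilities(toy_score_prob)
print(probs)
```

Truncating the grid at max_goals discards a small amount of probability mass, so the three values sum to slightly less than one; raising max_goals shrinks that gap.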
Statistical model
The statistical model behind bpl is a slight variation on the Dixon & Coles approach.
The likelihood is:
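Assuming the standard Dixon & Coles factorisation, with subscripts h and a denoting the home and away teams, a form consistent with the definitions that follow is:

```latex
p(y_h, y_a) = \tau(y_h, y_a)
  \times \mathrm{Poisson}(y_h \mid a_h \, b_a \, \gamma_h)
  \times \mathrm{Poisson}(y_a \mid a_a \, b_h)
```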
where y_h and y_a are the number of goals scored by the home team and the away team, respectively. a_i is the attacking aptitude of team i and b_i is the defending aptitude of team i. gamma_i represents the home advantage for team i, and tau is a correlation term that was introduced by Dixon and Coles to produce more realistic scorelines in low-scoring matches.
The model uses the following bivariate, hierarchical prior for a and b:
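A sketch of one such prior, assuming it is placed on the logarithms of a_i and b_i so that both stay positive. The scales sigma_a and sigma_b and the subscripts on beta are assumed names that do not appear in the text:

```latex
\begin{pmatrix} \log a_i \\ \log b_i \end{pmatrix}
\sim \mathcal{N}\left(
  \begin{pmatrix} \beta_a \cdot X_i \\ \mu_b + \beta_b \cdot X_i \end{pmatrix},
  \begin{pmatrix}
    \sigma_a^2 & \rho \, \sigma_a \sigma_b \\
    \rho \, \sigma_a \sigma_b & \sigma_b^2
  \end{pmatrix}
\right)
```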
X_i are a set of (optional) team-level covariates (these could be, for example, the attack and defence ratings of team i on FIFA). beta are coefficient vectors, and mu_b is an offset for the defence parameter. rho encodes the correlation between a and b, since teams that are strong at attacking also tend to be strong at defending.
The home advantage has a log-normal prior:
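Written out, with mu_gamma and sigma_gamma as assumed names for the location and scale on the log scale:

```latex
\log \gamma_i \sim \mathcal{N}(\mu_\gamma, \sigma_\gamma^2)
```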
Finally, the hyper-priors are
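To make the likelihood concrete, the sketch below evaluates a Dixon & Coles style scoreline probability for fixed parameter values. It is not bpl's implementation: it uses point values rather than posterior samples, it assumes the home rate is a_h * b_a * gamma_h and the away rate is a_a * b_h, and it takes the low-score correction from the original Dixon & Coles (1997) paper. The argument name dependence (chosen to avoid a clash with rho in the prior) and all numerical values are illustrative.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def dixon_coles_tau(home_goals, away_goals, home_rate, away_rate, dependence):
    """Low-score correction factor from Dixon & Coles (1997)."""
    if home_goals == 0 and away_goals == 0:
        return 1.0 - home_rate * away_rate * dependence
    if home_goals == 0 and away_goals == 1:
        return 1.0 + home_rate * dependence
    if home_goals == 1 and away_goals == 0:
        return 1.0 + away_rate * dependence
    if home_goals == 1 and away_goals == 1:
        return 1.0 - dependence
    return 1.0  # scorelines other than 0-0, 0-1, 1-0 and 1-1 are left unchanged


def score_probability(home_goals, away_goals, a_home, b_home, a_away, b_away,
                      gamma_home, dependence):
    """Probability of a scoreline given point values for the team parameters."""
    home_rate = a_home * b_away * gamma_home  # assumed form of the home goal rate
    away_rate = a_away * b_home               # assumed form of the away goal rate
    tau = dixon_coles_tau(home_goals, away_goals, home_rate, away_rate, dependence)
    return tau * poisson_pmf(home_goals, home_rate) * poisson_pmf(away_goals, away_rate)


# Made-up parameter values for a single fixture.
params = dict(a_home=1.4, b_home=0.9, a_away=1.1, b_away=1.0,
              gamma_home=1.3, dependence=-0.1)

print(score_probability(3, 0, **params))
```

The correction only touches the four lowest scorelines, and its adjustments cancel when summed, so the probabilities over all scorelines still add up to one.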
Download files
File details
Details for the file bpl-0.1.1.tar.gz.
File metadata
- Download URL: bpl-0.1.1.tar.gz
- Upload date:
- Size: 29.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.7.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b6b927cdb3e695a1baec7281a8f5580962eef3835306aee82fa268e8dfbc12bb |
| MD5 | 9a1c3f748f42050e53b24230ca8ad1af |
| BLAKE2b-256 | 2c463eaaafb83c7f0c6029d506d970c4c95555a69c3d603bc11974e4687bee34 |
File details
Details for the file bpl-0.1.1-py3-none-any.whl.
File metadata
- Download URL: bpl-0.1.1-py3-none-any.whl
- Upload date:
- Size: 13.0 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.7.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cb2205cd35812cbe36a31b3af01f1dc97bb18a4822ad0b02133e728e62828e92 |
| MD5 | 354a9cdfabd7c36f45c7142120e2446b |
| BLAKE2b-256 | ee3d9acddc153cf619e743fcca757242cc2a56e5cf1af5d56487fc11e9680ec0 |