Bayesian models for football leagues

bpl

bpl is a Python 3 library for fitting Bayesian versions of the Dixon & Coles (1997) model to football match data. Models are fitted using the Stan library.

Installation

pip install bpl

Usage

bpl provides a class BPLModel that can be used to forecast the outcome of football matches. Data should be provided to the model as a pandas dataframe, with columns home_team, away_team, home_goals and away_goals. You can also optionally provide a set of numerical covariates for each team (e.g. their ratings on FIFA) and these will be used in the fit. Example usage:

import bpl
import numpy as np
import pandas as pd

df_train = pd.read_csv("<path-to-training-data>")
df_X = pd.read_csv("<path-to-team-level-covariates>")
forecaster = bpl.BPLModel(data=df_train, X=df_X)
forecaster.fit(seed=42)

# calculate the probability that team 1 beats team 2 3-0 at home:
forecaster.score_probability("Team 1", "Team 2", 3, 0)

# calculate the probabilities of a home win, an away win and a draw:
forecaster.overall_probabilities("Team 1", "Team 2")

# compute home win, away win and draw probabilities for a collection of matches:
df_test = pd.read_csv("<path-to-test-data>") # must have columns "home_team" and "away_team"
forecaster.predict_future_matches(df_test)

# add a new, previously unseen team to the model by sampling from the prior
X_3 = np.array([0.1, -0.5, 3.0]) # the covariates for the new team
forecaster.add_new_team("Team 3", X=X_3, seed=43)
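The column requirements described above can be checked before fitting. The snippet below is a minimal sketch, not part of bpl: the helper `missing_columns` and the two constants are hypothetical names introduced here for illustration, and the only facts taken from this README are the column names themselves.

```python
# Hypothetical helper, not part of the bpl API.
# Column names are the ones this README says the model expects.
REQUIRED_TRAIN_COLUMNS = ("home_team", "away_team", "home_goals", "away_goals")
REQUIRED_TEST_COLUMNS = ("home_team", "away_team")


def missing_columns(columns, required=REQUIRED_TRAIN_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


# Example with plain lists of column names.
print(missing_columns(["home_team", "away_team", "home_goals"]))
print(missing_columns(["home_team", "away_team"], REQUIRED_TEST_COLUMNS))
```

With pandas, the same check would be `missing_columns(df_train.columns)` for the training data and `missing_columns(df_test.columns, REQUIRED_TEST_COLUMNS)` for the matches to predict.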

Statistical model

The statistical model behind bpl is a slight variation on the Dixon & Coles approach. The likelihood is:

equation

where y_h and y_a are the number of goals scored by the home team and the away team, respectively. a_i is the attacking aptitude of team i and b_i is the defending aptitude of team i. gamma_i represents the home advantage for team i, and tau is a correlation term that was introduced by Dixon and Coles to produce more realistic scorelines in low-scoring matches. The model uses the following bivariate, hierarchical prior for a and b:

equation

X_i is a set of (optional) team-level covariates (these could be, for example, the attack and defence ratings of team i on FIFA). beta are coefficient vectors, and mu_b is an offset for the defence parameter. rho encodes the correlation between a and b, since teams that are strong at attacking also tend to be strong at defending. The home advantage has a log-normal prior:

equation

Finally, the hyper-priors are:

equation
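To make the role of the tau term concrete, here is a small sketch of the scoreline probability in the original Dixon & Coles (1997) form, using only the Python standard library. It is not bpl's implementation. The multiplicative rates (home rate = a_home * b_away * gamma_home, away rate = a_away * b_home, with b read as a tendency to concede) follow the original paper and are an assumption about how bpl combines its parameters. The function names and the `dependence` argument are hypothetical; `dependence` is the Dixon & Coles low-score parameter, not the rho that correlates a and b in the prior above.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def tau(y_h, y_a, home_rate, away_rate, dependence):
    """Dixon & Coles (1997) correction; differs from 1 only for 0-0, 0-1, 1-0, 1-1."""
    if y_h == 0 and y_a == 0:
        return 1.0 - home_rate * away_rate * dependence
    if y_h == 0 and y_a == 1:
        return 1.0 + home_rate * dependence
    if y_h == 1 and y_a == 0:
        return 1.0 + away_rate * dependence
    if y_h == 1 and y_a == 1:
        return 1.0 - dependence
    return 1.0


def dixon_coles_score_probability(y_h, y_a, home_rate, away_rate, dependence):
    """Probability of the scoreline y_h - y_a (home goals first)."""
    return (
        tau(y_h, y_a, home_rate, away_rate, dependence)
        * poisson_pmf(y_h, home_rate)
        * poisson_pmf(y_a, away_rate)
    )


# Illustrative parameter values only; they are not fitted to any data.
a_home, b_home, gamma_home = 1.4, 0.9, 1.3
a_away, b_away = 1.1, 1.0
home_rate = a_home * b_away * gamma_home  # assumed multiplicative form
away_rate = a_away * b_home

print(dixon_coles_score_probability(3, 0, home_rate, away_rate, dependence=-0.1))
```

The correction only moves probability between the four lowest scorelines, so the probabilities over all scorelines still sum to one. Setting `dependence` to zero recovers two independent Poisson distributions.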
