
A package for soccer modelling


Mezzala

Models for estimating football (soccer) team strength

Install

pip install mezzala

How to use

import mezzala

Fitting a Dixon-Coles team strength model:

First, we need to get some data:

import itertools
import json
import urllib.request


# Use 2016/17 Premier League data from the openfootball repo
url = 'https://raw.githubusercontent.com/openfootball/football.json/master/2016-17/en.1.json'


# Download and parse the JSON, closing the connection when done
with urllib.request.urlopen(url) as response:
    data_raw = json.loads(response.read())

# Reshape the data to just get the matches
data = list(itertools.chain(*[d['matches'] for d in data_raw['rounds']]))

data[0:3]
[{'date': '2016-08-13',
  'team1': 'Hull City AFC',
  'team2': 'Leicester City FC',
  'score': {'ft': [2, 1]}},
 {'date': '2016-08-13',
  'team1': 'Everton FC',
  'team2': 'Tottenham Hotspur FC',
  'score': {'ft': [1, 1]}},
 {'date': '2016-08-13',
  'team1': 'Crystal Palace FC',
  'team2': 'West Bromwich Albion FC',
  'score': {'ft': [0, 1]}}]

Fitting a model

To fit a model with mezzala, you need to create an "adapter". Adapters are used to connect a model to a data source.

Because our data is a list of dicts, we are going to use a KeyAdapter.

adapter = mezzala.KeyAdapter(       # `KeyAdapter` = datum['...']
    home_team='team1',
    away_team='team2',
    home_goals=['score', 'ft', 0],  # Get nested fields with lists of fields
    away_goals=['score', 'ft', 1],  # i.e. datum['score']['ft'][1]
)

# You'll never need to call the methods on an 
# adapter directly, but just to show that it 
# works as expected:
adapter.home_team(data[0])
'Hull City AFC'
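
To make the lookup concrete, here is a minimal sketch of how a list of keys can be resolved against a nested datum. The helper `get_nested` is hypothetical and is not part of mezzala; it only illustrates the access pattern described in the comments above (a string for a top-level field, a list for a nested one).

```python
from functools import reduce


def get_nested(datum, path):
    """Follow `path` into `datum`; a bare string is treated as a single key.

    Hypothetical helper for illustration only, not mezzala's implementation.
    """
    keys = [path] if isinstance(path, str) else path
    return reduce(lambda value, key: value[key], keys, datum)


# First match from the data shown earlier
match = {
    'date': '2016-08-13',
    'team1': 'Hull City AFC',
    'team2': 'Leicester City FC',
    'score': {'ft': [2, 1]},
}

home_team = get_nested(match, 'team1')              # match['team1']
away_goals = get_nested(match, ['score', 'ft', 1])  # match['score']['ft'][1]
```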

Once we have an adapter for our specific data source, we can fit the model:

model = mezzala.DixonColes(adapter=adapter)
model.fit(data)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), BaseRate(), HomeAdvantage()]), weight=UniformWeight()

Making predictions

By default, you only need to supply the home and away teams to get predictions. These should be supplied in the same format as the training data.

DixonColes has two methods for making predictions:

  • predict_one - for predicting a single match
  • predict - for predicting multiple matches

match_to_predict = {
    'team1': 'Manchester City FC',
    'team2': 'Swansea City FC',
}

scorelines = model.predict_one(match_to_predict)

scorelines[0:5]
[ScorelinePrediction(home_goals=0, away_goals=0, probability=0.023625049697587167),
 ScorelinePrediction(home_goals=0, away_goals=1, probability=0.012682094432376022),
 ScorelinePrediction(home_goals=0, away_goals=2, probability=0.00623268833779594),
 ScorelinePrediction(home_goals=0, away_goals=3, probability=0.0016251514235046444),
 ScorelinePrediction(home_goals=0, away_goals=4, probability=0.00031781436109636405)]

Each of these methods returns predictions in the form of ScorelinePredictions:

  • predict_one returns a list of ScorelinePredictions
  • predict returns a list of ScorelinePredictions for each predicted match (i.e. a list of lists)
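
A list of scoreline predictions can be summarised with ordinary Python. The sketch below uses a hypothetical `Scoreline` namedtuple as a stand-in for ScorelinePrediction, assuming the three fields shown in the output above are available as attributes. The probabilities are illustrative values, not model output.

```python
from collections import namedtuple

# Hypothetical stand-in with the same three fields as ScorelinePrediction
Scoreline = namedtuple('Scoreline', ['home_goals', 'away_goals', 'probability'])

# Illustrative values only; a fitted model returns many more scorelines
scorelines = [
    Scoreline(0, 0, 0.10),
    Scoreline(1, 0, 0.30),
    Scoreline(0, 1, 0.15),
    Scoreline(1, 1, 0.25),
    Scoreline(2, 1, 0.20),
]

# Single most likely scoreline
most_likely = max(scorelines, key=lambda s: s.probability)

# Expected number of home goals, weighting each scoreline by its probability
expected_home_goals = sum(s.home_goals * s.probability for s in scorelines)
```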

However, it can sometimes be more useful to have predictions in the form of match outcomes. Mezzala exposes the scorelines_to_outcomes function for this purpose:

mezzala.scorelines_to_outcomes(scorelines)
{Outcomes('Home win'): OutcomePrediction(outcome=Outcomes('Home win'), probability=0.8255103334702835),
 Outcomes('Draw'): OutcomePrediction(outcome=Outcomes('Draw'), probability=0.11615659853961693),
 Outcomes('Away win'): OutcomePrediction(outcome=Outcomes('Away win'), probability=0.058333067990098304)}
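
Conceptually, this conversion sums scoreline probabilities by result. The following is a rough sketch of that idea, not mezzala's implementation: `Scoreline` and `outcome_probabilities` are hypothetical names, the plain string keys stand in for the Outcomes values, and the probabilities are illustrative.

```python
from collections import namedtuple

# Hypothetical stand-in with the same three fields as ScorelinePrediction
Scoreline = namedtuple('Scoreline', ['home_goals', 'away_goals', 'probability'])


def outcome_probabilities(scorelines):
    """Sum scoreline probabilities into home win, draw and away win totals."""
    totals = {'Home win': 0.0, 'Draw': 0.0, 'Away win': 0.0}
    for s in scorelines:
        if s.home_goals > s.away_goals:
            totals['Home win'] += s.probability
        elif s.home_goals == s.away_goals:
            totals['Draw'] += s.probability
        else:
            totals['Away win'] += s.probability
    return totals


# Illustrative values only, not model output
example = [
    Scoreline(0, 0, 0.10),
    Scoreline(1, 0, 0.30),
    Scoreline(0, 1, 0.15),
    Scoreline(1, 1, 0.25),
    Scoreline(2, 1, 0.20),
]

outcomes = outcome_probabilities(example)
```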

Extending the model

It's possible to fit more sophisticated models with mezzala, using weights and model blocks.

Weights

You can weight individual data points by passing a function (or any callable) as the weight argument of DixonColes:

mezzala.DixonColes(
    adapter=adapter,
    # By default, all data points are weighted equally,
    # which is equivalent to:
    weight=lambda x: 1
)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), BaseRate(), HomeAdvantage()]), weight=<function <lambda> at 0x123067488>

Mezzala also provides an ExponentialWeight for the purpose of time-discounting:

mezzala.DixonColes(
    adapter=adapter,
    weight=mezzala.ExponentialWeight(
        epsilon=-0.0065,               # Decay rate
        key=lambda x: x['days_ago']
    )
)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), BaseRate(), HomeAdvantage()]), weight=ExponentialWeight(epsilon=-0.0065, key=<function <lambda> at 0x122f938c8>)
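
Note that the openfootball data loaded earlier has no 'days_ago' field, so it would need to be added before fitting with this weight. The sketch below derives it from each match's 'date'. The helper names and the reference date are hypothetical, and the weight form exp(epsilon * days_ago) is an assumption that is consistent with the negative epsilon above; check the mezzala source for the exact definition.

```python
import datetime
import math


def add_days_ago(matches, today):
    """Return copies of `matches` with a 'days_ago' field derived from 'date'."""
    result = []
    for match in matches:
        played = datetime.date.fromisoformat(match['date'])
        result.append({**match, 'days_ago': (today - played).days})
    return result


def exponential_weight(days_ago, epsilon=-0.0065):
    # Assumed form of the time discount: exp(epsilon * days_ago)
    return math.exp(epsilon * days_ago)


# First match from the data shown earlier
matches = [
    {'date': '2016-08-13', 'team1': 'Hull City AFC', 'team2': 'Leicester City FC',
     'score': {'ft': [2, 1]}},
]

# Arbitrary reference date chosen for illustration
dated = add_days_ago(matches, today=datetime.date(2017, 5, 21))
weight = exponential_weight(dated[0]['days_ago'])
```

Under this assumed form, a match played on the reference date gets weight 1, and older matches get progressively smaller weights.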

Model blocks

Model "blocks" define the calculation and estimation of home and away goalscoring rates.

mezzala.DixonColes(
    adapter=adapter,
    # By default, team strength, base rate and home advantage
    # are estimated, which is equivalent to:
    blocks=[
        mezzala.blocks.HomeAdvantage(),
        mezzala.blocks.TeamStrength(),
        mezzala.blocks.BaseRate(),      # Adds "average goalscoring rate" as a distinct parameter
    ]
)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), HomeAdvantage(), BaseRate()]), weight=UniformWeight()

To add custom parameters (e.g. per-league home advantage), you need to add additional model blocks.
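
As a rough intuition for what blocks contribute, the sketch below uses one common log-linear parameterisation of Dixon-Coles-style scoring rates, in which each block adds a term on the log scale. This is an assumption for illustration, not mezzala's internal code: the function `goal_rates`, the sign convention for defence, the team names 'A' and 'B', and all parameter values are hypothetical.

```python
import math


def goal_rates(base_rate, home_advantage, offence, defence, home, away):
    """Return (home_rate, away_rate) from additive terms on the log scale.

    Assumed parameterisation: a higher defence value lowers the opponent's rate.
    """
    home_log_rate = base_rate + home_advantage + offence[home] - defence[away]
    away_log_rate = base_rate + offence[away] - defence[home]
    return math.exp(home_log_rate), math.exp(away_log_rate)


# Illustrative parameter values, not fitted estimates
offence = {'A': 0.2, 'B': -0.1}
defence = {'A': 0.1, 'B': 0.0}

home_rate, away_rate = goal_rates(
    base_rate=0.0,
    home_advantage=0.3,
    offence=offence,
    defence=defence,
    home='A',
    away='B',
)
```

In this picture, a custom block such as a per-league home advantage would contribute one more additive term to the home side's log rate.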
