A package for soccer modelling

Mezzala

Models for estimating football (soccer) team strength

Install

pip install mezzala

How to use

import mezzala

Fitting a Dixon-Coles team strength model:

First, we need to get some data:

import itertools
import json
import urllib.request


# Use 2016/17 Premier League data from the openfootball repo
url = 'https://raw.githubusercontent.com/openfootball/football.json/master/2016-17/en.1.json'


# Download and parse the JSON; the context manager closes the connection afterwards
with urllib.request.urlopen(url) as response:
    data_raw = json.loads(response.read())

# Reshape the data to just get the matches
data = list(itertools.chain(*[d['matches'] for d in data_raw['rounds']]))

data[0:3]
[{'date': '2016-08-13',
  'team1': 'Hull City AFC',
  'team2': 'Leicester City FC',
  'score': {'ft': [2, 1]}},
 {'date': '2016-08-13',
  'team1': 'Everton FC',
  'team2': 'Tottenham Hotspur FC',
  'score': {'ft': [1, 1]}},
 {'date': '2016-08-13',
  'team1': 'Crystal Palace FC',
  'team2': 'West Bromwich Albion FC',
  'score': {'ft': [0, 1]}}]

Fitting a model

To fit a model with mezzala, you need to create an "adapter". Adapters are used to connect a model to a data source.

Because our data is a list of dicts, we are going to use a KeyAdapter.

adapter = mezzala.KeyAdapter(       # `KeyAdapter` = datum['...']
    home_team='team1',
    away_team='team2',
    home_goals=['score', 'ft', 0],  # Get nested fields with lists of fields
    away_goals=['score', 'ft', 1],  # i.e. datum['score']['ft'][1]
)

# You'll never need to call the methods on an 
# adapter directly, but just to show that it 
# works as expected:
adapter.home_team(data[0])
'Hull City AFC'
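
To make the nested-field lookup concrete, here is a minimal sketch of the idea in plain Python. The `get_nested` helper is hypothetical and written only for illustration; it is not part of mezzala and may differ from how `KeyAdapter` is actually implemented.

```python
from functools import reduce


def get_nested(datum, key):
    """Look up `key` in `datum`.

    A list (or tuple) of keys is applied one level at a time,
    so ['score', 'ft', 0] means datum['score']['ft'][0].
    """
    if isinstance(key, (list, tuple)):
        return reduce(lambda value, k: value[k], key, datum)
    return datum[key]


# First match from the data shown earlier
match = {
    'date': '2016-08-13',
    'team1': 'Hull City AFC',
    'team2': 'Leicester City FC',
    'score': {'ft': [2, 1]},
}

home_team = get_nested(match, 'team1')
home_goals = get_nested(match, ['score', 'ft', 0])
away_goals = get_nested(match, ['score', 'ft', 1])
print(home_team, home_goals, away_goals)
```

A single string addresses a top-level field, while a list walks down through nested dicts and lists, which is why the full-time score can be addressed without reshaping the data first.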

Once we have an adapter for our specific data source, we can fit the model:

model = mezzala.DixonColes(adapter=adapter)
model.fit(data)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), BaseRate(), HomeAdvantage()]), weight=UniformWeight()

Making predictions

By default, you only need to supply the home and away teams to get predictions. These should be supplied in the same format as the training data.

DixonColes has two methods for making predictions:

  • predict_one - for predicting a single match
  • predict - for predicting multiple matches

match_to_predict = {
    'team1': 'Manchester City FC',
    'team2': 'Swansea City FC',
}

scorelines = model.predict_one(match_to_predict)

scorelines[0:5]
[ScorelinePrediction(home_goals=0, away_goals=0, probability=0.023625049697587167),
 ScorelinePrediction(home_goals=0, away_goals=1, probability=0.012682094432376022),
 ScorelinePrediction(home_goals=0, away_goals=2, probability=0.00623268833779594),
 ScorelinePrediction(home_goals=0, away_goals=3, probability=0.0016251514235046444),
 ScorelinePrediction(home_goals=0, away_goals=4, probability=0.00031781436109636405)]

Each of these methods returns predictions in the form of ScorelinePredictions.

  • predict_one returns a list of ScorelinePredictions
  • predict returns a list of ScorelinePredictions for each predicted match (i.e. a list of lists)
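
Because the predictions are plain lists, ordinary Python tools work on them. The sketch below picks the most likely scoreline from the five predictions printed above. It uses a `namedtuple` stand-in with the same field names as the printed output, so it runs without a fitted model; the real `ScorelinePrediction` class may differ in other respects.

```python
from collections import namedtuple

# Stand-in for mezzala's ScorelinePrediction (an assumption: only the
# three field names visible in the printed output are reproduced here)
ScorelinePrediction = namedtuple(
    'ScorelinePrediction', ['home_goals', 'away_goals', 'probability']
)

# The first five predictions shown above (not the full distribution)
scorelines = [
    ScorelinePrediction(0, 0, 0.023625049697587167),
    ScorelinePrediction(0, 1, 0.012682094432376022),
    ScorelinePrediction(0, 2, 0.00623268833779594),
    ScorelinePrediction(0, 3, 0.0016251514235046444),
    ScorelinePrediction(0, 4, 0.00031781436109636405),
]

# Most probable scoreline among those listed
most_likely = max(scorelines, key=lambda s: s.probability)
print(most_likely.home_goals, most_likely.away_goals)
```

For the output of predict, the same expression can be applied to each inner list in turn.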

However, it can sometimes be more useful to have predictions in the form of match outcomes. Mezzala exposes the scorelines_to_outcomes function for this purpose:

mezzala.scorelines_to_outcomes(scorelines)
{Outcomes('Home win'): OutcomePrediction(outcome=Outcomes('Home win'), probability=0.8255103334702835),
 Outcomes('Draw'): OutcomePrediction(outcome=Outcomes('Draw'), probability=0.11615659853961693),
 Outcomes('Away win'): OutcomePrediction(outcome=Outcomes('Away win'), probability=0.058333067990098304)}
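
Conceptually, converting scorelines to outcomes means summing scoreline probabilities by result. The sketch below shows that aggregation in plain Python. The `outcome_probabilities` function and the toy probabilities are made up for illustration; this is not the library's implementation, and it returns plain strings rather than `Outcomes` objects.

```python
from collections import namedtuple

# Stand-in for mezzala's ScorelinePrediction (field names taken from the output above)
ScorelinePrediction = namedtuple(
    'ScorelinePrediction', ['home_goals', 'away_goals', 'probability']
)


def outcome_probabilities(scorelines):
    """Sum scoreline probabilities into home win / draw / away win."""
    totals = {'Home win': 0.0, 'Draw': 0.0, 'Away win': 0.0}
    for s in scorelines:
        if s.home_goals > s.away_goals:
            totals['Home win'] += s.probability
        elif s.home_goals == s.away_goals:
            totals['Draw'] += s.probability
        else:
            totals['Away win'] += s.probability
    return totals


# Toy distribution over four scorelines (illustrative values, not model output)
toy_scorelines = [
    ScorelinePrediction(0, 0, 0.25),
    ScorelinePrediction(1, 0, 0.5),
    ScorelinePrediction(0, 1, 0.125),
    ScorelinePrediction(1, 1, 0.125),
]

outcomes = outcome_probabilities(toy_scorelines)
print(outcomes)
```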

Extending the model

It's possible to fit more sophisticated models with mezzala, using weights and model blocks.

Weights

You can weight individual data points by supplying a function (or any callable) as the weight argument of DixonColes:

mezzala.DixonColes(
    adapter=adapter,
    # By default, all data points are weighted equally,
    # which is equivalent to:
    weight=lambda x: 1
)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), BaseRate(), HomeAdvantage()]), weight=<function <lambda> at 0x123067488>

Mezzala also provides an ExponentialWeight for the purpose of time-discounting:

mezzala.DixonColes(
    adapter=adapter,
    weight=mezzala.ExponentialWeight(
        epsilon=-0.0065,               # Decay rate
        key=lambda x: x['days_ago']
    )
)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), BaseRate(), HomeAdvantage()]), weight=ExponentialWeight(epsilon=-0.0065, key=<function <lambda> at 0x122f938c8>)
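
Note that the example data shown earlier has a 'date' field but no 'days_ago' field, so the key above assumes that field has been added. The sketch below shows one way to derive it and what an exponential time discount looks like numerically. It assumes the weight takes the form exp(epsilon * days_ago), which is a common choice but is not confirmed here as mezzala's exact formula; `add_days_ago`, `exponential_weight`, and the reference date are hypothetical.

```python
import datetime
import math


def add_days_ago(matches, today):
    """Return copies of the matches with a 'days_ago' field added."""
    result = []
    for match in matches:
        match_date = datetime.date.fromisoformat(match['date'])
        result.append({**match, 'days_ago': (today - match_date).days})
    return result


def exponential_weight(days_ago, epsilon=-0.0065):
    # Assumed form of the discount: exp(epsilon * days_ago).
    # With a negative epsilon, older matches get smaller weights.
    return math.exp(epsilon * days_ago)


matches = [
    {'date': '2016-08-13', 'team1': 'Hull City AFC', 'team2': 'Leicester City FC'},
]

today = datetime.date(2017, 5, 21)  # hypothetical reference date
matches_with_age = add_days_ago(matches, today)

for match in matches_with_age:
    print(match['date'], match['days_ago'], exponential_weight(match['days_ago']))
```

Under this assumed form, a match played today has weight 1, and the weight shrinks smoothly towards 0 as matches get older.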

Model blocks

Model "blocks" define the calculation and estimation of home and away goalscoring rates.

mezzala.DixonColes(
    adapter=adapter,
    # By default, team strength, base rate and home advantage
    # are estimated:
    blocks=[
        mezzala.blocks.HomeAdvantage(),
        mezzala.blocks.TeamStrength(),
        mezzala.blocks.BaseRate(),      # Adds "average goalscoring rate" as a distinct parameter
    ]
)
DixonColes(adapter=KeyAdapter(home_goals=['score', 'ft', 0], away_goals=['score', 'ft', 1], home_team='team1', away_team='team2'), blocks=[TeamStrength(), HomeAdvantage(), BaseRate()]), weight=UniformWeight()
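
To illustrate what "goalscoring rates" feed into, here is a sketch of the scoreline probability in the standard Dixon-Coles formulation: two Poisson distributions with a correction for low-scoring results. All parameter values are made up, and the additive log-scale combination of base rate, home advantage, and team terms is an assumption for illustration, not a description of mezzala's internals.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def tau(home_goals, away_goals, home_rate, away_rate, rho):
    """Dixon-Coles adjustment for the four lowest-scoring results."""
    if home_goals == 0 and away_goals == 0:
        return 1 - home_rate * away_rate * rho
    if home_goals == 0 and away_goals == 1:
        return 1 + home_rate * rho
    if home_goals == 1 and away_goals == 0:
        return 1 + away_rate * rho
    if home_goals == 1 and away_goals == 1:
        return 1 - rho
    return 1.0


def scoreline_probability(home_goals, away_goals, home_rate, away_rate, rho):
    return (
        tau(home_goals, away_goals, home_rate, away_rate, rho)
        * poisson_pmf(home_goals, home_rate)
        * poisson_pmf(away_goals, away_rate)
    )


# Hypothetical log-scale parameters (illustrative values only)
base_rate = 0.1
home_advantage = 0.3
home_attack, home_defence = 0.2, -0.1
away_attack, away_defence = -0.1, 0.1
rho = -0.05

# Assumed additive combination on the log scale
home_rate = math.exp(base_rate + home_advantage + home_attack + away_defence)
away_rate = math.exp(base_rate + away_attack + home_defence)

max_goals = 15
total = sum(
    scoreline_probability(h, a, home_rate, away_rate, rho)
    for h in range(max_goals + 1)
    for a in range(max_goals + 1)
)
print(scoreline_probability(0, 0, home_rate, away_rate, rho), total)
```

In this picture, each block would contribute one or more terms to the log of the home and away rates, which is why extra effects can be modelled by adding blocks.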

To add custom parameters (e.g. per-league home advantage), you need to add additional model blocks.
