SSAT: Statistical Sports Analysis Toolkit

Python 3.10+ · License: MIT · Code style: ruff

SSAT is a Python package that provides a collection of frequentist statistical models for analyzing and predicting sports match outcomes.

Key Features

  • Multiple Statistical Models:

    • Bradley-Terry Model: Paired comparison model for team rankings
    • TOOR (Team Offense-Offense Rating): Offensive performance analysis
    • GSSD (Goal Scoring Statistical Distribution): Goal distribution modeling
    • ZSD (Zero-Score Distribution): Special case handling for 0-0 outcomes
    • PRP (Possession-based Rating Process): Team rating based on possession metrics
    • Poisson Model: Classic goal-scoring probability distribution
  • Data Processing: Integrated with flashscore-scraper for automated data collection

  • Visualization: Comprehensive plotting utilities for model analysis

  • Model Comparison: Tools for comparing predictions across different models

Installation

pip install ssat

For full functionality including all optional dependencies:

pip install "ssat[all]"

Dependencies

  • Core: numpy, pandas, scipy
  • Optional:
    • Development: ipykernel, ipywidgets, jupyter
    • Visualization: matplotlib, seaborn
    • Data Collection: flashscore-scraper, requests, beautifulsoup4
    • Machine Learning: scikit-learn, statsmodels
    • Bayesian (planned): arviz, cmdstanpy

Quick Start

import pandas as pd
from ssat.frequentist import BradleyTerry, Poisson

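# Load the bundled sample data (path is relative to the repository root)
df = pd.read_pickle("ssat/data/sample_handball_data.pkl")
X = df[["home_team", "away_team"]]  # team pairings
Z = df[["home_goals", "away_goals"]]  # goals scored by each side
y = df["spread"]  # point spread per match

# Initialize models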
bt_model = BradleyTerry()
poisson_model = Poisson()

# Fit models
bt_model.fit(X, y, Z)
poisson_model.fit(X, y, Z)

# Make predictions
bt_predictions = bt_model.predict(X)
poisson_predictions = poisson_model.predict(X)

# Predict probabilities
bt_probas = bt_model.predict_proba(X, Z, point_spread=0, include_draw=True)
poisson_probas = poisson_model.predict_proba(X, Z, point_spread=0, include_draw=True)

Data Sources

Match data is collected using the flashscore-scraper package. The package includes sample handball data in ssat/data/sample_handball_data.pkl for testing and examples.

API Documentation

Base Model

All models inherit from BaseModel, which provides a common interface:

  • fit(X, y, Z): Fit the model to training data
  • predict(X): Predict match outcomes
  • predict_proba(X, Z, point_spread, include_draw): Predict outcome probabilities
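To make the shape of this interface concrete, here is a minimal sketch of a toy model that exposes the same three method names. It is not part of SSAT and does not inherit from BaseModel: the class name MeanSpreadBaseline, the rating rule, and the normal approximation are all hypothetical. The column names follow the Quick Start, and the meaning given to point_spread (a handicap line on the home-minus-away spread) and include_draw (a half-goal band around that line) is an assumption, not confirmed behaviour of the package.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm


class MeanSpreadBaseline:
    """Hypothetical toy model mirroring the fit/predict/predict_proba shape."""

    def fit(self, X, y, Z=None):
        # Z (goals) is accepted only for signature parity; this toy ignores it.
        spread = y.to_numpy(dtype=float)
        # Rate each team by its average spread, seen from its own side.
        as_home = pd.Series(spread, index=X["home_team"].to_numpy())
        as_away = pd.Series(-spread, index=X["away_team"].to_numpy())
        self.ratings_ = pd.concat([as_home, as_away]).groupby(level=0).mean()
        # Residual spread of the training fit, floored to avoid division by zero.
        self.sigma_ = max(float(np.std(spread - self.predict(X))), 1e-6)
        return self

    def predict(self, X):
        # Predicted spread = home rating minus away rating.
        home = self.ratings_.reindex(X["home_team"].to_numpy()).to_numpy()
        away = self.ratings_.reindex(X["away_team"].to_numpy()).to_numpy()
        return home - away

    def predict_proba(self, X, Z=None, point_spread=0.0, include_draw=True):
        mu = self.predict(X)
        half = 0.5 if include_draw else 0.0  # assumed half-goal draw band
        p_away = norm.cdf((point_spread - half - mu) / self.sigma_)
        p_home = norm.sf((point_spread + half - mu) / self.sigma_)
        cols = {"home": p_home, "away": p_away}
        if include_draw:
            cols = {"home": p_home, "draw": 1.0 - p_home - p_away, "away": p_away}
        return pd.DataFrame(cols, index=X.index)


# Hypothetical four-match data set.
X = pd.DataFrame({"home_team": ["A", "B", "C", "A"],
                  "away_team": ["B", "C", "A", "C"]})
y = pd.Series([3, 1, -2, 4])

model = MeanSpreadBaseline().fit(X, y)
probs = model.predict_proba(X, point_spread=0, include_draw=True)
```

Each row of probs sums to one. With include_draw=False the draw column is dropped and the band collapses to the line itself.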

Specific Models

Bradley-Terry Model

from ssat.frequentist import BradleyTerry

model = BradleyTerry()
model.fit(X, y, Z)

Implements paired comparison modeling for team strength estimation.
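For readers new to paired comparison models, the following is a minimal sketch of the textbook Bradley-Terry fit using the classic iterative (MM) update on win counts. It illustrates the general technique only; SSAT's BradleyTerry class takes the X, y, Z inputs shown above and may estimate strengths differently. The function names and the win matrix are hypothetical.

```python
import numpy as np


def fit_bradley_terry(wins, n_iter=200, tol=1e-10):
    """Estimate Bradley-Terry strengths with the classic MM update.

    wins[i, j] is the number of times team i beat team j.
    Returns strengths normalized to sum to 1.
    """
    n_teams = wins.shape[0]
    games = wins + wins.T          # games[i, j]: matches played between i and j
    total_wins = wins.sum(axis=1)  # total wins per team
    p = np.full(n_teams, 1.0 / n_teams)
    for _ in range(n_iter):
        # denom[i] = sum over j of games[i, j] / (p[i] + p[j])
        denom = (games / (p[:, None] + p[None, :])).sum(axis=1)
        p_new = total_wins / denom
        p_new /= p_new.sum()
        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break
    return p


def win_probability(p, i, j):
    """P(team i beats team j) under the Bradley-Terry model."""
    return float(p[i] / (p[i] + p[j]))


# Hypothetical win counts for three teams (row team beat column team).
wins = np.array([
    [0, 3, 4],
    [1, 0, 3],
    [0, 1, 0],
])
strengths = fit_bradley_terry(wins)
```

The update assumes every team has at least one win and one loss and that the schedule connects all teams; otherwise the maximum-likelihood strengths are not finite.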

Poisson Model

from ssat.frequentist import Poisson

model = Poisson()
model.fit(X, y, Z)

Models goal-scoring as a Poisson process.
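As an illustration of the idea, the sketch below turns two Poisson scoring rates into home/draw/away probabilities by summing over a grid of possible scorelines. It assumes the two goal counts are independent and estimates each rate as a sample mean (the Poisson maximum-likelihood estimate). The function name, the truncation at max_goals, and the goal counts are hypothetical, and the code is not taken from SSAT's Poisson class.

```python
import numpy as np
from scipy.stats import poisson


def outcome_probabilities(home_rate, away_rate, max_goals=80):
    """Home/draw/away probabilities for independent Poisson goal counts."""
    goals = np.arange(max_goals + 1)
    home_pmf = poisson.pmf(goals, home_rate)
    away_pmf = poisson.pmf(goals, away_rate)
    # joint[h, a] = P(home scores h and away scores a)
    joint = np.outer(home_pmf, away_pmf)
    return {
        "home": float(np.tril(joint, k=-1).sum()),  # h > a
        "draw": float(np.trace(joint)),             # h == a
        "away": float(np.triu(joint, k=1).sum()),   # h < a
    }


# Hypothetical handball scores; the sample mean is the Poisson rate estimate.
home_goals = np.array([30, 28, 31, 27])
away_goals = np.array([26, 28, 27, 27])
probs = outcome_probabilities(home_goals.mean(), away_goals.mean())
```

The grid is truncated at max_goals, so the limit should sit well above any plausible score for the sport in question.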

Additional model documentation is available in the wiki.

Development Roadmap

  1. Current Release (v0.0.1):

    • Frequentist models implementation
    • Basic data processing utilities
    • Example notebooks
  2. Upcoming Features:

    • Bayesian implementations using Stan
    • Enhanced visualization tools
    • Additional sport-specific models
    • Performance optimization
  3. Future Plans:

    • Real-time prediction updates
    • Web API integration
    • Additional sports support

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

git clone https://github.com/bjrnsa/ssat.git
cd ssat
pip install -e ".[all]"

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use SSAT in your research, please cite:

@software{ssat2025,
  author = {Aagaard, Bjørn},
  title = {SSAT: Statistical Sports Analysis Toolkit},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/bjrnsa/ssat}
}

Acknowledgments

  • Andrew Mack's "Statistical Sports Models in Excel" (ISBN: 978-1079013450)
  • Contributors and maintainers of dependent packages
