sports-features

A library for processing sports features over a dataframe containing sports data.

Dependencies :globe_with_meridians:

Python 3.11.6:

Raison D'être :thought_balloon:

sportsfeatures aims to process features relevant to predicting aspects of sporting games.

Architecture :triangular_ruler:

sportsfeatures is a functional library, meaning that each phase of feature extraction is handled by a separate function, and the data passes through them in sequence until the final output. It caches results where the processing is heavy (such as skill processing). The features it computes are as follows:

  1. Process the player and team skill levels using OpenSkill. This is an Elo-like rating system that gives a probability of a win or a loss.
  2. Compute the offensive efficiency of each team/player.
  3. Compute the time series values of the numeric features for each team/player over the various windows provided. This includes lag, count, sum, mean, median, var, std, min, max, skew, kurt, sem, rank.
  4. Compute the datetime features for any datetime columns.
  5. Remove the lookahead features.
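Steps 3 and 5 can be illustrated with plain pandas. The following is a minimal sketch, not the code sportsfeatures runs internally: the long-format layout, the `team_id` column and the `add_history_features` helper are all assumptions made for illustration. It computes a lag and two windowed statistics per team using only games that happened strictly before the current row, which is one way to keep lookahead out of the features.

```python
import pandas as pd

# Hypothetical long-format data: one row per team per game.
games = pd.DataFrame(
    {
        "dt": pd.to_datetime(
            ["2024-01-01", "2024-01-08", "2024-01-15"] * 2
        ),
        "team_id": ["a", "a", "a", "b", "b", "b"],
        "kicks": [10.0, 20.0, 30.0, 5.0, 15.0, 25.0],
    }
)


def add_history_features(df: pd.DataFrame, column: str, window: str) -> pd.DataFrame:
    """Add lag and rolling features built only from earlier games of the same team."""
    parts = []
    for _, group in df.sort_values("dt").groupby("team_id"):
        group = group.copy()
        series = group.set_index("dt")[column]
        # closed="left" excludes the current row, so the game being
        # predicted never contributes to its own features.
        rolling = series.rolling(window, closed="left")
        group[f"{column}_lag1"] = series.shift(1).to_numpy()
        group[f"{column}_mean"] = rolling.mean().to_numpy()
        group[f"{column}_max"] = rolling.max().to_numpy()
        parts.append(group)
    # Restore the original row order.
    return pd.concat(parts).sort_index()


features = add_history_features(games, "kicks", "365D")
print(features)
```

The first game of each team has no history, so its features are NaN. The remaining statistics in the list (median, var, skew and so on) could be added the same way from the `rolling` object.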

Installation :inbox_tray:

This is a Python package hosted on PyPI, so to install it simply run the following command:

pip install sportsfeatures

or install using this local repository:

python setup.py install --old-and-unmanageable

Usage example :eyes:

sportsfeatures is a library, so it is used entirely through code. It attempts to hide most of its complexity from the user, so only a few functions in its public API are relevant.

Generating Features

To generate features:

import datetime

import pandas as pd

from sportsfeatures.process import process
from sportsfeatures.identifier import Identifier
from sportsfeatures.entity_type import EntityType

df = ... # Your sports data
identifiers = [
    Identifier(EntityType.TEAM, "teams/0/id", ["teams/0/kicks"], "teams/0"),
    Identifier(EntityType.TEAM, "teams/1/id", ["teams/1/kicks"], "teams/1"),
]
df = process(df, identifiers, [datetime.timedelta(days=365), None], "dt")

This will produce a dataframe that contains the new sports-related features.
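For orientation, here is a sketch of an input dataframe that matches the column names used in the example above. The one-row-per-game layout, the team names and the kick counts are made up for illustration; the library's actual input requirements may differ.

```python
import pandas as pd

# Hypothetical input: one row per game, with a datetime column "dt"
# and per-team id and statistic columns named as in the identifiers.
df = pd.DataFrame(
    {
        "dt": pd.to_datetime(["2024-03-01", "2024-03-08"]),
        "teams/0/id": ["hawks", "lions"],
        "teams/0/kicks": [210, 198],
        "teams/1/id": ["lions", "tigers"],
        "teams/1/kicks": [187, 205],
    }
)
print(df.dtypes)
```

A frame like this could then be passed as `df` in the example above, with "dt" as the final argument.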

License :memo:

The project is available under the MIT License.
