Retrieve Sports data in Python

sportsdataverse-py

See CHANGELOG.md for details.

The goal of sportsdataverse-py is to provide the community with a Python package for working with sports data as a companion to the cfbfastR, hoopR, and wehoop R packages. Beyond making data aggregation and tidying easier, the package also supports benchmarking open-source expected points and win probability metrics for American football.

Supported leagues and data sources

| League | Module | Surfaces covered |
| --- | --- | --- |
| NBA | sportsdataverse.nba | ESPN (Site v2 + Web v3 + Core v2) + Fox Sports (Bifrost) |
| WNBA | sportsdataverse.wnba | ESPN |
| MBB (NCAA M) | sportsdataverse.mbb | ESPN + NCAA-only (rankings, recruits) + Fox Sports (Bifrost) |
| WBB (NCAA W) | sportsdataverse.wbb | ESPN + NCAA-only |
| CFB | sportsdataverse.cfb | ESPN + NCAA + football-only (QBR) + Fox Sports (Bifrost) + Yahoo Sports |
| NFL | sportsdataverse.nfl | ESPN + NFL.com API (api.nfl.com "Shield") + nflverse loaders (nflreadpy parity) + football-only (QBR) |
| MLB | sportsdataverse.mlb | ESPN + MLB Stats API (statsapi.mlb.com) + Baseball Savant / Statcast + Fox Sports (Bifrost) |
| NHL | sportsdataverse.nhl | api-web.nhle.com/v1/ (game-feed) + NHL EDGE (player tracking) + Stats REST + Records site + Fox Sports (Bifrost) |

Each league exports 150–340 public functions (ESPN wrappers + that league's native-API wrappers + dataset loaders + parsers); ~1,600 in total. Fox Sports adds fox_<league>_* Bifrost wrappers (pbp / boxscore / odds / roster / stats / standings / leaders) for nba, mbb, cfb, mlb, nhl; Yahoo Sports adds yahoo_cfb_* season-stats / scoreboard wrappers for college football.
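One way to explore a surface this large is to list a module's public callables by name prefix. The sketch below is illustrative only: public_callables is a hypothetical helper, not part of sportsdataverse, and it is demonstrated on the standard-library json module so that it runs without the package installed.

```python
import json


def public_callables(module, prefix=""):
    """Return sorted names of public callables in `module` that start with `prefix`."""
    return sorted(
        name
        for name in dir(module)
        if not name.startswith("_")
        and name.startswith(prefix)
        and callable(getattr(module, name))
    )


# Stand-in demo on a stdlib module (no network, no extra dependencies).
print(public_callables(json, prefix="dump"))

# With sportsdataverse installed, the same helper could be pointed at a league
# module, for example to list its Fox Sports wrappers:
#   import sportsdataverse.nba as nba
#   print(public_callables(nba, prefix="fox_nba_"))
```

The fox_nba_ prefix follows the fox_<league>_* naming pattern described above; the exact names returned depend on the installed release.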

Polars / pandas parser layer

Parser-backed wrappers return a tidy polars DataFrame by default (0.0.54+). Pass return_parsed=False for the raw Dict, or return_as_pandas=True for pandas. Wrappers without a registered parser return the raw Dict.

from sportsdataverse.nba import espn_nba_team_roster

df  = espn_nba_team_roster(team_id=13)                          # → polars (default)
raw = espn_nba_team_roster(team_id=13, return_parsed=False)     # → Dict
pdf = espn_nba_team_roster(team_id=13,
                            return_as_pandas=True)              # → pandas
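Because the polars default applies from 0.0.54 onward, a script that has to run against older installs may want to check the installed version first. The following is a minimal sketch using only the standard library; version_tuple and polars_is_default are hypothetical helper names, and the version parsing is deliberately simple (it keeps only the leading integer of each dotted component).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '0.0.59' (or '0.0.59.dev1', '0.0.59rc1') into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev1'
        parts.append(int(match.group()))
    return tuple(parts)


def polars_is_default(installed):
    """True when `installed` is at least 0.0.54, the release named above."""
    return version_tuple(installed) >= (0, 0, 54)


try:
    installed = version("sportsdataverse")
    print(installed, "polars default:", polars_is_default(installed))
except PackageNotFoundError:
    print("sportsdataverse is not installed in this environment")
```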

For the NHL and MLB sibling-API wrappers, compose the wrapper with its parser:

from sportsdataverse.nhl import nhl_web_pbp, parse_nhl_web_pbp
df = parse_nhl_web_pbp(nhl_web_pbp(2023030417))                 # 331-row polars frame
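If the wrapper-then-parser pattern is repeated often, it can be wrapped in a small helper. The sketch below is an assumption-laden illustration: fetch_parsed is a hypothetical name, and the stand-in wrapper, parser, and payload shape are invented so the example runs offline. They do not reflect the real NHL response format.

```python
def fetch_parsed(wrapper, parser, *args, **kwargs):
    """Call `wrapper` with the given arguments and pass its raw payload to `parser`."""
    raw = wrapper(*args, **kwargs)
    return parser(raw)


# Stand-ins for demonstration only; NOT the real nhl_web_pbp / parse_nhl_web_pbp.
def fake_wrapper(game_id):
    return {"id": game_id, "plays": [{"eventId": 1}, {"eventId": 2}]}


def fake_parser(payload):
    # Flatten the made-up payload into one row per play.
    return [dict(play, game_id=payload["id"]) for play in payload["plays"]]


rows = fetch_parsed(fake_wrapper, fake_parser, 2023030417)
print(len(rows))

# With sportsdataverse installed, the equivalent call would be:
#   fetch_parsed(nhl_web_pbp, parse_nhl_web_pbp, 2023030417)
```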

See py.sportsdataverse.org/docs/architecture/espn-cross-league and py.sportsdataverse.org/docs/parsers/index for the full architecture + parser registry.

Installation

The package metadata lives entirely in pyproject.toml (PEP 621 [project] table). There is no setup.py source-of-truth.

Standard install (pip)

pip install sportsdataverse

With optional extras (defined in [project.optional-dependencies] in pyproject.toml):

pip install "sportsdataverse[all]"      # everything below
pip install "sportsdataverse[models]"   # extra deps for the EPA / WP model code
pip install "sportsdataverse[tests]"    # adds pytest, mypy, ruff, etc.

Modern install (uv — recommended)

uv is the fast, drop-in package manager we use day to day.

# Add to a uv-managed project:
uv add sportsdataverse

# With extras:
uv add "sportsdataverse[all]"

# Or install the latest dev snapshot from GitHub:
uv add "sportsdataverse @ git+https://github.com/sportsdataverse/sportsdataverse-py"

Development install

For contributing or running the test suite:

git clone https://github.com/sportsdataverse/sportsdataverse-py.git
cd sportsdataverse-py

# uv (recommended) — fully resolved editable install with every extra:
uv pip install -e ".[all]"

# Plain pip works too if uv isn't available:
pip install -e ".[all]"

Note: once we add a PEP 735 [dependency-groups] block (currently the repo only ships PEP 621 [project.optional-dependencies]), uv sync --all-extras --all-groups will become the one-shot dev incantation. Until then, uv pip install -e ".[all]" is the equivalent path.

Run the test suite:

uv run pytest                       # offline tests only
SDV_PY_LIVE_TESTS=1 uv run pytest   # include live API tests (slower; hits ESPN / nflverse)
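For contributors writing new tests, an environment-variable gate of this kind can be sketched as follows. This is not the repository's actual mechanism, which is not shown here; it uses the standard-library unittest module, and live_tests_enabled and ExampleLiveTest are hypothetical names.

```python
import os
import unittest


def live_tests_enabled(environ=None):
    """Return True when SDV_PY_LIVE_TESTS is set to "1" in the given mapping."""
    environ = os.environ if environ is None else environ
    return environ.get("SDV_PY_LIVE_TESTS") == "1"


class ExampleLiveTest(unittest.TestCase):
    # The decorator is evaluated once, when the class is defined.
    @unittest.skipUnless(
        live_tests_enabled(), "set SDV_PY_LIVE_TESTS=1 to run live API tests"
    )
    def test_hits_network(self):
        # A real live test would call an API wrapper here.
        self.assertTrue(True)
```

pytest collects unittest.TestCase classes, so a test gated this way is reported as skipped in the offline run.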

For deeper dev-environment detail (lint, mypy, dep-bumping workflow), see CONTRIBUTING.md.

Notes

  • Python target: 3.9–3.14.
  • DataFrame engine: polars 1.x. Most loaders accept return_as_pandas=True if you prefer pandas.
  • NFL caching: loaders cache to memory by default. Set SDV_PY_NFL_CACHE=filesystem for cross-session reuse, or SDV_PY_NFL_CACHE=off to disable. See sportsdataverse.nfl.config.update_config() for runtime control.
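The cache mode from the last note can also be set from Python rather than the shell. The sketch below is a minimal illustration: set_nfl_cache_mode is a hypothetical helper, it accepts only the two values named above, and it assumes the variable should be set before sportsdataverse.nfl is imported (whether the library reads it at import time or at call time is not stated here).

```python
import os

# Only the two values named in the note above are accepted by this sketch.
KNOWN_MODES = ("filesystem", "off")


def set_nfl_cache_mode(mode):
    """Set SDV_PY_NFL_CACHE for the current process after validating `mode`."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown cache mode {mode!r}; expected one of {KNOWN_MODES}")
    os.environ["SDV_PY_NFL_CACHE"] = mode


set_nfl_cache_mode("filesystem")

# Assumption: import the NFL module only after the variable is set.
#   import sportsdataverse.nfl
```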

Examples and tutorials

Every public function ships a runnable Example: block in its docstring showing a quick-start call, common parameter combinations, and a one-line pipeline next-step. Regenerate the API reference locally with uv run python tools/codegen/generate.py --docs (then cd docs && yarn build to preview the Docusaurus site) or browse the live docs at py.sportsdataverse.org.
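The docstring examples can also be read from an interactive session. The sketch below pulls the text after an Example: marker using the standard-library inspect module. It assumes the marker appears literally in the docstring and returns everything after it; example_block and demo are hypothetical names used only for illustration.

```python
import inspect


def example_block(func, marker="Example:"):
    """Return the docstring text that follows `marker`, or None if it is absent."""
    doc = inspect.getdoc(func) or ""
    _, found, tail = doc.partition(marker)
    return tail.strip() if found else None


def demo(team_id):
    """Hypothetical wrapper used only to exercise the helper.

    Example:
        >>> demo(13)
        13
    """
    return team_id


print(example_block(demo))

# With sportsdataverse installed, the same helper could be applied to a wrapper:
#   from sportsdataverse.nba import espn_nba_team_roster
#   print(example_block(espn_nba_team_roster))
```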

For longer-form walkthroughs, see the intro/intermediate Jupyter notebooks under examples/notebooks/:

| Notebook | Covers |
| --- | --- |
| 01_quickstart.ipynb | Cross-sport intro: package layout, polars vs pandas, the download() retry layer |
| 02_cfb_intro.ipynb | College football PBP, schedule, teams, espn_cfb_play_participants |
| 03_nfl_intro.ipynb | NFL: nflreadpy parity surface, caching layer, current-season helpers |
| 04_nba_intro.ipynb | NBA: PBP, schedule, teams, game rosters, shot distribution |
| 05_wbb_wnba_intro.ipynb | Women's basketball: NCAA + WNBA parallels, multi-table stats |
| 06_mbb_intro.ipynb | Men's college basketball: PBP, schedule, conference standings |
| 07_nhl_intro.ipynb | NHL: PBP, schedule, teams, shot-event filter |

Companion packages

sportsdataverse-py is one corner of the broader SportsDataverse ecosystem. The R sister packages, including cfbfastR, hoopR, and wehoop, cover the same data sources with deeper sport-specific coverage.

The NFL submodule is a near drop-in replacement for nflreadpy; the broader nflverse ecosystem is the upstream data source for many of those loaders.

Our Authors

Citations

To cite the sportsdataverse-py Python package in publications, use:

BibTex Citation

@misc{gilani_sdvpy_2021,
  author = {Gilani, Saiem},
  title = {sportsdataverse-py: The SportsDataverse's Python Package for Sports Data.},
  url = {https://py.sportsdataverse.org},
  year = {2021}
}
