Retrieve Sports data in Python

sportsdataverse-py

See CHANGELOG.md for details.

The goal of sportsdataverse-py is to provide the community with a Python package for working with sports data, as a companion to the cfbfastR, hoopR, and wehoop R packages. Beyond easing data aggregation and tidying, sportsdataverse-py also supports benchmarking open-source expected points and win probability metrics for American football.

Supported leagues and data sources

| League | Module | Surfaces covered |
| --- | --- | --- |
| NBA | sportsdataverse.nba | ESPN (Site v2 + Web v3 + Core v2) + Fox Sports (Bifrost) |
| WNBA | sportsdataverse.wnba | ESPN |
| MBB (NCAA M) | sportsdataverse.mbb | ESPN + NCAA-only (rankings, recruits) + Fox Sports (Bifrost) |
| WBB (NCAA W) | sportsdataverse.wbb | ESPN + NCAA-only |
| CFB | sportsdataverse.cfb | ESPN + NCAA + football-only (QBR) + Fox Sports (Bifrost) + Yahoo Sports |
| NFL | sportsdataverse.nfl | ESPN + NFL.com API (api.nfl.com "Shield") + nflverse loaders (nflreadpy parity) + football-only (QBR) |
| MLB | sportsdataverse.mlb | ESPN + MLB Stats API (statsapi.mlb.com) + Baseball Savant / Statcast (43-endpoint mlb_statcast_* surface) + Fox Sports (Bifrost) |
| NHL | sportsdataverse.nhl | api-web.nhle.com/v1/ (game-feed) + NHL EDGE (player tracking) + Stats REST + Records site + Fox Sports (Bifrost) |

Each league exports 150–340 public functions (ESPN wrappers + that league's native-API wrappers + dataset loaders + parsers); ~1,600 in total. Fox Sports adds fox_<league>_* Bifrost wrappers (pbp / boxscore / odds / roster / stats / standings / leaders) for nba, mbb, cfb, mlb, nhl; Yahoo Sports adds yahoo_cfb_* season-stats / scoreboard wrappers for college football.

Polars / pandas parser layer

Parser-backed wrappers return a tidy polars DataFrame by default (0.0.54+). Pass return_parsed=False for the raw Dict, or return_as_pandas=True for pandas. Wrappers without a registered parser return the raw Dict.

from sportsdataverse.nba import espn_nba_team_roster

df  = espn_nba_team_roster(team_id=13)                          # → polars (default)
raw = espn_nba_team_roster(team_id=13, return_parsed=False)     # → Dict
pdf = espn_nba_team_roster(team_id=13,
                            return_as_pandas=True)              # → pandas
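The return-mode rules described above can be pictured with a small, dependency-free sketch. This is not the package's implementation: `resolve_return` and `parse_roster` are hypothetical names, the payload is made up, and a plain list of dicts stands in for the polars frame. It only illustrates the documented behavior that a wrapper returns the raw dict when `return_parsed=False` or when no parser is registered.

```python
from typing import Any, Callable, Dict, List, Optional


def parse_roster(raw: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical parser: keep only the tidy columns from a raw payload."""
    return [{"id": a["id"], "name": a["name"]} for a in raw.get("athletes", [])]


def resolve_return(
    raw: Dict[str, Any],
    parser: Optional[Callable[[Dict[str, Any]], Any]] = None,
    return_parsed: bool = True,
) -> Any:
    """Return parsed output only when a parser exists and parsing is requested."""
    if parser is None or not return_parsed:
        return raw  # raw dict: no registered parser, or the caller opted out
    return parser(raw)


# Made-up payload for illustration; real ESPN payloads are much larger.
raw_payload = {"athletes": [{"id": 1, "name": "A. Player", "extra": "ignored"}]}

parsed = resolve_return(raw_payload, parser=parse_roster)
unparsed = resolve_return(raw_payload, parser=parse_roster, return_parsed=False)
no_parser = resolve_return(raw_payload)
```

In the real package the parsed branch yields a polars DataFrame, and `return_as_pandas=True` converts that result to pandas.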

For the NHL and MLB sibling-API wrappers, compose the wrapper with its parser:

from sportsdataverse.nhl import nhl_web_pbp, parse_nhl_web_pbp
df = parse_nhl_web_pbp(nhl_web_pbp(2023030417))                 # 331-row polars frame

See py.sportsdataverse.org/docs/architecture/espn-cross-league and py.sportsdataverse.org/docs/parsers/index for the full architecture + parser registry.

Installation

The package metadata lives entirely in pyproject.toml (PEP 621 [project] table). There is no setup.py source-of-truth.

Standard install (pip)

pip install sportsdataverse

With optional extras (defined in [project.optional-dependencies] in pyproject.toml):

pip install "sportsdataverse[all]"      # everything below
pip install "sportsdataverse[models]"   # extra deps for the EPA / WP model code
pip install "sportsdataverse[tests]"    # adds pytest, mypy, ruff, etc.
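After installing, you may want to confirm which release is present, since polars output by default is documented for 0.0.54 and later. The sketch below uses only the standard library. `version_tuple` and `polars_is_default` are illustrative helpers, not part of the package, and they assume plain numeric versions such as 0.0.66 (a pre-release suffix would need a real version parser).

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Convert a plain dotted release such as '0.0.66' into (0, 0, 66)."""
    return tuple(int(part) for part in text.split("."))


def polars_is_default(installed):
    """True when the release is at or above 0.0.54, per the README note."""
    return version_tuple(installed) >= (0, 0, 54)


try:
    installed = version("sportsdataverse")
    print(installed, "polars default:", polars_is_default(installed))
except PackageNotFoundError:
    print("sportsdataverse is not installed in this environment")
```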

Modern install (uv — recommended)

uv is the fast, drop-in package manager we use day to day.

# Add to a uv-managed project:
uv add sportsdataverse

# With extras:
uv add "sportsdataverse[all]"

# Or install the latest dev snapshot from GitHub:
uv add "sportsdataverse @ git+https://github.com/sportsdataverse/sportsdataverse-py"

Development install

For contributing or running the test suite:

git clone https://github.com/sportsdataverse/sportsdataverse-py.git
cd sportsdataverse-py

# uv (recommended) — fully resolved editable install with every extra:
uv pip install -e ".[all]"

# Plain pip works too if uv isn't available:
pip install -e ".[all]"

Note: once we add a PEP 735 [dependency-groups] block (currently the repo only ships PEP 621 [project.optional-dependencies]), uv sync --all-extras --all-groups will become the one-shot dev incantation. Until then, uv pip install -e ".[all]" is the equivalent path.

Run the test suite:

uv run pytest                       # offline tests only
SDV_PY_LIVE_TESTS=1 uv run pytest   # include live API tests (slower; hits ESPN / nflverse)
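If you write your own network-dependent tests, the same environment variable can gate them. The repository's own gating mechanism is not shown in this README, so the following is only a sketch using the standard library's unittest. `live_tests_enabled` and `LiveApiSmokeTest` are hypothetical names.

```python
import os
import unittest


def live_tests_enabled(environ=os.environ):
    """True when SDV_PY_LIVE_TESTS is set to '1', as in the command above."""
    return environ.get("SDV_PY_LIVE_TESTS") == "1"


@unittest.skipUnless(
    live_tests_enabled(), "set SDV_PY_LIVE_TESTS=1 to run live API tests"
)
class LiveApiSmokeTest(unittest.TestCase):
    def test_placeholder(self):
        # Hypothetical placeholder: a real live test would call a wrapper here.
        self.assertTrue(live_tests_enabled())
```

pytest collects `unittest.TestCase` classes, so a class like this is skipped in the default offline run and executed when the variable is set.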

For deeper dev-environment detail (lint, mypy, dep-bumping workflow), see CONTRIBUTING.md.

Notes

  • Python target: 3.9–3.14.
  • DataFrame engine: polars 1.x. Most loaders accept return_as_pandas=True if you prefer pandas.
  • NFL caching: loaders cache to memory by default. Set SDV_PY_NFL_CACHE=filesystem for cross-session reuse, or SDV_PY_NFL_CACHE=off to disable. See sportsdataverse.nfl.config.update_config() for runtime control.
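The cache mode can also be set from Python before the NFL loaders are used. This is a minimal sketch: `set_nfl_cache_mode` is a hypothetical helper, only the two values named above are accepted, and it is an assumption that the variable should be set before `sportsdataverse.nfl` is imported (the README does not say when it is read).

```python
import os

# Values taken from the note above; any other accepted values are not assumed.
DOCUMENTED_MODES = ("filesystem", "off")


def set_nfl_cache_mode(mode, environ=os.environ):
    """Set SDV_PY_NFL_CACHE to one of the documented modes."""
    if mode not in DOCUMENTED_MODES:
        raise ValueError(f"expected one of {DOCUMENTED_MODES}, got {mode!r}")
    environ["SDV_PY_NFL_CACHE"] = mode


set_nfl_cache_mode("filesystem")
# Assumption: import sportsdataverse.nfl only after this point, so the
# setting is visible whenever the package reads the variable.
```

For changes while a session is running, the README points to `sportsdataverse.nfl.config.update_config()` instead.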

Examples and tutorials

Every public function ships a runnable Example: block in its docstring showing a quick-start call, common parameter combinations, and a one-line pipeline next-step. Regenerate the API reference locally with uv run python tools/codegen/generate.py --docs (then cd docs && yarn build to preview the Docusaurus site) or browse the live docs at py.sportsdataverse.org.
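The docstring examples can also be read from an interactive session with `help()` or `inspect.getdoc`. The sketch below pulls out just the example portion. `demo_wrapper` is a hypothetical stand-in, and the exact docstring layout (a line reading `Example:`) is an assumption based on the description above.

```python
import inspect


def demo_wrapper(team_id):
    """Hypothetical wrapper used only for this illustration.

    Example:
        >>> demo_wrapper(team_id=13)
    """
    return {"team_id": team_id}


def example_block(func):
    """Return the docstring text from the 'Example:' line onward, or ''."""
    doc = inspect.getdoc(func) or ""
    lines = doc.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "Example:":
            return "\n".join(lines[index:])
    return ""


print(example_block(demo_wrapper))
```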

For longer-form walkthroughs, see the intro/intermediate Jupyter notebooks under examples/notebooks/:

| Notebook | Covers |
| --- | --- |
| 01_quickstart.ipynb | Cross-sport intro: package layout, polars vs pandas, the download() retry layer |
| 02_cfb_intro.ipynb | College football: PBP, schedule, teams, espn_cfb_play_participants |
| 03_nfl_intro.ipynb | NFL: nflreadpy parity surface, caching layer, current-season helpers |
| 04_nba_intro.ipynb | NBA: PBP, schedule, teams, game rosters, shot distribution |
| 05_wbb_wnba_intro.ipynb | Women's basketball: NCAA + WNBA parallels, multi-table stats |
| 06_mbb_intro.ipynb | Men's college basketball: PBP, schedule, conference standings |
| 07_nhl_intro.ipynb | NHL: PBP, schedule, teams, shot-event filter |

Companion packages

sportsdataverse-py is one corner of the broader SportsDataverse ecosystem. The R sister packages, including cfbfastR, hoopR, and wehoop, cover the same data sources with deeper sport-specific coverage.

The NFL submodule is a near drop-in replacement for nflreadpy; the broader nflverse ecosystem is the upstream data source for many of those loaders.

Our Authors

Citations

To cite the sportsdataverse-py Python package in publications, use:

BibTeX Citation

@misc{gilani_sdvpy_2021,
  author = {Gilani, Saiem},
  title = {sportsdataverse-py: The SportsDataverse's Python Package for Sports Data.},
  url = {https://py.sportsdataverse.org},
  year = {2021}
}
