Retrieve Sports data in Python
sportsdataverse-py 
See CHANGELOG.md for details.
The goal of sportsdataverse-py is to provide the community with a Python package for working with sports data, as a companion to the cfbfastR, hoopR, and wehoop R packages. Beyond making data aggregation and tidying easier, sportsdataverse-py also supports benchmarking open-source expected points and win probability metrics for American football.
Installation
The package metadata lives entirely in pyproject.toml
(PEP 621 [project] table). There is no setup.py source-of-truth.
Standard install (pip)
pip install sportsdataverse
With optional extras (defined in [project.optional-dependencies] in
pyproject.toml):
pip install "sportsdataverse[all]" # everything below
pip install "sportsdataverse[models]" # extra deps for the EPA / WP model code
pip install "sportsdataverse[tests]" # adds pytest, mypy, ruff, etc.
pip install "sportsdataverse[docs]" # adds sphinx + sphinx-markdown-builder for the doc build
Modern install (uv — recommended)
uv is the fast, drop-in package manager we use day to day.
# Add to a uv-managed project:
uv add sportsdataverse
# With extras:
uv add "sportsdataverse[all]"
# Or install the latest dev snapshot from GitHub:
uv add "sportsdataverse @ git+https://github.com/sportsdataverse/sportsdataverse-py"
Conda install
Once the conda-forge feedstock is published, the package will also be available via:
conda install -c conda-forge sportsdataverse
# or
mamba install -c conda-forge sportsdataverse
Until then, conda users can build a local package from this repo:
conda install conda-build conda-verify
conda build recipe/
conda install --use-local sportsdataverse
See recipe/README.md for the full conda workflow.
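Whichever installer you used, it can help to confirm what actually landed in the active environment. The snippet below is a minimal sketch that relies only on the standard library's importlib.metadata; the helper name describe_install is hypothetical and not part of sportsdataverse.

```python
from importlib import metadata


def describe_install(dist_name="sportsdataverse"):
    """Return the installed version and declared extras, or None if absent.

    describe_install is an illustrative helper, not a sportsdataverse API.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # "Provides-Extra" entries mirror the names declared under
    # [project.optional-dependencies] in pyproject.toml.
    extras = dist.metadata.get_all("Provides-Extra") or []
    return {"version": dist.version, "extras": sorted(extras)}


if __name__ == "__main__":
    info = describe_install()
    if info is None:
        print("sportsdataverse is not installed in this environment")
    else:
        print(f"sportsdataverse {info['version']}, extras: {info['extras']}")
```

The extras list reports what the distribution declares, not which extras were selected at install time.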
Development install
For contributing or running the test suite:
git clone https://github.com/sportsdataverse/sportsdataverse-py.git
cd sportsdataverse-py
# uv (recommended) — fully resolved editable install with every extra:
uv pip install -e ".[all]"
# Plain pip works too if uv isn't available:
pip install -e ".[all]"
Note: once we add a PEP 735 [dependency-groups] block (currently the repo only ships PEP 621 [project.optional-dependencies]), the command uv sync --all-extras --all-groups will become the one-shot dev incantation. Until then, uv pip install -e ".[all]" is the equivalent path.
Run the test suite:
uv run pytest # offline tests only
SDV_PY_LIVE_TESTS=1 uv run pytest # include live API tests (slower; hits ESPN / nflverse)
For deeper dev-environment detail (lint, mypy, dep-bumping workflow), see CONTRIBUTING.md.
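The SDV_PY_LIVE_TESTS switch above is an opt-in gate: live tests run only when the variable is set. How the repository's own test suite reads the variable is not shown here, so the following is a sketch of one common pattern. The helper name live_tests_enabled and the rule that only the value "1" enables live tests are assumptions.

```python
import os


def live_tests_enabled(environ=None):
    """Return True when SDV_PY_LIVE_TESTS is set to "1".

    Assumption: only the exact value "1" opts in. The real test suite
    may accept other truthy spellings.
    """
    env = os.environ if environ is None else environ
    return env.get("SDV_PY_LIVE_TESTS", "") == "1"


# In a test module, the gate could be combined with pytest's skipif marker:
#
#     import pytest
#
#     requires_live = pytest.mark.skipif(
#         not live_tests_enabled(),
#         reason="set SDV_PY_LIVE_TESTS=1 to run tests that hit live APIs",
#     )
#
#     @requires_live
#     def test_something_against_the_network():
#         ...
```

Keeping the check in one function means offline runs skip network-dependent tests without each test re-reading the environment.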
Notes
- Python target: 3.9–3.14.
- DataFrame engine: polars 1.x. Most loaders accept return_as_pandas=True if you prefer pandas.
- NFL caching: loaders cache to memory by default. Set SDV_PY_NFL_CACHE=filesystem for cross-session reuse, or SDV_PY_NFL_CACHE=off to disable. See sportsdataverse.nfl.config.update_config() for runtime control.
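The NFL cache mode can also be set from Python rather than the shell. The sketch below touches only os.environ and accepts only the two values named in the notes above. The helper set_nfl_cache_mode is hypothetical, and the assumption that the variable must be set before the NFL loaders are first imported is untested here; for changes at runtime, the update_config() function mentioned above is the documented route.

```python
import os

# Values named in the notes above. The in-memory default applies when the
# variable is unset; its explicit spelling is not documented here.
DOCUMENTED_MODES = ("filesystem", "off")


def set_nfl_cache_mode(mode=None, environ=None):
    """Set or clear SDV_PY_NFL_CACHE (illustrative helper, not a package API).

    mode=None removes the variable so the in-memory default applies.
    """
    env = os.environ if environ is None else environ
    if mode is None:
        env.pop("SDV_PY_NFL_CACHE", None)
        return None
    if mode not in DOCUMENTED_MODES:
        raise ValueError(f"expected one of {DOCUMENTED_MODES}, got {mode!r}")
    env["SDV_PY_NFL_CACHE"] = mode
    return mode


if __name__ == "__main__":
    # Assumption: set the variable before importing sportsdataverse so the
    # loaders see it when they initialise.
    set_nfl_cache_mode("filesystem")
    print(os.environ["SDV_PY_NFL_CACHE"])
```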
Examples and tutorials
Every public function ships a runnable Example: block in its docstring
showing a quick-start call, common parameter combinations, and a one-line
pipeline next-step. Render the API reference locally with
bash create_docs.sh or browse the live docs at
py.sportsdataverse.org.
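Because the examples live in docstrings, they can be read from a Python session without building the docs. The sketch below pulls the example portion of a docstring using the standard inspect module. The helper example_block and the stand-in function demo_loader are hypothetical, and the assumption that the block starts on a line beginning with "Example" may not match every docstring in the package.

```python
import inspect


def example_block(obj):
    """Return the docstring text from the first 'Example' line onward, or None."""
    doc = inspect.getdoc(obj)  # also normalises docstring indentation
    if not doc:
        return None
    lines = doc.splitlines()
    for i, line in enumerate(lines):
        # Matches "Example:" and "Examples:"; a prose line that happens to
        # start with the same word would also match.
        if line.strip().startswith("Example"):
            return "\n".join(lines[i:])
    return None


def demo_loader(season):
    """Hypothetical stand-in for a package function.

    Example:
        >>> demo_loader(2023)
        2023
    """
    return season


if __name__ == "__main__":
    print(example_block(demo_loader))
```

For interactive use, the built-in help() shows the same docstring in full.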
For longer-form walkthroughs, see the intro/intermediate Jupyter notebooks
under examples/notebooks/:
| Notebook | Covers |
|---|---|
| 01_quickstart.ipynb | Cross-sport intro: package layout, polars vs pandas, the download() retry layer |
| 02_cfb_intro.ipynb | College football: PBP, schedule, teams, espn_cfb_play_participants |
| 03_nfl_intro.ipynb | NFL: nflreadpy parity surface, caching layer, current-season helpers |
| 04_nba_intro.ipynb | NBA: PBP, schedule, teams, game rosters, shot distribution |
| 05_wbb_wnba_intro.ipynb | Women's basketball: NCAA + WNBA parallels, multi-table stats |
| 06_mbb_intro.ipynb | Men's college basketball: PBP, schedule, conference standings |
| 07_nhl_intro.ipynb | NHL: PBP, schedule, teams, shot-event filter |
Companion packages
sportsdataverse-py is one corner of the broader SportsDataverse
ecosystem. The R sister packages cover the same data sources with deeper
sport-specific coverage:
- wehoop — women's basketball (WNBA + NCAA)
- hoopR — men's basketball (NBA + NCAA)
- cfbfastR — college football
- baseballr — baseball (MLB + MiLB + NCAA)
- fastRhockey — hockey (NHL + WHL)
The NFL submodule is a near drop-in replacement for nflreadpy; the broader nflverse ecosystem is the upstream data source for many of those loaders.
Our Authors
Citations
To cite the sportsdataverse-py Python package in publications, use:
BibTeX Citation
@misc{gilani_sdvpy_2021,
  author = {Gilani, Saiem},
  title = {sportsdataverse-py: The SportsDataverse's Python Package for Sports Data.},
  url = {https://py.sportsdataverse.org},
  year = {2021}
}