Python package for loading and caching CSVs hosted on GitHub into pandas DataFrames

nfelo DCM

nfelo DCM is an abstraction layer for loading and caching NFL-related CSVs stored on the web. DCM stands for Dataframe-CSV Mapping. The goal of the DCM is to load fresh data into pandas DataFrames in a way that balances simplicity, efficiency, and performance.

import nfelodcm

# Load two DataFrames
db = nfelodcm.load(['pbp', 'games'])

# Access the PBP DataFrame
db['pbp']

Maps

Maps are config files that tell the DCM where data CSVs are located, how they should be retrieved, and what fields to pull. Each CSV has its own config in Maps/{table}.json, where parameters can be set for things like freshness SLAs, CSV parsing engines, iteration strategy, and assignments (mutations).

An important characteristic of these maps is that every field must be both 1) specified in the map and 2) typed. Fields not listed in the map are not loaded, and untyped fields raise an error.
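The two rules above can be illustrated with pandas directly. This is a minimal sketch, not the DCM's own loader: `EXAMPLE_MAP`, `EXAMPLE_CSV`, `validate_map`, and `load_with_map` are hypothetical names, and the sketch assumes the map is passed to `pandas.read_csv` through `usecols` and `dtype`.

```python
import io

import pandas as pd

# Hypothetical map: column name -> dtype, mirroring the "map" section of a config.
EXAMPLE_MAP = {"game_id": "object", "season": "int32", "week": "int32"}

# Stand-in CSV with an extra column ("note") that the map does not list.
EXAMPLE_CSV = "game_id,season,week,note\n2023_01_DET_KC,2023,1,opener\n"


def validate_map(col_map):
    """Raise if any field in the map has no dtype (an illustrative check)."""
    untyped = [col for col, dtype in col_map.items() if not dtype]
    if untyped:
        raise ValueError(f"untyped fields in map: {untyped}")


def load_with_map(csv_text, col_map, engine="c"):
    """Read only the mapped columns and cast them to the mapped dtypes."""
    validate_map(col_map)
    return pd.read_csv(
        io.StringIO(csv_text),
        usecols=list(col_map),  # unmapped columns are never loaded
        dtype=col_map,          # every loaded column gets an explicit dtype
        engine=engine,
    )


df = load_with_map(EXAMPLE_CSV, EXAMPLE_MAP)
```

In this sketch the `note` column is dropped because it is absent from the map, and a map entry with a missing dtype fails before any data is read.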

Here is a sample config:

{
  "name": "games",
  "description": "nflgamedata games",
  "download_url": "https://raw.githubusercontent.com/nflverse/nfldata/master/data/games.csv",
  "compression": null,
  "engine": "c",
  "freshness": {
    "type": "gh_commit",
    "gh_api_endpoint": "https://api.github.com/repos/nflverse/nfldata/commits",
    "gh_release_tag": null,
    "sla_seconds": 500
  },
  "iter": {
    "type": null,
    "start": null
  },
  "assignments": [
    "fastr_team_id_repl",
    "score_clean"
  ],
  "map": {
    "game_id": "object",
    "season": "int32",
    "week": "int32",
    ...
  }
}

Config Fields

Field	Description
`name`	Table identifier
`description`	Human-readable description
`download_url`	URL to fetch CSV (use `{0}` placeholder for season in iter tables)
`compression`	Compression type (`"gzip"`, `null`)
`engine`	Pandas CSV engine (`"c"`, `"python"`)
`freshness.type`	`"gh_release"` or `"gh_commit"`
`freshness.gh_api_endpoint`	GitHub API endpoint for freshness checks
`freshness.gh_release_tag`	Release tag for `gh_release` type
`freshness.sla_seconds`	Seconds before re-checking freshness
`iter.type`	`"season"` for multi-file tables, `null` for single file
`iter.start`	Starting year for season iteration
`iter.accept_partial`	Allow success if some season files fail
`assignments`	List of assignment function names to apply
`map`	Column name → dtype mapping
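For season-iterated tables, the `{0}` placeholder in `download_url` suggests plain string formatting. The sketch below assumes that behavior; `season_urls`, the `current_season` argument, and the example.com template are hypothetical and only illustrate how `iter.start` could bound the expansion.

```python
def season_urls(download_url, start, current_season):
    """Expand a `{0}` URL template into one (season, url) pair per season."""
    return [
        (season, download_url.format(season))
        for season in range(start, current_season + 1)
    ]


# Hypothetical template; real maps define their own download_url.
TEMPLATE = "https://example.com/pbp/play_by_play_{0}.csv"
urls = season_urls(TEMPLATE, start=1999, current_season=2001)
```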

Freshness

The DCM uses a two-tier freshness strategy:

1. SLA Check: If the last freshness check was within sla_seconds, skip the remote check entirely
2. Remote Check: Query GitHub API to compare remote timestamps against local state

For gh_release tables, freshness is determined by the updated_at timestamp of release assets. For gh_commit tables, freshness is based on the latest commit date.
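The two tiers can be sketched as a single decision function. This is an illustration under stated assumptions, not the DCM's implementation: `needs_download` and `fetch_remote_timestamp` are hypothetical names, the state keys follow the per-table state fields named in this README, and the remote lookup is injected as a callable so the example needs no network access.

```python
from datetime import datetime, timedelta, timezone


def needs_download(state, sla_seconds, fetch_remote_timestamp, now=None):
    """Return True when the local copy should be re-downloaded.

    state: dict with 'last_freshness_check' and 'last_local_update' datetimes
           (None when the table has never been checked or loaded).
    fetch_remote_timestamp: callable returning the remote datetime; it is only
           invoked once the SLA window has expired.
    """
    now = now or datetime.now(timezone.utc)
    last_check = state.get("last_freshness_check")
    last_update = state.get("last_local_update")

    # Never loaded: download unconditionally.
    if last_update is None:
        return True

    # Tier 1: SLA check. A recent check means no remote call at all.
    if last_check is not None and (now - last_check).total_seconds() < sla_seconds:
        return False

    # Tier 2: remote check. Compare the remote timestamp to local state.
    remote_ts = fetch_remote_timestamp()
    state["last_freshness_check"] = now
    return remote_ts > last_update


now = datetime(2026, 2, 7, 12, 0, tzinfo=timezone.utc)
state = {
    "last_freshness_check": now - timedelta(seconds=100),
    "last_local_update": now - timedelta(days=1),
}
calls = []


def fake_remote():
    """Stand-in for a GitHub API call; records that it was invoked."""
    calls.append(1)
    return now - timedelta(hours=1)


within_sla = needs_download(state, 500, fake_remote, now=now)
after_sla = needs_download(state, 500, fake_remote, now=now + timedelta(seconds=600))
```

With `sla_seconds` set to 500, the first call returns without touching the remote source, while the second call (600 seconds later) performs the remote comparison.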

Per-File Freshness (v0.2.1+)

For season-iterated tables (pbp, rosters, player_stats, etc.), the DCM tracks freshness per season. When an update is needed, only stale seasons are re-downloaded; cached seasons are read from Data/Parts/{table}/. This significantly reduces bandwidth for incremental updates.
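Selecting the stale seasons amounts to comparing two timestamp mappings. The sketch below is a hypothetical illustration: `stale_seasons` and the timestamp format are assumptions, and the actual layout of the per-season state file may differ.

```python
def stale_seasons(local_ts, remote_ts):
    """Return the seasons whose remote timestamp is newer than the cached one.

    Both arguments map season -> ISO 8601 UTC timestamp string. Strings in the
    same ISO format and timezone compare chronologically. A season missing
    from the local cache is always treated as stale.
    """
    return sorted(
        season
        for season, remote in remote_ts.items()
        if season not in local_ts or remote > local_ts[season]
    )


local = {1999: "2025-01-01T00:00:00Z", 2000: "2025-01-01T00:00:00Z"}
remote = {
    1999: "2025-01-01T00:00:00Z",  # unchanged, served from cache
    2000: "2025-06-01T00:00:00Z",  # updated remotely
    2001: "2025-06-01T00:00:00Z",  # not cached yet
}
to_download = stale_seasons(local, remote)
```

Only the seasons in `to_download` would be fetched again; the rest would be read from the per-season cache.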

Data Storage

Data/
  games.csv
  pbp.csv                # Combined table CSV
  Parts/
    pbp/
      1999.csv           # Per-season cache (iter tables only)
      2000.csv
      ...
State/
  Tables/
    games.json           # Per-table state (last_local_update, last_freshness_check)
    pbp.json
  Parts/
    pbp.json             # Per-season timestamps (iter tables only)
  Global/
    season_state.json    # Current NFL season state
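As an illustration of how the per-season cache could relate to the combined table CSV, the following sketch concatenates the files under Parts/{table}/ into {table}.csv. It assumes the combined file is a simple concatenation of the parts; `combine_parts` is a hypothetical helper, and a temporary directory stands in for Data/.

```python
import tempfile
from pathlib import Path

import pandas as pd


def combine_parts(data_dir, table):
    """Concatenate {data_dir}/Parts/{table}/*.csv into {data_dir}/{table}.csv."""
    part_files = sorted((Path(data_dir) / "Parts" / table).glob("*.csv"))
    combined = pd.concat((pd.read_csv(p) for p in part_files), ignore_index=True)
    combined.to_csv(Path(data_dir) / f"{table}.csv", index=False)
    return combined


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny stand-in for Data/Parts/pbp/ with two season files.
    parts = Path(tmp) / "Parts" / "pbp"
    parts.mkdir(parents=True)
    (parts / "1999.csv").write_text("season,play_id\n1999,1\n")
    (parts / "2000.csv").write_text("season,play_id\n2000,1\n2000,2\n")

    combined = combine_parts(tmp, "pbp")
    combined_exists = (Path(tmp) / "pbp.csv").exists()
```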

Assignments

Assignments are DataFrame transformations applied after data is pulled. They take a DataFrame as input and return a mutated DataFrame. Assignments are defined in Engine/Assignments/ and referenced by name in config files.

Common assignments include:

fastr_team_id_repl - Standardize team abbreviations
score_clean - Fix known data errors in game scores
penalty_formatting - Parse penalty descriptions
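The pattern described above (functions that take a DataFrame, return a DataFrame, and are looked up by name) can be sketched as a small registry. Everything here is hypothetical: the `example_*` functions, the `ASSIGNMENTS` registry, and `apply_assignments` are illustrative and do not reproduce the package's actual assignments.

```python
import pandas as pd


def example_team_id_repl(df):
    """Hypothetical assignment: map legacy team abbreviations to current ones."""
    repl = {"OAK": "LV", "SD": "LAC"}  # illustrative subset, not the package's table
    return df.assign(home_team=df["home_team"].replace(repl))


def example_score_total(df):
    """Hypothetical assignment: add a derived total-score column."""
    return df.assign(total=df["home_score"] + df["away_score"])


# Registry keyed by the names a config's "assignments" list would reference.
ASSIGNMENTS = {
    "example_team_id_repl": example_team_id_repl,
    "example_score_total": example_score_total,
}


def apply_assignments(df, names):
    """Apply each named assignment in order, failing on unknown names."""
    for name in names:
        if name not in ASSIGNMENTS:
            raise KeyError(f"unknown assignment: {name}")
        df = ASSIGNMENTS[name](df)
    return df


games = pd.DataFrame(
    {"home_team": ["OAK", "KC"], "home_score": [20, 27], "away_score": [17, 24]}
)
result = apply_assignments(games, ["example_team_id_repl", "example_score_total"])
```

Because each function returns a new DataFrame, assignments compose in the order they are listed in the config.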

GitHub Token (Optional)

To increase GitHub API rate limits from 60/hr to 5,000/hr, create a .env file in your working directory:

GITHUB_TOKEN=ghp_your_token_here

The token is used for freshness checks only, not for downloading CSVs. It is only needed when many tables are pulled in quick succession, so that freshness checks would otherwise exhaust the unauthenticated limit. In most use cases, the default rate limit is sufficient.
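A token is typically attached to GitHub API requests through the `Authorization` header. The sketch below shows one way this could look; `github_headers` is a hypothetical helper, and it assumes `GITHUB_TOKEN` has already been loaded from the .env file into the environment.

```python
import os


def github_headers(env=None):
    """Build request headers for GitHub API calls, adding auth when available."""
    env = os.environ if env is None else env
    headers = {"Accept": "application/vnd.github+json"}
    token = env.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests get the higher rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return headers


anonymous = github_headers({})
authenticated = github_headers({"GITHUB_TOKEN": "ghp_your_token_here"})
```

Without a token the headers carry no credentials, and requests fall back to the unauthenticated rate limit.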

API

import nfelodcm

# Load tables
db = nfelodcm.load(['pbp', 'games'])

# Get a single DataFrame
df = nfelodcm.get_df('pbp')

# Get table config
config = nfelodcm.get_map('games')

# List available tables
tables = nfelodcm.list_tables()

# Get current season state
season, week = nfelodcm.get_season_state('last_full_week')

Further Detailed Documentation

File	Description
`nfelodcm/Engine/Primatives/README.md`	Core architecture of the DCM data pipeline
`tests/README.md`	Test suite for the DCM

Release history

Version	Release date
0.2.1 (this version)	Feb 7, 2026
0.2.0	Feb 1, 2026
0.1.24	Sep 8, 2025
0.1.23	Aug 25, 2025
0.1.22	Aug 24, 2025
0.1.21	May 8, 2025
0.1.20	Apr 12, 2025
0.1.19	Apr 12, 2025
0.1.18	Apr 11, 2025
0.1.17	Feb 21, 2025
0.1.16	Feb 16, 2025
0.1.15	Jan 3, 2025
0.1.14	Oct 14, 2024
0.1.13	Sep 1, 2024
0.1.12	Aug 18, 2024
0.1.11	Aug 17, 2024
0.1.10	Aug 16, 2024
0.1.9	Aug 7, 2024
0.1.8	Jul 10, 2024
0.1.7	Jun 28, 2024
0.1.5	Feb 19, 2024
0.1.4	Feb 10, 2024
0.1.3	Dec 31, 2023
0.1.2	Dec 31, 2023
0.1.1	Dec 29, 2023
0.0.1	Dec 28, 2023
