Python package to clean up ETL functions using nba_scraper output as input

`nba_parser`

This will be a repository where I store all my scripts and tests for compiling and calculating NBA game data from play by play dataframe objects.

The main hook of nba_parser is the PbP class, which takes a play-by-play Pandas dataframe: either the direct output of my nba_scraper, or a dataframe created from the CSV output that nba_scraper saved to file.
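Because PbP accepts a dataframe read back from CSV, one way to avoid scraping the same game twice is to keep a small on-disk cache. The sketch below is only an illustration: `load_or_fetch`, the cache layout, and the `fetch` callback are hypothetical names that are not part of nba_parser or nba_scraper, and a stand-in function replaces the real scraper so the example runs offline.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_or_fetch(game_id, cache_dir, fetch):
    """Return the play-by-play dataframe for game_id, using a CSV cache.

    `fetch` is any callable that takes a game id and returns a dataframe.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    csv_path = cache_dir / f"{game_id}.csv"
    if csv_path.exists():
        # Cached copy found: read it back instead of fetching again.
        return pd.read_csv(csv_path)
    game_df = fetch(game_id)
    game_df.to_csv(csv_path, index=False)
    return game_df


def fake_fetch(game_id):
    # Stand-in for the scraper; real play-by-play has many more columns.
    return pd.DataFrame({"game_id": [game_id, game_id], "event_num": [1, 2]})


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_fetch(20700233, tmp, fake_fetch)   # fetches and writes the CSV
    second = load_or_fetch(20700233, tmp, fake_fetch)  # reads the CSV back
```

With the real scraper, the callback could be something like `lambda gid: ns.scrape_game([gid])`, and the returned dataframe would then be passed to `npar.PbP`. Note that a CSV round trip can change column dtypes, so it is worth checking the reloaded dataframe before relying on it.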

Player Stats

Player stats can be calculated from a play by play dataframe with just a few lines of code.

import nba_scraper.nba_scraper as ns
import nba_parser as npar

game_df = ns.scrape_game([20700233])
pbp = npar.PbP(game_df)
player_stats = pbp.playerbygamestats()

#can also derive single possessions for RAPM calculations

rapm_shifts = pbp.rapm_possessions()

`playerbygamestats()` produces a dataframe containing field goals made, field goals attempted, three-pointers made, three-pointers attempted, free throws made, free throws attempted, steals, turnovers, blocks, personal fouls, minutes played (toc), offensive rebounds, possessions, and defensive rebounds.

Team Stats

Team stats are calculated in much the same way as player stats.

import nba_scraper.nba_scraper as ns
import nba_parser as npar

game_df = ns.scrape_game([20700233])
pbp = npar.PbP(game_df)
team_stats = pbp.teambygamestats()

The team stats calculated are field goals made, field goals attempted, three-pointers made, three-pointers attempted, free throws made, free throws attempted, steals, turnovers, blocks, personal fouls, minutes played (toc), offensive rebounds, possessions, home team, winning team, fouls drawn, shots blocked, total points for, total points against, and defensive rebounds.

Team Totals

I've grouped together other stat calculations that work better with larger sample sizes into the TeamTotals class. This class takes a list of outputs from PbP.teambygamestats(), but it can take any list of dataframes with the same structure as that method's output. Here's an example of how it can work in conjunction with nba_scraper. I suggest writing the play-by-play returns to file and then importing them, because nba_scraper can time out when the NBA API is hit too many times.

import nba_scraper.nba_scraper as ns
import nba_parser as npar

tbg_dfs = []
for game_id in range(20700001, 20700010):
    game_df = ns.scrape_game([game_id])
    pbp = npar.PbP(game_df)
    team_stats = pbp.teambygamestats()
    tbg_dfs.append(team_stats)

team_totals = npar.TeamTotals(tbg_dfs)

#produce a dataframe of eFG%, TS%, TOV%, OREB%, FT/FGA, Opponent eFG%,
#Opponent TOV%, DREB%, Opponent FT/FGA, along with summing the other
#stats produced by the teambygamestats() method to allow further
#calculations

team_adv_stats = team_totals.team_advanced_stats()


#to calculate a RAPM regression for teams use this method

team_rapm_df = team_totals.team_rapm_results()
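For readers unfamiliar with the shooting and turnover rates listed in the comments above, the following sketch applies their commonly used definitions to summed per-game totals. It is not the package's implementation: the column names (`fgm`, `fga`, `tpm`, `ftm`, `fta`, `tov`, `points`) and the numbers are made up for illustration, and the 0.44 free-throw weight and the FTM/FGA form of the free-throw rate are the conventional choices, assumed here rather than taken from nba_parser.

```python
import pandas as pd

# Hypothetical per-game team rows; column names are illustrative only.
games = pd.DataFrame({
    "team": ["AAA", "AAA"],
    "fgm": [40, 35],
    "fga": [80, 80],
    "tpm": [10, 5],
    "ftm": [15, 20],
    "fta": [20, 30],
    "tov": [12, 8],
    "points": [105, 95],
})

# Sum the counting stats over all games before computing any rates.
totals = games.groupby("team").sum()

# Shooting possessions: field goal attempts plus weighted free throw attempts.
shot_poss = totals["fga"] + 0.44 * totals["fta"]

adv = pd.DataFrame({
    "efg_pct": (totals["fgm"] + 0.5 * totals["tpm"]) / totals["fga"],
    "ts_pct": totals["points"] / (2 * shot_poss),
    "tov_pct": totals["tov"] / (shot_poss + totals["tov"]),
    "ft_per_fga": totals["ftm"] / totals["fga"],
})
```

Computing the rates from summed totals rather than averaging per-game rates weights each game by its volume, which is why these calculations suit larger samples.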

Player Totals

As with TeamTotals, I've grouped the player stat calculations that work better with a larger sample size into their own class. Most of the hooks are similar, except for the RAPM calculation, which is a static method: player RAPM shifts take much longer to calculate than team shifts, so it's best to precalculate them before attempting a RAPM regression.

import pandas as pd

import nba_scraper.nba_scraper as ns
import nba_parser as npar

pbg_dfs = []
pbp_objects = []
for game_id in range(20700001, 20700010):
    game_df = ns.scrape_game([game_id])
    pbp = npar.PbP(game_df)
    pbp_objects.append(pbp)
    player_stats = pbp.playerbygamestats()
    pbg_dfs.append(player_stats)

player_totals = npar.PlayerTotals(pbg_dfs)

#produce a dataframe of eFG%, TS%, TOV%, OREB%, AST%, DREB%,
#STL%, BLK%, USG%, along with summing the other
#stats produced by the playerbygamestats() method to allow further
#calculations

player_adv_stats = player_totals.player_advanced_stats()


#to calculate a RAPM regression for players first have to calculate
#RAPM possessions from the list of PbP objects we collected above

rapm_possession = pd.concat([x.rapm_possessions() for x in pbp_objects])

player_rapm_df = npar.PlayerTotals.player_rapm_results(rapm_possession)
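As background on what a RAPM regression does, here is a minimal sketch of the usual formulation: a ridge regression of points scored on a possession-by-player indicator matrix. This is an assumption about the general technique, not a description of how `player_rapm_results` is implemented. The function name `ridge_rapm`, the toy stints, and the penalty values are all hypothetical, and the sketch omits refinements such as an intercept or possession weighting.

```python
import numpy as np


def ridge_rapm(X, y, lam):
    """Closed-form ridge solution: (X'X + lam * I)^-1 X'y."""
    n_players = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_players)
    return np.linalg.solve(gram, X.T @ y)


# Toy data: one row per possession, one column per player.
# +1 means the player is on offense, -1 on defense, 0 off the floor.
X = np.array([
    [ 1,  1,  0, -1, -1,  0],
    [-1, -1,  0,  1,  1,  0],
    [ 1,  0,  1, -1,  0, -1],
    [ 0, -1, -1,  1,  0,  1],
    [ 0,  1,  1,  0, -1, -1],
], dtype=float)

# Points scored by the offense on each possession.
y = np.array([2.0, 0.0, 3.0, 1.0, 2.0])

coefs_light = ridge_rapm(X, y, lam=0.1)    # weak shrinkage
coefs_heavy = ridge_rapm(X, y, lam=100.0)  # strong shrinkage toward zero
```

The penalty `lam` keeps the estimates stable when players almost always share the floor; a larger value pulls every coefficient closer to zero.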

Release history

- 0.2.1 (Aug 25, 2020), this version
- 0.2 (Apr 1, 2020)
- 0.1 (Mar 30, 2020)

Download files

Source Distribution

nba_parser-0.2.1.tar.gz (15.1 kB), uploaded Aug 25, 2020 (Source)

File details

Details for the file nba_parser-0.2.1.tar.gz.

File metadata

Download URL: nba_parser-0.2.1.tar.gz
Upload date: Aug 25, 2020
Size: 15.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.1 CPython/3.8.5

File hashes

Hashes for nba_parser-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`892ea97952198e9fd44c03e057af67ab1e8e296b3779f032aacd32d6b40f45d4`
MD5	`2a4af7d94a99b642a1f57545e4755532`
BLAKE2b-256	`0a5c06c7f3626d85521e76cbff33454d4ab982221e3a122dd63715de9c594b16`
