Allows a user to download and parse data from the National Collegiate Athletics Association (NCAA) and its member sports.

Project description

ncaa_stats_py

Basic Setup

How to Install

This package is available through the pip package manager and can be installed with one of the following commands in your terminal/shell:

pip install ncaa_stats_py

OR

python -m pip install ncaa_stats_py

If you are using a Linux/Mac instance, you may need to specify python3 when installing, as shown below:

python3 -m pip install ncaa_stats_py

Alternatively, ncaa_stats_py can be installed from its GitHub repository with one of the following pip commands:

pip install git+https://github.com/armstjc/ncaa_stats_py

OR

python -m pip install git+https://github.com/armstjc/ncaa_stats_py

OR

python3 -m pip install git+https://github.com/armstjc/ncaa_stats_py
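
After running one of the commands above, it can be useful to confirm that the package landed in the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of ncaa_stats_py.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("ncaa_stats_py")
    if found is None:
        print("ncaa_stats_py is not installed in this environment.")
    else:
        print(f"ncaa_stats_py {found} is installed.")
```

If the script reports that the package is missing even though pip succeeded, pip and the script are likely running under different Python interpreters, which is the situation the `python -m pip` and `python3 -m pip` forms are meant to avoid.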

How to Use

ncaa_stats_py sets itself apart by doing the following when retrieving data:

  1. Automatically caching any data that has already been parsed.
  2. Automatically enforcing a 5 second sleep timer on every HTML request, so that function calls from this package won't get your IP address banned (you do not need to add your own sleep timers when calling this package's functions in a loop).
  3. Automatically refreshing cached data that hasn't been refreshed in a while.

For example, the following code will work as-is. In the second loop, the code will load the team rosters much faster, because the data is cached on the device where you're running this code.

from timeit import default_timer as timer

from ncaa_stats_py.baseball import (
    get_baseball_team_roster,
    get_baseball_teams
)

start_time = timer()

# Loads in a table with every DI NCAA baseball team in the 2024 season.
# If this is the first time you run this script,
# it may take some time to repopulate the NCAA baseball team information data.

teams_df = get_baseball_teams(season=2024, level="I")

end_time = timer()

time_elapsed = end_time - start_time
print(f"Elapsed time: {time_elapsed:.3f} seconds.\n\n")

# Gets 5 random D1 teams from 2024
teams_df = teams_df.sample(5)
print(teams_df)
print()


# Let's send this to a list to make the loop slightly faster
team_ids_list = teams_df["team_id"].to_list()

# First loop
# If the data isn't cached, it should take 35-40 seconds to do this loop
start_time = timer()

for t_id in team_ids_list:
    print(f"On Team ID: {t_id}")
    df = get_baseball_team_roster(team_id=t_id)
    # print(df)

end_time = timer()

time_elapsed = end_time - start_time
print(f"Elapsed time: {time_elapsed:.3f} seconds.\n\n")

# Second loop
# Because the data has been parsed and cached,
# this shouldn't take that long to loop through
start_time = timer()

for t_id in team_ids_list:
    print(f"On Team ID: {t_id}")
    df = get_baseball_team_roster(team_id=t_id)
    # print(df)

end_time = timer()
time_elapsed = end_time - start_time
print(f"Elapsed time: {time_elapsed:.3f} seconds.\n\n")
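
To make the caching, throttling, and refresh behavior listed earlier more concrete, here is a minimal sketch of the general pattern. It is not the package's actual implementation: the function `cached_fetch`, its parameters, the JSON cache format, and the one-day refresh window are all assumptions made for illustration. Only the 5 second pause comes from the description above.

```python
import json
import time
from pathlib import Path
from typing import Callable


def cached_fetch(
    key: str,
    fetch: Callable[[], dict],
    cache_dir: Path,
    max_age_seconds: float = 86_400,  # assumed refresh window, not the package's
    delay_seconds: float = 5.0,  # mirrors the 5 second pause described above
) -> dict:
    """Return cached data for `key`, refetching only when missing or stale.

    Hypothetical helper: `fetch` stands in for whatever downloads and
    parses the remote page.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{key}.json"

    # Serve from the cache if the file exists and is fresh enough.
    if cache_file.exists():
        age = time.time() - cache_file.stat().st_mtime
        if age < max_age_seconds:
            return json.loads(cache_file.read_text(encoding="utf-8"))

    # Cache miss or stale entry: pause before contacting the remote source,
    # then store the fresh result for later calls.
    time.sleep(delay_seconds)
    data = fetch()
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Under this pattern the pause is paid only on a cache miss or a stale entry, which is consistent with the second loop in the example finishing much faster than the first.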

Dependencies

ncaa_stats_py depends on the following Python packages:

  • beautifulsoup4: To assist with parsing HTML data.
  • lxml: To work with beautifulsoup4 in assisting with parsing HTML data.
  • pandas: For DataFrame creation within package functions.
  • pytz: Used to attach timezone information to any date/datetime objects encountered by this package.
  • requests: Used to make HTTPS requests.
  • tqdm: Used to show progress bars for actions in functions that are known to take minutes to load.
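
pip normally installs these dependencies automatically. If an import error appears anyway, a quick check like the sketch below can show which ones are missing from the current environment. The helper `missing_dependencies` and the `DEPENDENCY_MODULES` mapping are hypothetical names for illustration. The mapping is needed because beautifulsoup4 is imported as `bs4`, while the other packages import under their own names.

```python
from importlib.util import find_spec

# Distribution name -> importable module name.
# Only beautifulsoup4 differs: it is imported as bs4.
DEPENDENCY_MODULES = {
    "beautifulsoup4": "bs4",
    "lxml": "lxml",
    "pandas": "pandas",
    "pytz": "pytz",
    "requests": "requests",
    "tqdm": "tqdm",
}


def missing_dependencies(modules: dict[str, str]) -> list[str]:
    """Return the distribution names whose module cannot be found."""
    return [dist for dist, mod in modules.items() if find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_dependencies(DEPENDENCY_MODULES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```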

License

This package is licensed under the MIT license. You can view the package's license here.

Documentation

More information about this package, its functions, and ways to use them can be found at https://armstjc.github.io/ncaa_stats_py/.
