Internal functions for NCAA March Madness 2020

Project description

ncaa-march-madness-2020

The goal of ncaa-march-madness-2020 is to store the notebooks for this Kaggle competition. See the GitBook, which includes:

  • Baseline
  • XGBoost hyperparameter tuning
  • Target encoding
  • ID embedding
  • GBDT + LR
  • GBDT + LR k-fold
  • Variable importance
  • Linear vs. Tree linear?
  • Auto-encoder outlier detection
  • Python package description

We publish some internal functions as a package. Install it with:

pip install ncaa-march-madness-2020
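
One way to confirm that the install worked is to look up the installed distribution's version with the Python standard library. This is a minimal sketch: it relies only on the distribution name from the pip command above, and the helper name `installed_version` is hypothetical, not a function of the package.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution name taken from the pip command above.
DIST_NAME = "ncaa-march-madness-2020"


def installed_version(dist_name=DIST_NAME):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else f"{DIST_NAME} is not installed")
```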

How to use

All notebooks run in the analysis directory and save their data files in the input, output and data directories.

fs::dir_tree("analysis", recurse = TRUE, regexp = "ipynb")
#> analysis
#> +-- baseline.ipynb
#> +-- evaluate-features.ipynb
#> +-- gbdt_lr.ipynb
#> +-- gbdt_lr_CV.ipynb
#> +-- id2vec.ipynb
#> +-- linear-base-learner.ipynb
#> +-- march-madness-2020-ncaam-simple-lightgbm-on-kfold.ipynb
#> +-- Obtain_Answer.ipynb
#> +-- outliers.ipynb
#> +-- params_tuning.ipynb
#> +-- paris-madness.ipynb
#> +-- pkg_test.ipynb
#> \-- target-encoding.ipynb
fs::dir_tree(recurse = TRUE, regexp = "input|output|data")
#> .
#> +-- data
#> |   +-- feature_importances.csv
#> |   +-- id2vec.npy
#> |   +-- NCAA2020_Kenpom.csv
#> |   +-- outlier_df.csv
#> |   +-- submission_True.csv
#> |   +-- team_strength_embedding.csv
#> |   +-- Tourney_Reuslt.csv
#> |   \-- Tourney_Reuslt_inputs.csv
#> +-- input
#> |   +-- google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
#> |   |   +-- MDataFiles_Stage1
#> |   |   |   +-- Cities.csv
#> |   |   |   +-- Conferences.csv
#> |   |   |   +-- MConferenceTourneyGames.csv
#> |   |   |   +-- MGameCities.csv
#> |   |   |   +-- MMasseyOrdinals.csv
#> |   |   |   +-- MNCAATourneyCompactResults.csv
#> |   |   |   +-- MNCAATourneyDetailedResults.csv
#> |   |   |   +-- MNCAATourneySeedRoundSlots.csv
#> |   |   |   +-- MNCAATourneySeeds.csv
#> |   |   |   +-- MNCAATourneySlots.csv
#> |   |   |   +-- MRegularSeasonCompactResults.csv
#> |   |   |   +-- MRegularSeasonDetailedResults.csv
#> |   |   |   +-- MSeasons.csv
#> |   |   |   +-- MSecondaryTourneyCompactResults.csv
#> |   |   |   +-- MSecondaryTourneyTeams.csv
#> |   |   |   +-- MTeamCoaches.csv
#> |   |   |   +-- MTeamConferences.csv
#> |   |   |   +-- MTeams.csv
#> |   |   |   \-- MTeamSpellings.csv
#> |   |   +-- MEvents2015.csv
#> |   |   +-- MEvents2016.csv
#> |   |   +-- MEvents2017.csv
#> |   |   +-- MEvents2018.csv
#> |   |   +-- MEvents2019.csv
#> |   |   +-- MPlayers.csv
#> |   |   \-- MSampleSubmissionStage1_2020.csv
#> |   \-- google-cloud-ncaa-march-madness-2020-division-1-mens-tournament.zip
#> +-- large_data
#> \-- output
#>     \-- paris-submission.csv
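
A similar listing can be produced from Python with `pathlib`. The sketch below assumes the directory names shown above (analysis, input, output, data) and is run from the project root. The helper names `list_notebooks` and `list_data_files` are hypothetical and not part of the published package. Unlike the `regexp` filter above, it checks the three data directories by name, so large_data is not included.

```python
from pathlib import Path


def list_notebooks(analysis_dir="analysis"):
    """Return sorted relative paths of all .ipynb files under analysis_dir."""
    root = Path(analysis_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.ipynb")
        # Skip Jupyter's autosave copies.
        if ".ipynb_checkpoints" not in p.parts
    )


def list_data_files(dirs=("input", "output", "data")):
    """Map each existing directory to the sorted relative paths of its files."""
    listing = {}
    for name in dirs:
        root = Path(name)
        if root.is_dir():
            listing[name] = sorted(
                p.relative_to(root).as_posix()
                for p in root.rglob("*")
                if p.is_file()
            )
    return listing


if __name__ == "__main__":
    for notebook in list_notebooks():
        print(notebook)
    for name, files in list_data_files().items():
        print(f"{name}: {len(files)} file(s)")
```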

Download Data

Download the data with the Kaggle API command-line tool (https://github.com/Kaggle/kaggle-api):

kaggle competitions download -c google-cloud-ncaa-march-madness-2020-division-1-mens-tournament -p input
mkdir input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
unzip input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament.zip -d input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
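
The mkdir and unzip steps can also be done from Python with the standard `zipfile` module, which may be handy inside a notebook. This is a sketch under the assumption that the archive has already been downloaded to input by the kaggle command above. The function name `extract_competition_data` is hypothetical and not part of the published package.

```python
import zipfile
from pathlib import Path

# Competition slug taken from the kaggle command above.
COMPETITION = "google-cloud-ncaa-march-madness-2020-division-1-mens-tournament"


def extract_competition_data(input_dir="input", competition=COMPETITION):
    """Extract <input_dir>/<competition>.zip into <input_dir>/<competition>/."""
    archive = Path(input_dir) / f"{competition}.zip"
    target = Path(input_dir) / competition
    if not archive.is_file():
        raise FileNotFoundError(f"expected archive at {archive}; download it first")
    # Same role as the mkdir step; does not fail if the directory exists.
    target.mkdir(parents=True, exist_ok=True)
    # Same role as `unzip <archive> -d <target>`.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target


if __name__ == "__main__":
    print(extract_competition_data())
```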

Code of Conduct

Please note that the ncaa-march-madness-2020 project is released with a Contributor Code of Conduct.
By contributing to this project, you agree to abide by its terms.

License

Apache License (>= 2.0) © Jiaxiang Li; Jiatao Li; Zhipeng Liang; Yue Pan

Download files

Source Distribution

ncaa_march_madness_2020-0.0.2.tar.gz (5.3 kB)

Uploaded Source

Built Distribution

ncaa_march_madness_2020-0.0.2-py3-none-any.whl (5.3 kB)

Uploaded Python 3
