Internal functions for NCAA March Madness 2020
Project description
ncaa-march-madness-2020
The ncaa-march-madness-2020 project stores the notebooks for this Kaggle competition; see the GitBook, which includes:
- Baseline
- XGBoost hyperparameter tuning
- Target encoding
- ID embedding
- GBDT + LR
- GBDT + LR k-fold
- Variable importance
- Linear vs. Tree linear?
- Auto-encoder for finding outliers
- Python package description
We publish our package, which contains some internal functions. Install it with:
pip install ncaa-march-madness-2020
How to use
All notebooks run in the analysis directory and save all data files in the input, output, and data directories.
fs::dir_tree("analysis", recurse = TRUE, regexp = "ipynb")
#> analysis
#> +-- baseline.ipynb
#> +-- evaluate-features.ipynb
#> +-- gbdt_lr.ipynb
#> +-- gbdt_lr_CV.ipynb
#> +-- id2vec.ipynb
#> +-- linear-base-learner.ipynb
#> +-- march-madness-2020-ncaam-simple-lightgbm-on-kfold.ipynb
#> +-- Obtain_Answer.ipynb
#> +-- outliers.ipynb
#> +-- params_tuning.ipynb
#> +-- paris-madness.ipynb
#> +-- pkg_test.ipynb
#> \-- target-encoding.ipynb
fs::dir_tree(recurse = TRUE, regexp = "input|output|data")
#> .
#> +-- data
#> |   +-- feature_importances.csv
#> |   +-- id2vec.npy
#> |   +-- NCAA2020_Kenpom.csv
#> |   +-- outlier_df.csv
#> |   +-- submission_True.csv
#> |   +-- team_strength_embedding.csv
#> |   +-- Tourney_Reuslt.csv
#> |   \-- Tourney_Reuslt_inputs.csv
#> +-- input
#> |   +-- google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
#> |   |   +-- MDataFiles_Stage1
#> |   |   |   +-- Cities.csv
#> |   |   |   +-- Conferences.csv
#> |   |   |   +-- MConferenceTourneyGames.csv
#> |   |   |   +-- MGameCities.csv
#> |   |   |   +-- MMasseyOrdinals.csv
#> |   |   |   +-- MNCAATourneyCompactResults.csv
#> |   |   |   +-- MNCAATourneyDetailedResults.csv
#> |   |   |   +-- MNCAATourneySeedRoundSlots.csv
#> |   |   |   +-- MNCAATourneySeeds.csv
#> |   |   |   +-- MNCAATourneySlots.csv
#> |   |   |   +-- MRegularSeasonCompactResults.csv
#> |   |   |   +-- MRegularSeasonDetailedResults.csv
#> |   |   |   +-- MSeasons.csv
#> |   |   |   +-- MSecondaryTourneyCompactResults.csv
#> |   |   |   +-- MSecondaryTourneyTeams.csv
#> |   |   |   +-- MTeamCoaches.csv
#> |   |   |   +-- MTeamConferences.csv
#> |   |   |   +-- MTeams.csv
#> |   |   |   \-- MTeamSpellings.csv
#> |   |   +-- MEvents2015.csv
#> |   |   +-- MEvents2016.csv
#> |   |   +-- MEvents2017.csv
#> |   |   +-- MEvents2018.csv
#> |   |   +-- MEvents2019.csv
#> |   |   +-- MPlayers.csv
#> |   |   \-- MSampleSubmissionStage1_2020.csv
#> |   \-- google-cloud-ncaa-march-madness-2020-division-1-mens-tournament.zip
#> +-- large_data
#> \-- output
#>     \-- paris-submission.csv
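The listings above are produced with the R fs package. For readers working only in Python, a rough equivalent can be sketched with the standard library. The helper names `find_files` and `missing_dirs` are hypothetical and are not part of the published package; the directory names are taken from the layout described above.

```python
from pathlib import Path


def find_files(root, suffix):
    """Return sorted POSIX paths (relative to root) of files ending with suffix."""
    root = Path(root)
    if not root.is_dir():
        # Nothing to list when the directory has not been created yet.
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.name.endswith(suffix)
    )


def missing_dirs(project_root, expected=("analysis", "input", "output", "data")):
    """Return the expected directory names that are absent under project_root."""
    project_root = Path(project_root)
    return [name for name in expected if not (project_root / name).is_dir()]
```

Calling `find_files("analysis", ".ipynb")` from the project root lists the notebooks, and `missing_dirs(".")` reports which of the expected directories still need to be created before running them.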
Download Data
Using the Kaggle API client (https://github.com/Kaggle/kaggle-api), run:
kaggle competitions download -c google-cloud-ncaa-march-madness-2020-division-1-mens-tournament -p input
mkdir input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
unzip input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament.zip -d input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
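On systems without the `mkdir` and `unzip` commands, the extraction step can be approximated in Python. This is a minimal sketch using only the standard library; the function names `extract_archive` and `csv_header` are hypothetical helpers, not functions from the published package, and the sketch assumes the archive has already been downloaded into `input`.

```python
import csv
import zipfile
from pathlib import Path

# Competition slug, as used in the kaggle command above.
COMPETITION = "google-cloud-ncaa-march-madness-2020-division-1-mens-tournament"


def extract_archive(input_dir="input", name=COMPETITION):
    """Unzip <input_dir>/<name>.zip into <input_dir>/<name>/ and return that directory."""
    input_dir = Path(input_dir)
    target = input_dir / name
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(input_dir / f"{name}.zip") as archive:
        archive.extractall(target)
    return target


def csv_header(path):
    """Return the column names from the first row of a CSV file."""
    # errors="replace" is an assumption to tolerate files that are not valid UTF-8.
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        return next(csv.reader(handle))
```

After `target = extract_archive()`, a quick sanity check is `csv_header(target / "MDataFiles_Stage1" / "MTeams.csv")`, which should return the column names of that file if the archive was extracted as shown in the directory listing.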
Code of Conduct
Please note that the ncaa-march-madness-2020 project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
License
Apache License (>= 2.0) © Jiaxiang Li; Jiatao Li; Zhipeng Liang; Yue Pan
Hashes for ncaa_march_madness_2020-0.0.2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 6d04ec3581b5f46f6f96b7442c1a1badaa481184056c20ffad9737d2f6d6b3d9
MD5 | 71ca6e39e0d523a609aa4d46acbb8d7a
BLAKE2b-256 | fa5ea489462809e8bca8e0168c6d05d57e1be578920c194e2e195403e538c5b9
Hashes for ncaa_march_madness_2020-0.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 7f504b5600342c4544b39019ab5af57da26bc26eb9a728c00c41fd10e9fe2658
MD5 | d474244710d9873c77fcaf0fd3d51cf2
BLAKE2b-256 | db9f6d3740703338b163669867c7a16905f1633610afde45ac2d44043772bc8c