Internal functions for NCAA March Madness 2020

ncaa-march-madness-2020

The goal of ncaa-march-madness-2020 is to store the notebooks for this Kaggle competition; see the GitBook, which includes:

  • Baseline
  • XGBoost hyperparameter tuning
  • Target encoding
  • ID embedding
  • GBDT + LR
  • GBDT + LR k-fold
  • Variable importance
  • Linear vs. Tree linear?
  • Auto-encoder for outlier detection
  • Python package description

We publish our package with some internal functions. Install it with:

pip install ncaa-march-madness-2020
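
To confirm that the install worked, a minimal sketch using the standard library's importlib.metadata (Python 3.8+) can look up the installed distribution by the name used in the pip command above. The helper name installed_version is hypothetical and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution name taken from the pip command above.
    print(installed_version("ncaa-march-madness-2020"))
```

The lookup uses the distribution name rather than an import name, because the import name of the package's modules is not stated here.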

How to use

All notebooks live in the analysis directory and save their data files in the input, output and data directories.

fs::dir_tree("analysis", recurse = TRUE, regexp = "ipynb")
#> analysis
#> +-- baseline.ipynb
#> +-- evaluate-features.ipynb
#> +-- gbdt_lr.ipynb
#> +-- gbdt_lr_CV.ipynb
#> +-- id2vec.ipynb
#> +-- linear-base-learner.ipynb
#> +-- march-madness-2020-ncaam-simple-lightgbm-on-kfold.ipynb
#> +-- Obtain_Answer.ipynb
#> +-- outliers.ipynb
#> +-- params_tuning.ipynb
#> +-- paris-madness.ipynb
#> +-- pkg_test.ipynb
#> \-- target-encoding.ipynb
fs::dir_tree(recurse = TRUE, regexp = "input|output|data")
#> .
#> +-- data
#> |   +-- feature_importances.csv
#> |   +-- id2vec.npy
#> |   +-- NCAA2020_Kenpom.csv
#> |   +-- outlier_df.csv
#> |   +-- submission_True.csv
#> |   +-- team_strength_embedding.csv
#> |   +-- Tourney_Reuslt.csv
#> |   \-- Tourney_Reuslt_inputs.csv
#> +-- input
#> |   +-- google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
#> |   |   +-- MDataFiles_Stage1
#> |   |   |   +-- Cities.csv
#> |   |   |   +-- Conferences.csv
#> |   |   |   +-- MConferenceTourneyGames.csv
#> |   |   |   +-- MGameCities.csv
#> |   |   |   +-- MMasseyOrdinals.csv
#> |   |   |   +-- MNCAATourneyCompactResults.csv
#> |   |   |   +-- MNCAATourneyDetailedResults.csv
#> |   |   |   +-- MNCAATourneySeedRoundSlots.csv
#> |   |   |   +-- MNCAATourneySeeds.csv
#> |   |   |   +-- MNCAATourneySlots.csv
#> |   |   |   +-- MRegularSeasonCompactResults.csv
#> |   |   |   +-- MRegularSeasonDetailedResults.csv
#> |   |   |   +-- MSeasons.csv
#> |   |   |   +-- MSecondaryTourneyCompactResults.csv
#> |   |   |   +-- MSecondaryTourneyTeams.csv
#> |   |   |   +-- MTeamCoaches.csv
#> |   |   |   +-- MTeamConferences.csv
#> |   |   |   +-- MTeams.csv
#> |   |   |   \-- MTeamSpellings.csv
#> |   |   +-- MEvents2015.csv
#> |   |   +-- MEvents2016.csv
#> |   |   +-- MEvents2017.csv
#> |   |   +-- MEvents2018.csv
#> |   |   +-- MEvents2019.csv
#> |   |   +-- MPlayers.csv
#> |   |   \-- MSampleSubmissionStage1_2020.csv
#> |   \-- google-cloud-ncaa-march-madness-2020-division-1-mens-tournament.zip
#> +-- large_data
#> \-- output
#>     \-- paris-submission.csv
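
The listings above come from R's fs::dir_tree. For readers working only in Python, a rough equivalent of the first listing can be sketched with pathlib. The function name list_notebooks is hypothetical, and skipping .ipynb_checkpoints folders is an added assumption, not something the original listing does explicitly.

```python
from pathlib import Path
from typing import List


def list_notebooks(root: str = "analysis") -> List[str]:
    """Return sorted relative paths of all .ipynb files under root."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        p.relative_to(base).as_posix()
        for p in base.rglob("*.ipynb")
        # Assumption: Jupyter checkpoint copies are not wanted in the listing.
        if ".ipynb_checkpoints" not in p.parts
    )


if __name__ == "__main__":
    for name in list_notebooks("analysis"):
        print(name)
```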

Download Data

Download the competition data with the Kaggle API (https://github.com/Kaggle/kaggle-api):

kaggle competitions download -c google-cloud-ncaa-march-madness-2020-division-1-mens-tournament -p input
mkdir input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
unzip input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament.zip -d input/google-cloud-ncaa-march-madness-2020-division-1-mens-tournament
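
Where unzip is not available, the extraction step can be approximated in Python with the standard zipfile module. This is a sketch under the directory layout shown above; the function name extract_competition_data is hypothetical, and the download itself is still assumed to be done with the kaggle command.

```python
import zipfile
from pathlib import Path

# Competition slug taken from the kaggle command above.
COMPETITION = "google-cloud-ncaa-march-madness-2020-division-1-mens-tournament"


def extract_competition_data(input_dir: str = "input", name: str = COMPETITION) -> Path:
    """Unzip <input_dir>/<name>.zip into <input_dir>/<name>/ and return that directory."""
    archive = Path(input_dir) / f"{name}.zip"
    target = Path(input_dir) / name
    # Mirrors the mkdir step; does nothing if the directory already exists.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```

Calling extract_competition_data() from the project root after the download should leave the CSV files under input/, matching the tree shown earlier.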

Code of Conduct

Please note that the ncaa-march-madness-2020 project is released with a Contributor Code of Conduct.
By contributing to this project, you agree to abide by its terms.

License

Apache License (>= 2.0) © Jiaxiang Li; Jiatao Li; Zhipeng Liang; Yue Pan
