Skip to main content

A CLI to tune NGBoost

Project description

ngboost-tuner

A CLI Tuner of NGBoost

Install

pip install ngboost-tuner

Build from source

# Pull the code
git clone git@github.com:ryan-wolbeck/ngboost-tuner.git
# Build the container and detach
docker-compose up --build -d
# Exec into the container
docker-compose exec tuner bash
# Run the tuner
python -m ngboost_tuner tune -i file.tsv

Example docker-compose.yml

version: '3.7' 
services:
  tuner:
    container_name: tuner
    build: .
    volumes:
      - .:/usr/src/app
    environment:
      - TARGET=target
      - ID=userid
      - TRAIN_COLUMNS=var1,var2
      - INPUT_FILE=/usr/src/app/file.tsv
      - TRAIN_FILE=/usr/src/data/train_dataset.csv
      - TEST_FILE=/usr/src/data/test_dataset.csv
      - VALIDATION_FILE=/usr/src/data/val_dataset.csv
      - SEPERATOR=,
      - COMPRESSION=gzip
      - LIGHTGBM=True
    restart: unless-stopped
    command: tail -f /dev/null

Run

ngboost_tuner tune 2> file.log

usage: ngboost_tuner tune [-h] [-i INPUT] [-s INPUT_FILE_SEPERATOR]
                          [-ct COMPRESSION_TYPE] [-tf TRAIN_FILE]
                          [-tef TEST_FILE] [-vf VALIDATION_FILE] [-l LIMIT]
                          [-id ID_KEY] [-t TARGET] [-c COLUMN]
                          [-ef EVALUATION_FRACTION] [-m MINIBATCH_FRAC]
                          [-d MAX_DEPTH_RANGE] [-n N_SEARCH_BOOSTERS]
                          [-nf FINAL_BOOSTERS] [-lgbm LIGHTGBM]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT, --input-file INPUT
                        Input file data; defaults to $INPUT_FILE if not set
  -s INPUT_FILE_SEPERATOR, --input-file-seperator INPUT_FILE_SEPERATOR
                        Input data file seperator, i.e. commas or tabs;
                        defaults to $SEPERATOR if not set
  -ct COMPRESSION_TYPE, --compression-type COMPRESSION_TYPE
                        Input data compression, i.e. gzip or None; defaults to
                        $COMPRESSION if not set
  -tf TRAIN_FILE, --train-file TRAIN_FILE
                        Input train data; defaults to $TRAIN_FILE if not set
  -tef TEST_FILE, --test-file TEST_FILE
                        Input test data; defaults to $TEST_FILE if not set
  -vf VALIDATION_FILE, --validation-file VALIDATION_FILE
                        Input validation data; defaults to $VALIDATION_FILE if
                        not set
  -l LIMIT, --limit LIMIT
                        Proportion of input tsv to use, .2 is 20 percent.
                        Default: 1.0 or all of input
  -id ID_KEY, --id-key ID_KEY
                        ID to consider for splits to prevent leakage. Default:
                        ID environment variable
  -t TARGET, --target TARGET
                        Target variable (predicted variable). Default value:
                        TARGET environment variable
  -c COLUMN, --column COLUMN
                        The full list of columns: Defaults to TRAIN_COLUMNS
                        environment variable
  -ef EVALUATION_FRACTION, --evaluation-fraction EVALUATION_FRACTION
                        Proportion of loadnums used for evaluation .2 is 20
                        percent of training leaving 80 percent train, 10
                        percent test, 10 percent validation. Default = .2
  -m MINIBATCH_FRAC, --minibatch-frac MINIBATCH_FRAC
                        Sample proportion for each boosting round during
                        hyperopt. Default = 1.0 or 100 percent
  -d MAX_DEPTH_RANGE, --max-depth-range MAX_DEPTH_RANGE
                        The range to test the max depth of the base learner.
                        Default 5 tests max_depth 2-5
  -n N_SEARCH_BOOSTERS, --n-search-boosters N_SEARCH_BOOSTERS
                        Number of n_estimators(booster) to use when searching.
                        Default = 20
  -nf FINAL_BOOSTERS, --final-boosters FINAL_BOOSTERS
                        Number of n_estimators(booster) to use to run the
                        final model. Default = 500
  -lgbm LIGHTGBM, --lightgbm LIGHTGBM
                        Set to true for lightgbm as base regresor. Default
                        $LIGHTGBM

Reference

[1] T. Duan, et al., NGBoost: Natural Gradient Boosting for Probabilistic Prediction (2019), ArXiv 1910.03225

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ngboost-tuner-0.0.2.tar.gz (3.1 kB view details)

Uploaded Source

File details

Details for the file ngboost-tuner-0.0.2.tar.gz.

File metadata

  • Download URL: ngboost-tuner-0.0.2.tar.gz
  • Upload date:
  • Size: 3.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.7.7

File hashes

Hashes for ngboost-tuner-0.0.2.tar.gz
Algorithm Hash digest
SHA256 6fde73214ac219f7997a618aa8ad1d1da9b1b5ba494178e039128e45ea35e621
MD5 153f096805e16287cb108290e6faffdd
BLAKE2b-256 2136bb5f303b37ff9034932a39ee0536eb9c1ac34ea937df6c2b37f01ce17b81

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page