Run configuration management utils: combines configparser, argparse, and wandb.API

prefigure

Capabilities for archiving run settings and pulling configurations from previous runs. With just 3 lines of code 😎 : the import, the arg setup, & the wandb push.

Combines argparse, configparser, and wandb.API. WandB logging is done via pytorch_lightning's WandbLogger.

Install:

pip install prefigure

Instructions:

All your usual command line args (with the exception of --name and --training-dir) are now to be specified in a defaults.ini file -- see examples/ for an example.
A different .ini file can be specified via --config-file.

Versions 0.0.9 and later: A .gin file can instead be used for --config-file, in which case the system only runs gin and nothing else.

The option --wandb-config <url> pulls a previous run's config off wandb to override those defaults, where <url> is the URL of any one of your runs, e.g. --wandb-config='https://wandb.ai/drscotthawley/delete-me/runs/1m2gh3o1?workspace=user-drscotthawley' (i.e., whatever URL you grab from your browser window when looking at an individual run).

NOTE: --wandb-config can only pull from WandB runs that used prefigure, i.e., runs that have logged a "wandb config push".
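To make the URL handling concrete, here is a minimal sketch of how a run URL like the one above could be reduced to the "entity/project/run_id" path that the WandB public API accepts. The helper names wandb_run_path and fetch_run_config are made up for illustration, and this is not necessarily how prefigure does it internally; in particular, how prefigure lays out its pushed config inside the run is not shown here.

```python
from urllib.parse import urlparse


def wandb_run_path(url: str) -> str:
    """Turn a WandB run URL into an 'entity/project/run_id' path string."""
    # urlparse drops the '?workspace=...' query, leaving only the path part
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected layout: <entity>/<project>/runs/<run_id>
    if len(parts) < 4 or parts[2] != "runs":
        raise ValueError(f"not a WandB run URL: {url}")
    entity, project, _, run_id = parts[:4]
    return f"{entity}/{project}/{run_id}"


def fetch_run_config(url: str) -> dict:
    """Fetch whatever config the run logged (needs network access and a WandB login)."""
    import wandb  # imported lazily so the URL helper works without wandb installed
    run = wandb.Api().run(wandb_run_path(url))
    return dict(run.config)


url = "https://wandb.ai/drscotthawley/delete-me/runs/1m2gh3o1?workspace=user-drscotthawley"
print(wandb_run_path(url))
```

Only wandb_run_path runs offline; fetch_run_config is left uncalled here because it needs credentials.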

Any command line args you specify will override any settings from WandB and/or the .ini file.

The order of precedence is "command line args override WandB, which overrides the .ini file".
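The precedence rule can be pictured as three layered dictionaries. The sketch below uses made-up setting names and values and the standard library's ChainMap; it only illustrates the ordering and is not prefigure's own code.

```python
from collections import ChainMap

# Hypothetical settings from each source (names and values are illustrative only)
ini_defaults = {"lr": 1e-4, "batch_size": 8, "seed": 42}   # from the .ini file
wandb_config = {"lr": 3e-4, "batch_size": 16}               # pulled via --wandb-config
cli_args = {"lr": 1e-3}                                     # typed on the command line

# ChainMap looks keys up left to right, so earlier maps win:
# command line first, then WandB, then the .ini defaults.
merged = dict(ChainMap(cli_args, wandb_config, ini_defaults))

# lr comes from the command line, batch_size from WandB, seed from the .ini
print(merged)
```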

1st line to add

In your run/training code, add this near the top:

from prefigure import get_all_args, push_wandb_config

2nd line to add

Near the top of your main(), add this:

args = get_all_args()

Further down in your code, comment out (or delete) all your command-line arguments (e.g. argparse calls). If you want different command-line arguments, then add or change them in defaults.ini. The 'help' string for each one is provided via a comment on the line preceding your variable. See examples/defaults.ini for examples.
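For intuition about the "comment on the preceding line becomes the help string" convention, here is a rough sketch that builds an argparse parser from .ini text. The setting names, the helpers help_strings and parser_from_ini, and the choice to leave every value as a string are assumptions for this example; prefigure's real parsing may differ.

```python
import argparse
import configparser

# Hypothetical defaults.ini contents, for illustration only
INI_TEXT = """\
[DEFAULTS]
# learning rate for the optimizer
lr = 1e-4

# number of examples per batch
batch_size = 8
"""


def help_strings(text: str) -> dict:
    """Map each key to the comment on the line directly above it."""
    helps, last_comment = {}, ""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            last_comment = stripped.lstrip("#").strip()
        elif "=" in stripped:
            key = stripped.split("=", 1)[0].strip().lower()
            helps[key] = last_comment
            last_comment = ""
        else:
            last_comment = ""  # blank lines and section headers reset the comment
    return helps


def parser_from_ini(text: str) -> argparse.ArgumentParser:
    """Create one --flag per .ini key, using the .ini value as the default."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    helps = help_strings(text)
    parser = argparse.ArgumentParser()
    for key, value in cfg["DEFAULTS"].items():
        flag = "--" + key.replace("_", "-")
        parser.add_argument(flag, default=value, help=helps.get(key, ""))
    return parser


# Values stay strings in this sketch; no type inference is attempted
args = parser_from_ini(INI_TEXT).parse_args(["--batch-size", "32"])
print(args.lr, args.batch_size)
```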

3rd line to add

Then, right after you define the wandb logger, run:

push_wandb_config(wandb_logger, args)

(Optional:) 4th & 5th lines to add: OFC

Starting with prefigure v0.0.8, there is an On-the-Fly Control (OFC, pronounced like what you say when you realize you forgot to set a variable properly). This tracks any changes to arguments listed as "steerable" by logging to a separate file (by default ofc.ini) and updates those args dynamically when changes to that file are made. It can also (optionally) log those changes, and when they occur, to WandB; see sample usage below.

from prefigure import OFC
...
ofc = OFC(args, steerables=vars(args).keys())  # allow all args to be steerable

Or fancier: with the Gradio GUI, allowing OFC steering only for certain variables (by default all are steerable), and launching only one GUI for a DDP PyTorch Lightning process:

ofc = OFC(args, gui=(trainer.global_rank==0), steerables=['lr','demo_every','demo_steps', 'num_demos','checkpoint_every']) 

If the GUI is enabled, you get a Gradio URL, which is also pushed to wandb (as "Media"). By default this URL is on localhost; however, if the environment variables OFC_USERNAME and OFC_PASSWORD are set, then a temporary public Gradio URL is obtained. (Since these temporary public URLs expire after 72 hours, we re-launch the GUI every 71 hours and update the link on WandB.)

Also, if you set sliders=True when calling OFC(), the float and int variables will get sliders (with max & min guessed at from the arg values). Otherwise, the default is that all variables (except bool types) are expressed via text fields.
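The core idea of steering via a file can be sketched in a few lines. The SteerFile class below is a hypothetical stand-in, not prefigure's OFC: it writes the steerable args to an .ini file, re-reads that file on each update() call, and returns a dict of whatever changed. The section name STEERABLES is made up, and only int and float values are handled.

```python
import argparse
import configparser
import os
import tempfile


class SteerFile:
    """Hypothetical illustration of file-based steering (not prefigure's OFC)."""

    def __init__(self, args, steerables, path):
        self.args, self.steerables, self.path = args, list(steerables), path
        # Log the current values so the user has something to edit
        cfg = configparser.ConfigParser()
        cfg["STEERABLES"] = {k: str(getattr(args, k)) for k in self.steerables}
        with open(self.path, "w") as f:
            cfg.write(f)

    def update(self) -> dict:
        """Re-read the file, apply any changed values to args, return the changes."""
        cfg = configparser.ConfigParser()
        cfg.read(self.path)
        changes = {}
        for key in self.steerables:
            old = getattr(self.args, key)
            new = type(old)(cfg["STEERABLES"][key])  # cast back to int or float
            if new != old:
                setattr(self.args, key, new)
                changes[key] = new
        return changes


args = argparse.Namespace(lr=1e-4, demo_every=1000, name="run")
path = os.path.join(tempfile.mkdtemp(), "ofc.ini")
steer = SteerFile(args, ["lr", "demo_every"], path)

# Simulate a user editing the file while training is running
cfg = configparser.ConfigParser()
cfg.read(path)
cfg["STEERABLES"]["lr"] = "0.0005"
with open(path, "w") as f:
    cfg.write(f)

changes = steer.update()
print(changes, args.lr)
```

A fuller version might compare the file's modification time before re-parsing; that is left out here to keep the sketch short.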

Sample usage:

Here's a rough outline of some PyTorch code. See examples/ for more.

import torch
import torch.utils.data as data
from prefigure import get_all_args, push_wandb_config, OFC
import pytorch_lightning as pl
import wandb

def main():

    # Config setup. Order of preference will be:
    #   1. Default settings are in defaults.ini file or whatever you specify via --config-file
    #   2. if --wandb-config is given, pull config from wandb to override defaults
    #   3. Any new command-line arguments override whatever was set earlier
    args = get_all_args()

    ofc = OFC(args)  # optional

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    torch.manual_seed(args.seed)

    train_set = SampleDataset([args.training_dir], args)
    train_dl = data.DataLoader(train_set, args.batch_size, shuffle=True,
                               num_workers=args.num_workers, persistent_workers=True, pin_memory=True)
    wandb_logger = pl.loggers.WandbLogger(project=args.name)

    # push config to wandb for archiving, but don't push --training-dir value to WandB
    push_wandb_config(wandb_logger, args, omit=['training_dir']) 

    demo_dl = data.DataLoader(train_set, args.num_demos, shuffle=True)
    ...
        #inside training loop

        # OFC usage (optional)
        if hasattr(args,'check_ofc_every') and (step > 0) and (step % args.check_ofc_every == 0):
            changes_dict = ofc.update()   # check for changes. NOTE: all "args" updated automatically
            if {} != changes_dict:        # other things to do with changes: log to wandb
                wandb.log({'args/'+k:v for k,v in changes_dict.items()}, step=step) 

        # For easy drop-in OFC capability, keep using args.XXXX for all variables
        if (step > 0) and (step % args.checkpoint_every == 0):
            lr = args.learning_rate   # for example. args.learning_rate gets updated by ofc.update()
            do_stuff(lr)              # your code here

Extra Tricks

Imports & Other File Formats

prefigure defaults to .ini files, but will also read .json and .gin files. It can also import files whose names are given as parameter values, provided those parameters are listed under a separate "imports" parameter, as in the following example:

$ cat examples/harmonai-tools.ini 
[DEFAULTS]
# model config file
model_config = ../../harmonai-tools/harmonai_tools/configs/model_configs/diffusion_autoencoders/seanet_32_32_diffae.json

# dataset config file 
dataset_config = ../../harmonai-tools/harmonai_tools/configs/dataset_configs/s3_wds_example.json

imports = model_config, dataset_config

In this case, both args.model_config and args.dataset_config will have their filename value string replaced by the dict(s) specified in the .json files given. If they were not listed under imports, then the filename value will remain and no import will occur.
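As a rough picture of that behavior, the sketch below swaps filename strings for loaded dicts, but only for keys named under imports. The function name resolve_imports, the file names, and the JSON contents are invented for the example; this is not prefigure's own code.

```python
import json
import os
import tempfile


def resolve_imports(settings: dict) -> dict:
    """Replace filename values with the parsed JSON for keys listed in 'imports'."""
    resolved = dict(settings)
    names = [n.strip() for n in settings.get("imports", "").split(",") if n.strip()]
    for name in names:
        with open(settings[name]) as f:
            resolved[name] = json.load(f)
    return resolved


# Write a small hypothetical model config to a temporary file
tmp = tempfile.mkdtemp()
model_path = os.path.join(tmp, "model.json")
with open(model_path, "w") as f:
    json.dump({"latent_dim": 32}, f)

settings = {
    "model_config": model_path,              # listed under imports, so it gets loaded
    "dataset_config": "not_imported.json",   # not listed, so the string is kept
    "imports": "model_config",
}
out = resolve_imports(settings)
print(out["model_config"], out["dataset_config"])
```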

Lightning

If you want to pass around the ofc object deep inside other libraries, e.g., PyTorch Lightning, I've had success overloading Lightning's Trainer object, e.g. trainer.ofc = ofc. Then do something like module.ofc.update() inside the training routine. For example, cf. my tweet about this.
