YAML-based configuration for reproducible Python experiments.

Installation

Install with pip:

pip install experiment-config

Usage

expfig.Config allows for straightforward hyperparameter selection and logging.

It reads hyperparameters from YAML files, the command line, and user inputs, and makes them available as both attributes and keys. It can be embedded in either a script or a class.
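To make "both attributes and keys" concrete, here is a minimal standard-library sketch of the access pattern. The AttrDict class is a hypothetical stand-in written for illustration; it is not expfig's implementation, and the keys used are only examples.

```python
class AttrDict(dict):
    """Hypothetical helper: a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dicts so chained access such as cfg.words.fizz works.
        return AttrDict(value) if isinstance(value, dict) else value


cfg = AttrDict({'n': 15, 'words': {'fizz': 'Fizz', 'buzz': 'Buzz'}})

# Attribute access and key access reach the same value.
same_value = cfg.words.fizz == cfg['words']['fizz']
print(cfg.n, cfg['n'], same_value)
```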

Quick Start

We will build a simple version of FizzBuzz that allows custom replacement of the words Fizz and Buzz.

A simple solution of FizzBuzz looks like this:

# examples/quick_start/fizz_buzz.py

class Solution:
    n = 15

    def fizzBuzz(self):
        out = []

        for j in range(1, self.n+1):
            val = ''
            if j % 3 == 0:
                val = 'Fizz'
            if j % 5 == 0:
                val += 'Buzz'
            elif not val:
                val = str(j)
            out.append(val)
        
        print(out)
        return out


Solution().fizzBuzz()

Calling python examples/quick_start/fizz_buzz.py at the command line will print

['1', '2', 'Fizz', '4', 'Buzz', 'Fizz', '7', '8', 'Fizz', 'Buzz', '11', 'Fizz', '13', '14', 'FizzBuzz']

We can use expfig.Config to quickly replace 'Fizz', 'Buzz', and the integer n at the command line.

Let's define a file fizz_buzz_default_config.yaml:

n: 15
words:
  buzz: Buzz
  fizz: Fizz

and replace the values of 'Fizz', 'Buzz', and n in our script with the corresponding values in the config. We will also add a pretty-print of our config, just to keep track of what is going on:

# examples/quick_start/fizz_buzz.py
from expfig import Config


class Solution:
    config = Config(default='fizz_buzz_default_config.yaml')
    
    def fizzBuzz(self):
        self.config.pprint()
        out = []

        for j in range(1, self.config.n+1):
            val = ''
            if j % 3 == 0:
                val = self.config.words.fizz
            if j % 5 == 0:
                val += self.config.words.buzz
            elif not val:
                val = str(j)
            out.append(val)
        
        print(out)
        return out


Solution().fizzBuzz()

Calling python examples/quick_start/fizz_buzz.py at the command line will now print

config:
    n: 15
    words:
        buzz: Buzz
        fizz: Fizz
['1', '2', 'Fizz', '4', 'Buzz', 'Fizz', '7', '8', 'Fizz', 'Buzz', '11', 'Fizz', '13', '14', 'FizzBuzz']

which is, as expected, our config followed by the solution.

We can now easily modify any combination of our values:

$ python examples/quick_start/fizz_buzz.py --n 10 --words.buzz Buzzword

config:
    n: 10
    words:
        buzz: Buzzword
        fizz: Fizz
['1', '2', 'Fizz', '4', 'Buzzword', 'Fizz', '7', '8', 'Fizz', 'Buzzword']

This example can be viewed in the examples/quick_start directory.
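The dotted key --words.buzz addresses the nested entry words: buzz in the config. As a rough sketch of that mapping, the hypothetical helper below (parse_dotted_args is not part of expfig) turns a flat argument list into a nested dictionary. It leaves values as strings and assumes every flag is followed by exactly one value.

```python
def parse_dotted_args(argv):
    """Hypothetical helper: map '--a.b value' pairs onto a nested dict."""
    nested = {}
    # Walk the list two items at a time: a '--key' followed by its value.
    for flag, value in zip(argv[::2], argv[1::2]):
        *parents, leaf = flag.lstrip('-').split('.')
        node = nested
        for part in parents:
            # Create intermediate dictionaries as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


overrides = parse_dotted_args(['--n', '10', '--words.buzz', 'Buzzword'])
print(overrides)  # {'n': '10', 'words': {'buzz': 'Buzzword'}}
```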

verbose

--verbose is a special key that expfig.Config will read. It accepts non-negative integer values and prints the config with increasing verbosity depending on its value.

  • --verbose 0: nothing is printed (this is the default).
  • --verbose 1: the symmetric difference between the config and the default config is printed.
  • --verbose 2: the entire config is printed.

For example:

$ python examples/quick_start/fizz_buzz.py --n 10 --words.buzz Buzzword --verbose 1

 config:
     n: 10
     words:
         buzz: Buzzword
config:
    n: 10
    words:
        buzz: Buzzword
        fizz: Fizz
['1', '2', 'Fizz', '4', 'Buzzword', 'Fizz', '7', '8', 'Fizz', 'Buzzword']

The first block is the difference between the config and the default config, while the second is the pretty-print of the entire config.
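The difference block can be thought of as the entries of the config whose values differ from the default. The sketch below reproduces the first block of the output above using a hypothetical helper, diff_from_default. It is a one-sided comparison written for illustration and makes no claim about how expfig computes its difference internally.

```python
def diff_from_default(config, default):
    """Hypothetical helper: entries of `config` that differ from `default`."""
    changed = {}
    for key, value in config.items():
        if isinstance(value, dict) and isinstance(default.get(key), dict):
            # Recurse into nested sections and keep only non-empty differences.
            nested = diff_from_default(value, default[key])
            if nested:
                changed[key] = nested
        elif key not in default or default[key] != value:
            changed[key] = value
    return changed


default = {'n': 15, 'words': {'buzz': 'Buzz', 'fizz': 'Fizz'}}
config = {'n': 10, 'words': {'buzz': 'Buzzword', 'fizz': 'Fizz'}}

difference = diff_from_default(config, default)
print(difference)  # {'n': 10, 'words': {'buzz': 'Buzzword'}}
```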

Saving a Config

expfig.Config takes advantage of YAML-serialization (and de-serialization) for reproducibility.

You can use both expfig.Config.serialize and expfig.Config.serialize_to_dir for serialization.

expfig.Config.serialize performs a simple YAML-dump of the underlying dictionary:

# python examples/quick_start/serialize_fizz_buzz.py

from fizz_buzz import Solution

with open('simple_serialization.yaml', 'w') as f:
    Solution().config.serialize(f)

expfig.Config.serialize_to_dir will ensure that you are not overwriting any existing directories (if desired), and can also handle serializing the default config and the difference:

# python examples/quick_start/serialize_fizz_buzz.py


from fizz_buzz import Solution

# Serialize the underlying dict. Makes sure it does not overwrite any existing `fizz_buzz_configs` directory
# by appending an integer to the end of the name if one exists.
Solution().config.serialize_to_dir('fizz_buzz_configs')

# Same as the above, but also serialize the default config and the difference.
Solution().config.serialize_to_dir('fizz_buzz_configs_with_default', with_default=True)

You can then use expfig.Config.deserialize to load your saved serialization and reproduce your settings, for example:

from expfig import Config

with open('simple_serialization.yaml', 'r') as f:
    config = Config.deserialize(f)

Note that doing so effectively treats simple_serialization.yaml as a default config; you can use command-line arguments to update it upon loading.

Additional methods of inputting custom hyperparameters

There are three other ways to define custom settings/hyperparameters:

  1. You can pass the --config path_to_a_config.yaml argument at the command line.

    path_to_a_config.yaml may contain any combination of values as defined in your default config file; they must be in the same format. You may pass any number of config files this way:

    --config path_to_a_config.yaml path_to_another_config.yaml
    

    If you pass multiple config files with conflicting values, the value from the last config file will be used.

    Note that any values passed this way will be overridden by explicit arguments or arguments passed by the below two methods.

  2. You can pass a path to a YAML file containing settings to expfig.Config. The file may contain any combination of values as defined in your default config file; they must be in the same format. This is equivalent to the above except it is done within a script and not at the command line:

    from expfig import Config
    
    config = Config(config='path_to_a_config.yaml')
    
  3. You can pass a nested dictionary defining configuration settings to expfig.Config. For example:

    from expfig import Config
    
    config_dict = {
        'microgrid': {'config': {'scenario': 1}},
        'algo': {'sampler': {'type': 'local', 'n_workers': 4}},
        'context': {'verbose': True}
    }
    config = Config(config_dict)
    

Hyperparameter Resolution Order

You may pass the same key in different ways, with different values. For example, a key may appear in your default config, in a config passed via --config path_to_a_config.yaml, and as a direct argument at the command line: --key value.

For any key set in your default config, the resolution order is as follows:

  1. Values passed directly to expfig.Config upon object initialization. This includes values defined within a config file passed to expfig.Config: expfig.Config(config='path_to_a_config.yaml').

  2. Values passed explicitly at the command line.

  3. Values within a config file passed at the command line. If multiple config files are passed and the key is contained in more than one of said files, the value from the last file will be used.

  4. Values within your default config.
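The resolution order above can be sketched as a layered merge, applied from lowest to highest priority so that later layers win. The helpers deep_update and resolve are hypothetical and use plain dictionaries in place of YAML files and parsed arguments; this illustrates the ordering only and is not expfig's implementation.

```python
import copy


def deep_update(base, overrides):
    """Return a copy of `base` with `overrides` merged in, section by section."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_update(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def resolve(default, cli_config_files, cli_args, init_values):
    """Apply layers from lowest to highest priority; later layers win."""
    resolved = default
    for layer in [*cli_config_files, cli_args, init_values]:
        resolved = deep_update(resolved, layer)
    return resolved


default = {'n': 15, 'words': {'buzz': 'Buzz', 'fizz': 'Fizz'}}
resolved = resolve(
    default,
    # Two files passed via --config; the last one wins on conflicts.
    cli_config_files=[{'n': 20}, {'n': 30, 'words': {'fizz': 'Fuzz'}}],
    # Explicit command-line arguments override both files.
    cli_args={'n': 10},
    # Values passed at initialization have the highest priority.
    init_values={'words': {'buzz': 'Buzzword'}},
)
print(resolved)  # {'n': 10, 'words': {'buzz': 'Buzzword', 'fizz': 'Fuzz'}}
```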

Variations between different methods of setting parameters

1. Type casting

Values passed at the command line are cast to the type of the default value, as determined by the YAML load of the default config. For example, a default config containing value: 1 means that value is expected to be an int. Values passed in Python code or in separate config files are not cast this way.

There are two exceptions to this:

  1. If the default value is None, command-line arguments are parsed as strings.

  2. The string null passed at the command line results in the value None.
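The casting rule and its two exceptions might look roughly like the following. cast_cli_value is a hypothetical helper, not an expfig function, and the boolean branch is an assumption added here because bool('False') evaluates to True in Python; how expfig itself treats booleans is not stated above.

```python
def cast_cli_value(raw, default):
    """Hypothetical helper: cast a command-line string using the default's type."""
    if raw == 'null':
        # Exception 2: the string 'null' becomes None.
        return None
    if default is None:
        # Exception 1: with no default type to go by, keep the string.
        return raw
    if isinstance(default, bool):
        # Assumption: interpret common spellings of true explicitly.
        return raw.lower() in ('true', '1', 'yes')
    # General rule: reuse the type of the default value.
    return type(default)(raw)


print(cast_cli_value('10', 15))     # 10
print(cast_cli_value('0.5', 1.0))   # 0.5
print(cast_cli_value('10', None))   # 10 (still a string)
print(cast_cli_value('null', 15))   # None
```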

Additional Examples

An example of using expfig.Config to set hyperparameters for a machine learning problem is available in examples/knn. This example demonstrates a simple class to run a classification problem using scikit-learn's KNeighborsClassifier.
