Configue CLI

A configue extension that adds the ability to dynamically configure your application via the command line.

Configue CLI overlaps in functionality with Hydra, but it requires less boilerplate and is compatible with configue.

Installation

To install the library, use

pip install configue-cli

To develop locally, clone the repository and use

pip install -r requirements-dev.txt

Quick start

With configue-cli, configurations are defined as structured Python objects. Both native dataclasses and attrs classes are supported, and they can be nested arbitrarily.

import dataclasses
import attrs


@dataclasses.dataclass
class DatasetConfig:
    name: str
    n_samples: int = 10_000


@dataclasses.dataclass
class OptimizerConfig:
    learning_rate: float = 0.001
    weight_decay: float = 1e-2


@attrs.define
class ModelConfig:
    name: str
    batch_size: int = 12
    optimizer: OptimizerConfig = attrs.Factory(
        lambda self: OptimizerConfig(learning_rate=0.001 * self.batch_size), takes_self=True
    )


@dataclasses.dataclass
class ExperimentConfig:
    model: ModelConfig
    dataset: DatasetConfig

These objects are built at configuration time and passed to your application entrypoint by the inject_from_cli decorator. To use configue-cli, wrap a click entrypoint with the configue_cli.click.inject_from_cli decorator and provide the target type to inject.

import click
from configue_cli.click import inject_from_cli

@click.command()
@inject_from_cli(ExperimentConfig)
def main(config: ExperimentConfig) -> None:
    print("Passed configuration:", config)


if __name__ == "__main__":
    main()

To display a help message, use the following:

python main.py --help

Inspection of the configuration state

To visually inspect your application configuration state, use the following command:

$ python main.py --dry-run

╭─ Configuration helper ────────────────────────────────╮
│                                                       │
│  model                                                │
│  ├── (): __main__.ModelConfig                         │
│  ├── name: Missing                                    │
│  ├── batch_size: 12                                   │
│  └── optimizer                                        │
│      ├── (): __main__.OptimizerConfig                 │
│      ├── learning_rate: 0.012                         │
│      └── weight_decay: 0.01                           │
│                                                       │
│  dataset                                              │
│  ├── (): __main__.DatasetConfig                       │
│  ├── name: Missing                                    │
│  └── n_samples: 10000                                 │
│                                                       │
╰───────────────────────────────────────────────────────╯

This makes it easy to see which parameters are not yet defined (those marked as Missing) and which values the other parameters take, without reading the code.
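The idea behind the Missing markers can be sketched with the standard library alone. The snippet below is an illustration, not configue-cli's implementation: missing_parameters is a hypothetical helper, and ModelConfig is simplified to a plain dataclass without its nested optimizer. It walks nested dataclasses and reports the dotted path of every field that has no default.

```python
import dataclasses
import typing


@dataclasses.dataclass
class DatasetConfig:
    name: str
    n_samples: int = 10_000


@dataclasses.dataclass
class ModelConfig:  # simplified stand-in: plain dataclass, no nested optimizer
    name: str
    batch_size: int = 12


@dataclasses.dataclass
class ExperimentConfig:
    model: ModelConfig
    dataset: DatasetConfig


def missing_parameters(config_type: type, prefix: str = "") -> typing.Iterator[str]:
    """Yield dotted paths of fields without a default (hypothetical helper)."""
    hints = typing.get_type_hints(config_type)
    for field in dataclasses.fields(config_type):
        path = prefix + field.name
        field_type = hints[field.name]
        if dataclasses.is_dataclass(field_type):
            # Nested configuration: inspect its own fields.
            yield from missing_parameters(field_type, prefix=path + ".")
        elif (
            field.default is dataclasses.MISSING
            and field.default_factory is dataclasses.MISSING
        ):
            yield path


missing = list(missing_parameters(ExperimentConfig))
print(missing)
```

The two paths it reports correspond to the two Missing entries in the tree above.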

Configuration from the command line

Parameters can be specified from the command line using dotted notation.

$ python main.py model.name=camembert-base dataset.name=fquad model.batch_size=48

╭─ Configuration ───────────────────────────────────────────────────────────────────────────╮
│                                                                                           │
│  model                                                                                    │
│  ├── (): __main__.ModelConfig                                                             │
│  ├── name: camembert-base                                                                 │
│  ├── batch_size: 48                                                                       │
│  └── optimizer                                                                            │
│      ├── (): __main__.OptimizerConfig                                                     │
│      ├── learning_rate: 0.048                                                             │
│      └── weight_decay: 0.01                                                               │
│                                                                                           │
│  dataset                                                                                  │
│  ├── (): __main__.DatasetConfig                                                           │
│  ├── name: fquad                                                                          │
│  └── n_samples: 10000                                                                     │
│                                                                                           │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
Passed configuration: ExperimentConfig(model=ModelConfig(name='camembert-base', batch_size=48, optimizer=OptimizerConfig(learning_rate=0.048, weight_decay=0.01)), dataset=DatasetConfig(name='fquad', n_samples=10000))
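To make the dotted notation concrete, here is a minimal sketch of how key=value arguments could be expanded into a nested dictionary. parse_overrides is a hypothetical helper written for this illustration and is not part of configue-cli. It also leaves values as strings, whereas the output above shows that the library converts them (batch_size is displayed as the integer 48).

```python
from typing import Any, Dict, List


def parse_overrides(arguments: List[str]) -> Dict[str, Any]:
    """Expand ["a.b=1", "c=x"] into a nested dictionary (illustration only)."""
    config: Dict[str, Any] = {}
    for argument in arguments:
        dotted_key, _, raw_value = argument.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(key, {})
        node[leaf] = raw_value  # values are kept as strings in this sketch
    return config


overrides = parse_overrides(
    ["model.name=camembert-base", "dataset.name=fquad", "model.batch_size=48"]
)
print(overrides)
```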

Any required parameter that is still missing at configuration time raises an exception:

$ python main.py model.batch_size=3

Traceback (most recent call last):
  ...
configue_cli.core.exceptions.MissingMandatoryValueError: Missing mandatory value: dataset.name

Configuration with YAML files

Any parameter can be overridden using a configue-compliant YAML file. Suppose the model is configured in the following model.yml file:

model:
  (): __main__.ModelConfig
  name: camembert-large
  batch_size: 72
  optimizer:
    (): __main__.OptimizerConfig
    learning_rate: 0.01
    weight_decay: 0.0

This configuration file can be loaded from the CLI using the -c flag:

$ python main.py -c model.yml --dry-run

╭─ Configuration helper ────────────────────────────────────╮
│                                                           │
│  model                                                    │
│  ├── (): __main__.ModelConfig                             │
│  ├── name: camembert-large                                │
│  ├── batch_size: 72                                       │
│  └── optimizer                                            │
│      ├── (): __main__.OptimizerConfig                     │
│      ├── learning_rate: 0.01                              │
│      └── weight_decay: 0.0                                │
│                                                           │
│  dataset                                                  │
│  ├── (): __main__.DatasetConfig                           │
│  ├── name: Missing                                        │
│  └── n_samples: 10000                                     │
│                                                           │
╰───────────────────────────────────────────────────────────╯

Multiple configuration files can be used simultaneously. The final configuration is assembled by merging all files in the order they are provided. For instance, suppose we have the following large_batch.yml file:

model:
  batch_size: 512

This file can be merged into our previous configuration using the following:

$ python main.py -c model.yml -c large_batch.yml --dry-run

╭─ Configuration helper ────────────────────────────────────╮
│                                                           │
│  model                                                    │
│  ├── (): __main__.ModelConfig                             │
│  ├── name: camembert-large                                │
│  ├── batch_size: 512                                      │
│  └── optimizer                                            │
│      ├── (): __main__.OptimizerConfig                     │
│      ├── learning_rate: 0.01                              │
│      └── weight_decay: 0.0                                │
│                                                           │
│  dataset                                                  │
│  ├── (): __main__.DatasetConfig                           │
│  ├── name: Missing                                        │
│  └── n_samples: 10000                                     │
│                                                           │
╰───────────────────────────────────────────────────────────╯
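The merge behaviour shown above can be approximated with a recursive dictionary merge in which later files win on conflicting keys. This is a sketch under assumptions: merge is a hypothetical helper, the dictionaries stand in for the parsed contents of the two YAML files, and configue-cli's actual merging logic may differ.

```python
import copy
from typing import Any, Dict


def merge(base: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `update` into a copy of `base`; `update` wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Plain dictionaries standing in for the parsed contents of the two YAML files.
model_yml = {
    "model": {
        "()": "__main__.ModelConfig",
        "name": "camembert-large",
        "batch_size": 72,
        "optimizer": {
            "()": "__main__.OptimizerConfig",
            "learning_rate": 0.01,
            "weight_decay": 0.0,
        },
    }
}
large_batch_yml = {"model": {"batch_size": 512}}

final = merge(model_yml, large_batch_yml)
print(final["model"]["batch_size"])  # 512
print(final["model"]["name"])  # camembert-large
```

Only batch_size changes; every key that large_batch.yml does not mention keeps its value from model.yml.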

Parameters specified with the command line take precedence over the ones specified in YAML files:

$ python main.py model.batch_size=32 -c model.yml -c large_batch.yml --dry-run

╭─ Configuration helper ────────────────────────────────────╮
│                                                           │
│  model                                                    │
│  ├── (): __main__.ModelConfig                             │
│  ├── name: camembert-large                                │
│  ├── batch_size: 32                                       │
│  └── optimizer                                            │
│      ├── (): __main__.OptimizerConfig                     │
│      ├── learning_rate: 0.01                              │
│      └── weight_decay: 0.0                                │
│                                                           │
│  dataset                                                  │
│  ├── (): __main__.DatasetConfig                           │
│  ├── name: Missing                                        │
│  └── n_samples: 10000                                     │
│                                                           │
╰───────────────────────────────────────────────────────────╯

This feature encourages a modular configuration pattern where different subparts of the application (the model and the dataset in this example) are configured in separate YAML files and are dynamically assembled at configuration time. Different variations of these subparts can easily be assembled. All arguments can be overridden using the command line without having to edit the config files.

Exporting the final configuration

To ease reproducibility, the final configuration used for the run can be exported by using the -o flag and specifying an output YAML file:

$ python main.py dataset.name=hello-world -c model.yml -c large_batch.yml -o output.yml

╭─ Configuration ───────────────────────────────────────────╮
│                                                           │
│  model                                                    │
│  ├── (): __main__.ModelConfig                             │
│  ├── name: camembert-large                                │
│  ├── batch_size: 512                                      │
│  └── optimizer                                            │
│      ├── (): __main__.OptimizerConfig                     │
│      ├── learning_rate: 0.01                              │
│      └── weight_decay: 0.0                                │
│                                                           │
│  dataset                                                  │
│  ├── (): __main__.DatasetConfig                           │
│  ├── name: hello-world                                    │
│  └── n_samples: 10000                                     │
│                                                           │
╰───────────────────────────────────────────────────────────╯
Passed configuration: ExperimentConfig(model=ModelConfig(name='camembert-large', batch_size=512, optimizer=OptimizerConfig(learning_rate=0.01, weight_decay=0.0)), dataset=DatasetConfig(name='hello-world', n_samples=10000))

$ cat output.yml
model:
  (): __main__.ModelConfig
  name: camembert-large
  batch_size: 512
  optimizer:
    (): __main__.OptimizerConfig
    learning_rate: 0.01
    weight_decay: 0.0
dataset:
  (): __main__.DatasetConfig
  name: hello-world
  n_samples: 10000
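In the exported file, each object carries a "()" key holding the import path of its class, followed by its parameters. The sketch below shows one way such a structure could be derived from a dataclass instance. to_configue_dict is a hypothetical helper written for this illustration; it is not the function configue-cli uses for the export.

```python
import dataclasses
from typing import Any, Dict


@dataclasses.dataclass
class DatasetConfig:
    name: str
    n_samples: int = 10_000


def to_configue_dict(obj: Any) -> Any:
    """Convert a dataclass instance into a dict led by a "()" type path."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        cls = type(obj)
        result: Dict[str, Any] = {"()": f"{cls.__module__}.{cls.__qualname__}"}
        for field in dataclasses.fields(obj):
            # Recurse so that nested dataclasses get their own "()" key.
            result[field.name] = to_configue_dict(getattr(obj, field.name))
        return result
    return obj


exported = to_configue_dict(DatasetConfig(name="hello-world"))
print(exported)
```

Dumping such a dictionary with a YAML library would produce a layout similar to the dataset section of output.yml.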

Unstructured configuration

It is possible to use the inject_from_cli decorator without specifying a target type:

import click
import configue_cli.core.dict_config
from configue_cli.click import inject_from_cli

@click.command()
@inject_from_cli()
def main(config: configue_cli.core.dict_config.DictConfig) -> None:
    ...

In that case, the wrapped entrypoint will be passed a configue_cli.core.dict_config.DictConfig object upon injection.

Configuring the logging

To load a logging configuration located under the "logging" key in your final configuration, use the following:

@click.command()
@inject_from_cli(ExperimentConfig, logging_config_path="logging")
def main(config: ExperimentConfig) -> None:
    ...
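This section does not say what the "logging" section should contain. As an assumption, the sketch below treats it as a dictionary following the schema of the standard library's logging.config.dictConfig and applies it directly. The formatter, handler and logger names are illustrative only.

```python
import logging
import logging.config

# Assumed content of the "logging" key, following the dictConfig schema.
logging_section = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "simple",
            "level": "INFO",
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}

logging.config.dictConfig(logging_section)
logger = logging.getLogger("experiment")
logger.info("Logging configured")
```

In a YAML configuration file, the same structure would sit under the top-level logging key.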

Integration with SkyPilot

SkyPilot is a framework for easily running jobs on any cloud through a unified interface. Any function decorated with inject_from_cli can be executed remotely by providing a SkyPilot configuration.

The following configuration defines a job to be executed in a SkyPilot cluster named test-cluster. The job is defined under the task key; see the SkyPilot YAML specification for details on this section.

The Python command and its arguments are captured and interpolated into the run command through the {command} and {parameters} placeholders, respectively.

# skypilot.yml
skypilot:
  cluster-name: test-cluster
  task:
    resources:
      cloud: gcp
      accelerators: K80:1
    workdir: .
    setup: |
      echo 'Setup the job...'
    run: |
      set -e
      cd ~/sky_workdir
      {command} {parameters}
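The placeholder substitution in the run section can be pictured with plain string formatting. This is a sketch under assumptions: render_run is a hypothetical helper, and the split between {command} (here "python main.py") and {parameters} (the remaining arguments) is a guess at how configue-cli fills the two placeholders.

```python
import shlex
from typing import List

# The run script from the configuration above, as a Python string.
run_template = "set -e\ncd ~/sky_workdir\n{command} {parameters}\n"


def render_run(template: str, command: str, parameters: List[str]) -> str:
    """Fill the {command} and {parameters} placeholders (illustration only)."""
    # Quote each parameter so that it survives shell parsing on the remote side.
    quoted = " ".join(shlex.quote(parameter) for parameter in parameters)
    return template.format(command=command, parameters=quoted)


rendered = render_run(
    run_template,
    command="python main.py",
    parameters=["model.name=camembert-base", "dataset.name=fquad"],
)
print(rendered)
```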

To load the SkyPilot configuration in your final configuration, use the following:

@click.command()
@inject_from_cli(ExperimentConfig, skypilot_config_path="skypilot")
def main(config: ExperimentConfig) -> None:
    ...

As with the other arguments, all SkyPilot configuration arguments can be redefined on the fly:

python main.py -c skypilot.yml skypilot.cluster-name=another-cluster
