Strongly typed, zero-effort CLIs

dcargs

Overview

pip install dcargs

dcargs is a library for typed CLI interfaces and configuration objects.

Our core interface generates an argument parser from type-annotated callables, which may be functions, classes, or dataclasses:

dcargs.cli
dcargs.cli(
    f: Callable[..., T],
    *,
    description: Optional[str]=None,
    args: Optional[Sequence[str]]=None,
    default_instance: Optional[T]=None,
) -> T
Call `f(...)`, with arguments populated from an automatically generated CLI
interface.

`f` should have type-annotated inputs, and can be a function, class, or dataclass.
Note that if `f` is a class, `dcargs.cli()` returns an instance.

The parser is generated by populating helptext from docstrings and types from
annotations; a broad range of core type annotations are supported...
    - Types natively accepted by `argparse`: str, int, float, pathlib.Path, etc.
    - Default values for optional parameters.
    - Booleans, which are automatically converted to flags when provided a default
      value.
    - Enums (via `enum.Enum`).
    - Various container types. Some examples:
      - `typing.ClassVar`.
      - `typing.Optional`.
      - `typing.Literal`.
      - `typing.Sequence`.
      - `typing.List`.
      - `typing.Tuple`, such as `typing.Tuple[T1, T2, T3]` or
        `typing.Tuple[T, ...]`.
      - `typing.Set`.
      - `typing.Final` and `typing.Annotated`.
      - Nested combinations of the above: `Optional[Literal[T]]`,
        `Final[Optional[Sequence[T]]]`, etc.
    - Nested dataclasses.
      - Simple nesting.
      - Unions over nested dataclasses (subparsers).
      - Optional unions over nested dataclasses (optional subparsers).
    - Generic dataclasses (including nested generics).

Args:
    f: Callable.

Keyword Args:
    description: Description text for the parser, displayed when the --help flag is
        passed in. If not specified, `f`'s docstring is used. Mirrors argument
        from `argparse.ArgumentParser()`.
    args: If set, parse arguments from a sequence of strings instead of the
        commandline. Mirrors argument from `argparse.ArgumentParser.parse_args()`.
    default_instance: An instance of `T` to use for default values; only supported
        if `T` is a dataclass type. Helpful for merging CLI arguments with values loaded
        from elsewhere (for example, a config object loaded from a YAML file).

Returns:
    The output of `f(...)`.
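
The parser-generation idea described above (flags from parameter names, types from annotations, optional arguments from defaults) can be illustrated with a rough standard-library sketch. This is not how dcargs is implemented; `sketch_cli` is a hypothetical helper, and it only handles plain types such as `str` and `int` plus booleans that default to `False`.

```python
import argparse
import inspect
from typing import Any, Callable, Optional, Sequence


def sketch_cli(f: Callable[..., Any], args: Optional[Sequence[str]] = None) -> Any:
    """Hypothetical helper: call `f` with keyword arguments parsed from `args`."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(f))
    for name, param in inspect.signature(f).parameters.items():
        flag = "--" + name.replace("_", "-")
        has_default = param.default is not inspect.Parameter.empty
        if param.annotation is bool and has_default and param.default is False:
            # Booleans that default to False become on/off flags.
            parser.add_argument(flag, dest=name, action="store_true")
        else:
            # Everything else: the annotation doubles as the string converter.
            parser.add_argument(
                flag,
                dest=name,
                type=param.annotation,
                required=not has_default,
                default=param.default if has_default else None,
            )
    return f(**vars(parser.parse_args(args)))


def main(field1: str, field2: int = 3, flag: bool = False) -> tuple:
    """Function whose arguments will be populated from a CLI."""
    return (field1, field2, flag)


result = sketch_cli(main, args=["--field1", "hello", "--flag"])
print(result)  # ('hello', 3, True)
```

Passing `args` explicitly, as the `args` keyword of `dcargs.cli()` also allows, keeps the sketch independent of `sys.argv`.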

The goal is a tool that's lightweight enough for simple interactive scripts, but flexible enough to replace heavier configuration frameworks. Notably, dcargs.cli() supports nested classes and dataclasses, which enable expressive hierarchical configuration objects built on standard Python features.

The ultimate goal is the ability to build interfaces that are:

  • Low-effort. Type annotations, docstrings, and default values can be used to automatically generate argument parsers with informative helptext. This includes bells and whistles like enums, containers, etc.
  • Strongly typed. Unlike the dynamic configuration namespaces produced by libraries like argparse, YACS, abseil, hydra, or ml_collections, typed outputs mean that IDE-assisted autocomplete, rename, refactor, and go-to-definition operations work out of the box, as do static checking tools like mypy and pyright.
  • Modular. Most approaches to configuration objects require a centralized definition of all configurable fields. Supporting hierarchically nested configuration classes/dataclasses, however, makes it easy to distribute definitions, defaults, and documentation of configurable fields across modules or source files. A model configuration dataclass, for example, can be co-located in its entirety with the model implementation and dropped into any experiment configuration with an import — this eliminates redundancy and makes the entire module easy to port across codebases.

Examples

A series of example scripts covering core features is included below.

1. Simple Functions

examples/01_simple_functions.py

"""CLI generation example from a simple annotated function. `dcargs.cli()` will call
`main()`, with arguments populated from the CLI."""

import dcargs


def main(
    field1: str,
    field2: int = 3,
    flag: bool = False,
) -> None:
    """Function, whose arguments will be populated from a CLI interface.

    Args:
        field1: A string field.
        field2: A numeric field, with a default value.
        flag: A boolean flag.
    """
    print(field1, field2, flag)


if __name__ == "__main__":
    dcargs.cli(main)

$ python examples/01_simple_functions.py --help
usage: 01_simple_functions.py [-h] --field1 STR [--field2 INT] [--flag]

Function, whose arguments will be populated from a CLI interface.

required arguments:
  --field1 STR  A string field.

optional arguments:
  -h, --help    show this help message and exit
  --field2 INT  A numeric field, with a default value. (default: 3)
  --flag        A boolean flag.

2. Simple Dataclasses

examples/02_simple_dataclasses.py

"""Example using dcargs.cli() to instantiate a dataclass."""

import dataclasses

import dcargs


@dataclasses.dataclass
class Args:
    """Description.
    This should show up in the helptext!"""

    field1: str  # A string field.
    field2: int = 3  # A numeric field, with a default value.
    flag: bool = False  # A boolean flag.


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
    print()
    print(dcargs.to_yaml(args))

$ python examples/02_simple_dataclasses.py --help
usage: 02_simple_dataclasses.py [-h] --field1 STR [--field2 INT] [--flag]

Description.
This should show up in the helptext!

required arguments:
  --field1 STR  A string field.

optional arguments:
  -h, --help    show this help message and exit
  --field2 INT  A numeric field, with a default value. (default: 3)
  --flag        A boolean flag.

3. Enums And Containers

examples/03_enums_and_containers.py

"""Examples of more advanced type annotations: enums and container types.

For collections, we only showcase Tuple here, but List, Sequence, Set, etc. are all
supported as well."""

import dataclasses
import enum
import pathlib
from typing import Optional, Tuple

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class TrainConfig:
    # Example of a variable-length tuple:
    dataset_sources: Tuple[pathlib.Path, ...]
    """Paths to load training data from. This can be multiple!"""

    # Fixed-length tuples are also okay:
    image_dimensions: Tuple[int, int]
    """Height and width of some image data."""

    # Enums are handled seamlessly.
    optimizer_type: OptimizerType
    """Gradient-based optimizer to use."""

    # We can also explicitly mark arguments as optional.
    checkpoint_interval: Optional[int]
    """Interval to save checkpoints at."""


if __name__ == "__main__":
    config = dcargs.cli(TrainConfig)
    print(config)

$ python examples/03_enums_and_containers.py --help
usage: 03_enums_and_containers.py [-h] --dataset-sources PATH [PATH ...]
                                  --image-dimensions INT INT --optimizer-type
                                  {ADAM,SGD} [--checkpoint-interval INT]

required arguments:
  --dataset-sources PATH [PATH ...]
                        Paths to load training data from. This can be multiple!
  --image-dimensions INT INT
                        Height and width of some image data.
  --optimizer-type {ADAM,SGD}
                        Gradient-based optimizer to use.

optional arguments:
  -h, --help            show this help message and exit
  --checkpoint-interval INT
                        Interval to save checkpoints at. (default: None)
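
For readers more familiar with `argparse`, the usage string above corresponds roughly to the hand-written parser below. This is a sketch under the assumption that variable-length tuples behave like `nargs="+"`, fixed-length tuples like `nargs=2`, and enums like `choices` over member names; it is not taken from the dcargs source.

```python
import argparse
import enum
import pathlib


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


parser = argparse.ArgumentParser()
# Tuple[pathlib.Path, ...]: one or more values.
parser.add_argument("--dataset-sources", type=pathlib.Path, nargs="+", required=True)
# Tuple[int, int]: exactly two values.
parser.add_argument("--image-dimensions", type=int, nargs=2, required=True)
# Enum: choices are the member names, converted back to members after parsing.
parser.add_argument(
    "--optimizer-type", choices=[m.name for m in OptimizerType], required=True
)

ns = parser.parse_args(
    [
        "--dataset-sources", "a", "b",
        "--image-dimensions", "32", "32",
        "--optimizer-type", "SGD",
    ]
)
# argparse returns lists; convert to the annotated tuple and enum types.
dataset_sources = tuple(ns.dataset_sources)
image_dimensions = tuple(ns.image_dimensions)
optimizer_type = OptimizerType[ns.optimizer_type]
print(dataset_sources, image_dimensions, optimizer_type)
```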
4. Flags

examples/04_flags.py

"""Example of how booleans are handled and automatically converted to flags."""

import dataclasses
from typing import Optional

import dcargs


@dataclasses.dataclass
class Args:
    # Boolean. This expects an explicit "True" or "False".
    boolean: bool

    # Optional boolean. Same as above, but can be omitted.
    optional_boolean: Optional[bool]

    # Pass --flag-a in to set this value to True.
    flag_a: bool = False

    # Pass --no-flag-b in to set this value to False.
    flag_b: bool = True


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
    print()
    print(dcargs.to_yaml(args))

$ python examples/04_flags.py --help
usage: 04_flags.py [-h] --boolean {True,False}
                   [--optional-boolean {True,False}] [--flag-a] [--no-flag-b]

required arguments:
  --boolean {True,False}
                        Boolean. This expects an explicit "True" or "False".

optional arguments:
  -h, --help            show this help message and exit
  --optional-boolean {True,False}
                        Optional boolean. Same as above, but can be omitted. (default: None)
  --flag-a              Pass --flag-a in to set this value to True.
  --no-flag-b           Pass --no-flag-b in to set this value to False.
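
The flag conventions shown in this help text can be reproduced by hand with `argparse`. The sketch below is only an illustration of the resulting behavior, assuming `store_true`/`store_false` actions for booleans with defaults; dcargs may build its parser differently.

```python
import argparse

parser = argparse.ArgumentParser()
# boolean: bool (no default): an explicit "True" or "False" is required.
parser.add_argument("--boolean", choices=["True", "False"], required=True)
# flag_a: bool = False: passing --flag-a sets the value to True.
parser.add_argument("--flag-a", dest="flag_a", action="store_true", default=False)
# flag_b: bool = True: passing --no-flag-b sets the value to False.
parser.add_argument("--no-flag-b", dest="flag_b", action="store_false", default=True)

ns = parser.parse_args(["--boolean", "True", "--no-flag-b"])
boolean = ns.boolean == "True"  # convert the string choice to a real bool
print(boolean, ns.flag_a, ns.flag_b)  # True False False
```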
5. Hierarchical Configs

examples/05_hierarchical_configs.py

"""An example of how we can create hierarchical configuration interfaces by nesting
dataclasses."""

import dataclasses
import enum
import pathlib

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    # Gradient-based optimizer to use.
    algorithm: OptimizerType = OptimizerType.ADAM

    # Learning rate to use.
    learning_rate: float = 3e-4

    # Coefficient for L2 regularization.
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    # Various configurable options for our optimizer.
    optimizer: OptimizerConfig

    # Batch size.
    batch_size: int = 32

    # Total number of training steps.
    train_steps: int = 100_000

    # Random seed. This is helpful for making sure that our experiments are all
    # reproducible!
    seed: int = 0


def train(
    out_dir: pathlib.Path,
    /,
    config: ExperimentConfig,
    restore_checkpoint: bool = False,
    checkpoint_interval: int = 1000,
) -> None:
    """Train a model.

    Args:
        out_dir: Where to save logs and checkpoints.
        config: Experiment configuration.
        restore_checkpoint: Set to restore an existing checkpoint.
        checkpoint_interval: Training steps between each checkpoint save.
    """
    print(out_dir)
    print("---")
    print(dcargs.to_yaml(config))
    print("---")
    print(restore_checkpoint)
    print(checkpoint_interval)


if __name__ == "__main__":
    dcargs.cli(train)

$ python examples/05_hierarchical_configs.py --help
usage: 05_hierarchical_configs.py [-h]
                                  [--config.optimizer.algorithm {ADAM,SGD}]
                                  [--config.optimizer.learning-rate FLOAT]
                                  [--config.optimizer.weight-decay FLOAT]
                                  [--config.batch-size INT]
                                  [--config.train-steps INT]
                                  [--config.seed INT] [--restore-checkpoint]
                                  [--checkpoint-interval INT]
                                  OUT_DIR

Train a model.

positional arguments:
  OUT_DIR               Where to save logs and checkpoints.

optional arguments:
  -h, --help            show this help message and exit
  --restore-checkpoint  Set to restore an existing checkpoint.
  --checkpoint-interval INT
                        Training steps between each checkpoint save. (default: 1000)

optional config.optimizer arguments:
  Various configurable options for our optimizer.

  --config.optimizer.algorithm {ADAM,SGD}
                        Gradient-based optimizer to use. (default: ADAM)
  --config.optimizer.learning-rate FLOAT
                        Learning rate to use. (default: 0.0003)
  --config.optimizer.weight-decay FLOAT
                        Coefficient for L2 regularization. (default: 0.01)

optional config arguments:
  Experiment configuration.

  --config.batch-size INT
                        Batch size. (default: 32)
  --config.train-steps INT
                        Total number of training steps. (default: 100000)
  --config.seed INT     Random seed. This is helpful for making sure that our experiments are all
                        reproducible! (default: 0)
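
The dotted flag names above follow the nesting of the dataclasses. As a rough illustration of that naming scheme (not the dcargs implementation), the hypothetical `dotted_flags` helper below walks a trimmed-down copy of the config classes and yields one flag per leaf field.

```python
import dataclasses
from typing import Iterator, Tuple


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    learning_rate: float = 3e-4
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    optimizer: OptimizerConfig
    batch_size: int = 32


def dotted_flags(cls: type, prefix: str = "") -> Iterator[Tuple[str, type]]:
    """Hypothetical helper: yield (flag, type) pairs, recursing into nested dataclasses."""
    for field in dataclasses.fields(cls):
        name = prefix + field.name.replace("_", "-")
        if dataclasses.is_dataclass(field.type):
            # Nested dataclass: extend the prefix and descend.
            yield from dotted_flags(field.type, prefix=name + ".")
        else:
            yield "--" + name, field.type


flags = list(dotted_flags(ExperimentConfig, prefix="config."))
for flag, typ in flags:
    print(flag, typ.__name__)
# --config.optimizer.learning-rate float
# --config.optimizer.weight-decay float
# --config.batch-size int
```

The sketch relies on annotations being real classes; with `from __future__ import annotations` the field types would be strings and would need resolving first.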
6. Literals

examples/06_literals.py

"""typing.Literal[] can be used to specify accepted input choices."""

import dataclasses
import enum
from typing import Literal

import dcargs


class Color(enum.Enum):
    RED = enum.auto()
    GREEN = enum.auto()
    BLUE = enum.auto()


@dataclasses.dataclass(frozen=True)
class Args:
    enum: Color
    restricted_enum: Literal[Color.RED, Color.GREEN]

    integer: Literal[0, 1, 2, 3]
    string: Literal["red", "green"]

    restricted_enum_with_default: Literal[Color.RED, Color.GREEN] = Color.GREEN
    integer_with_default: Literal[0, 1, 2, 3] = 3
    string_with_Default: Literal["red", "green"] = "red"


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
    print()
    print(dcargs.to_yaml(args))

$ python examples/06_literals.py --help
usage: 06_literals.py [-h] --enum {RED,GREEN,BLUE} --restricted-enum
                      {RED,GREEN} --integer {0,1,2,3} --string {red,green}
                      [--restricted-enum-with-default {RED,GREEN}]
                      [--integer-with-default {0,1,2,3}]
                      [--string-with-Default {red,green}]

required arguments:
  --enum {RED,GREEN,BLUE}
  --restricted-enum {RED,GREEN}
  --integer {0,1,2,3}
  --string {red,green}

optional arguments:
  -h, --help            show this help message and exit
  --restricted-enum-with-default {RED,GREEN}
                        (default: GREEN)
  --integer-with-default {0,1,2,3}
                        (default: 3)
  --string-with-Default {red,green}
                        (default: red)
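
The `{0,1,2,3}` and `{red,green}` choice lists come straight from the `Literal` annotations. A minimal standard-library sketch of that mapping, assuming every choice in a given `Literal` shares one type, uses `typing.get_args`:

```python
import argparse
from typing import Literal, get_args

Integer = Literal[0, 1, 2, 3]
String = Literal["red", "green"]

parser = argparse.ArgumentParser()
for flag, annotation in [("--integer", Integer), ("--string", String)]:
    choices = get_args(annotation)  # e.g. (0, 1, 2, 3)
    parser.add_argument(
        flag,
        type=type(choices[0]),  # assumption: all choices have the same type
        choices=choices,
        required=True,
    )

ns = parser.parse_args(["--integer", "2", "--string", "green"])
print(ns.integer, ns.string)  # 2 green
```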
7. Positional Args

examples/07_positional_args.py

"""Positional-only arguments in functions are converted to positional CLI arguments."""

from __future__ import annotations

import dataclasses
import enum
import pathlib
from typing import Tuple

import dcargs


def main(
    source: pathlib.Path,
    dest: pathlib.Path,
    /,  # Mark the end of positional arguments.
    optimizer: OptimizerConfig,
    force: bool = False,
    verbose: bool = False,
    background_rgb: Tuple[float, float, float] = (1.0, 0.0, 0.0),
) -> None:
    """Command-line interface defined using a function signature. Note that this
    docstring is parsed to generate helptext.

    Args:
        optimizer: Configuration for our optimizer object.
        force: Do not prompt before overwriting.
        verbose: Explain what is being done.
        background_rgb: Background color. Red by default.
    """
    print(
        f"{source.absolute()=}"
        "\n"
        f"{dest.absolute()=}"
        "\n"
        f"{optimizer=}"
        "\n"
        f"{force=}"
        "\n"
        f"{verbose=}"
        "\n"
        f"{background_rgb=}"
    )


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    algorithm: OptimizerType = OptimizerType.ADAM
    """Gradient-based optimizer to use."""

    learning_rate: float = 3e-4
    """Learning rate to use."""

    weight_decay: float = 1e-2
    """Coefficient for L2 regularization."""


if __name__ == "__main__":
    dcargs.cli(main)

$ python examples/07_positional_args.py --help
usage: 07_positional_args.py [-h] [--optimizer.algorithm {ADAM,SGD}]
                             [--optimizer.learning-rate FLOAT]
                             [--optimizer.weight-decay FLOAT] [--force]
                             [--verbose] [--background-rgb FLOAT FLOAT FLOAT]
                             SOURCE DEST

Command-line interface defined using a function signature. Note that this
docstring is parsed to generate helptext.

positional arguments:
  SOURCE                PATH
  DEST                  PATH

optional arguments:
  -h, --help            show this help message and exit
  --force               Do not prompt before overwriting.
  --verbose             Explain what is being done.
  --background-rgb FLOAT FLOAT FLOAT
                        Background color. Red by default. (default: 1.0 0.0 0.0)

optional optimizer arguments:
  Configuration for our optimizer object.

  --optimizer.algorithm {ADAM,SGD}
                        Gradient-based optimizer to use. (default: ADAM)
  --optimizer.learning-rate FLOAT
                        Learning rate to use. (default: 0.0003)
  --optimizer.weight-decay FLOAT
                        Coefficient for L2 regularization. (default: 0.01)

8. Standard Classes

examples/08_standard_classes.py

"""In addition to functions and dataclasses, we can also generate CLIs from (the
constructors of) standard Python classes."""

import dcargs


class Args:
    def __init__(
        self,
        field1: str,
        field2: int,
        flag: bool = False,
    ):
        """Arguments.

        Args:
            field1: A string field.
            field2: A numeric field.
            flag: A boolean flag.
        """
        self.data = [field1, field2, flag]


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args.data)

$ python examples/08_standard_classes.py --help
usage: 08_standard_classes.py [-h] --field1 STR --field2 INT [--flag]

required arguments:
  --field1 STR  Arguments.

                Args:
                    field1: A string field.
                    field2: A numeric field.
                    flag: A boolean flag.
  --field2 INT  Arguments.

                Args:
                    field1: A string field.
                    field2: A numeric field.
                    flag: A boolean flag.

optional arguments:
  -h, --help    show this help message and exit
  --flag        Arguments.

                Args:
                    field1: A string field.
                    field2: A numeric field.
                    flag: A boolean flag.

9. Subparsers

examples/09_subparsers.py

"""Unions over nested types (classes or dataclasses) will result in subparsers."""

from __future__ import annotations

import dataclasses
from typing import Union

import dcargs


def main(command: Union[Checkout, Commit]) -> None:
    print(command)


@dataclasses.dataclass(frozen=True)
class Checkout:
    """Checkout a branch."""

    branch: str


@dataclasses.dataclass(frozen=True)
class Commit:
    """Commit changes."""

    message: str
    all: bool = False


if __name__ == "__main__":
    dcargs.cli(main)

$ python examples/09_subparsers.py --help
usage: 09_subparsers.py [-h] {checkout,commit} ...

optional arguments:
  -h, --help         show this help message and exit

subcommands:
  {checkout,commit}
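
In plain `argparse` terms, a union over dataclasses behaves like one subcommand per union member. The sketch below is an assumption-laden illustration of that idea (subcommand names taken from lowercased class names, every field required), not the dcargs implementation.

```python
import argparse
import dataclasses
from typing import Union, get_args


@dataclasses.dataclass(frozen=True)
class Checkout:
    branch: str


@dataclasses.dataclass(frozen=True)
class Commit:
    message: str


parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest="subcommand", required=True)
for member in get_args(Union[Checkout, Commit]):
    # One subcommand per union member, named after the lowercased class name.
    sub = subparsers.add_parser(member.__name__.lower())
    sub.set_defaults(command_cls=member)
    for field in dataclasses.fields(member):
        sub.add_argument("--" + field.name, type=field.type, required=True)

parsed = vars(parser.parse_args(["checkout", "--branch", "main"]))
command_cls = parsed.pop("command_cls")
parsed.pop("subcommand")
command = command_cls(**parsed)  # instantiate the selected dataclass
print(command)  # Checkout(branch='main')
```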
10. Generics

examples/10_generics.py

"""Example of parsing for generic (~templated) dataclasses."""

import dataclasses
from typing import Generic, TypeVar

import dcargs

ScalarType = TypeVar("ScalarType")
ShapeType = TypeVar("ShapeType")


@dataclasses.dataclass(frozen=True)
class Point3(Generic[ScalarType]):
    x: ScalarType
    y: ScalarType
    z: ScalarType
    frame_id: str


@dataclasses.dataclass(frozen=True)
class Triangle:
    a: Point3[float]
    b: Point3[float]
    c: Point3[float]


@dataclasses.dataclass(frozen=True)
class Args(Generic[ShapeType]):
    point_continuous: Point3[float]
    point_discrete: Point3[int]
    shape: ShapeType


if __name__ == "__main__":
    args = dcargs.cli(Args[Triangle])
    print(args)

$ python examples/10_generics.py --help
usage: 10_generics.py [-h] --point-continuous.x FLOAT --point-continuous.y
                      FLOAT --point-continuous.z FLOAT
                      --point-continuous.frame-id STR --point-discrete.x INT
                      --point-discrete.y INT --point-discrete.z INT
                      --point-discrete.frame-id STR --shape.a.x FLOAT
                      --shape.a.y FLOAT --shape.a.z FLOAT --shape.a.frame-id
                      STR --shape.b.x FLOAT --shape.b.y FLOAT --shape.b.z
                      FLOAT --shape.b.frame-id STR --shape.c.x FLOAT
                      --shape.c.y FLOAT --shape.c.z FLOAT --shape.c.frame-id
                      STR

optional arguments:
  -h, --help            show this help message and exit

required point_continuous arguments:
  Point3(*args, **kwds)

  --point-continuous.x FLOAT
  --point-continuous.y FLOAT
  --point-continuous.z FLOAT
  --point-continuous.frame-id STR

required point_discrete arguments:
  Point3(*args, **kwds)

  --point-discrete.x INT
  --point-discrete.y INT
  --point-discrete.z INT
  --point-discrete.frame-id STR

required shape.a arguments:
  Point3(*args, **kwds)

  --shape.a.x FLOAT
  --shape.a.y FLOAT
  --shape.a.z FLOAT
  --shape.a.frame-id STR

required shape.b arguments:
  Point3(*args, **kwds)

  --shape.b.x FLOAT
  --shape.b.y FLOAT
  --shape.b.z FLOAT
  --shape.b.frame-id STR

required shape.c arguments:
  Point3(*args, **kwds)

  --shape.c.x FLOAT
  --shape.c.y FLOAT
  --shape.c.z FLOAT
  --shape.c.frame-id STR
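
The `FLOAT` and `INT` metavars above imply that type parameters are resolved to concrete types before parsing. One way this could be done with the standard `typing` helpers is sketched below; `resolved_field_types` is a hypothetical name, and the approach is an assumption rather than a description of dcargs internals.

```python
import dataclasses
from typing import Dict, Generic, TypeVar, get_args, get_origin

ScalarType = TypeVar("ScalarType")


@dataclasses.dataclass(frozen=True)
class Point3(Generic[ScalarType]):
    x: ScalarType
    y: ScalarType
    z: ScalarType
    frame_id: str


def resolved_field_types(alias) -> Dict[str, type]:
    """Hypothetical helper: map field names to concrete types for an alias like Point3[float]."""
    origin = get_origin(alias)  # the unsubscripted class, e.g. Point3
    # Pair each type parameter with the argument it was subscripted with.
    mapping = dict(zip(origin.__parameters__, get_args(alias)))
    return {
        field.name: mapping.get(field.type, field.type)
        for field in dataclasses.fields(origin)
    }


print(resolved_field_types(Point3[float]))
print(resolved_field_types(Point3[int]))
```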

Serialization

As a secondary feature aimed at enabling the use of dcargs.cli() for general configuration use cases, we also introduce functions for human-readable dataclass serialization:

  • dcargs.from_yaml(cls: Type[T], stream: Union[str, IO[str], bytes, IO[bytes]]) -> T and dcargs.to_yaml(instance: T) -> str convert between YAML-style strings and dataclass instances.

The functions attempt to strike a balance between flexibility and robustness. In contrast to naively dumping or loading dataclass instances (via pickle, PyYAML, etc.), explicit type references enable custom tags that are robust against code reorganization and refactoring, while a PyYAML backend enables serialization of arbitrary Python objects.

Alternative tools

The core functionality of dcargs — generating argument parsers from type annotations — can be found as a subset of the features offered by many other libraries. A summary of some distinguishing features:

Features compared: choices from literals, generics, docstrings as helptext, nesting, subparsers, containers.

Libraries compared: dcargs, datargs, tap, simple-parsing (one feature marked "soon"), argparse-dataclass, argparse-dataclasses, dataclass-cli, clout, hf_argparser, pyrallis.

Note that most of these libraries are aimed specifically at dataclasses rather than general typed callables, but they offer other features that you might find useful, such as registration for custom types (pyrallis), different approaches to serialization and config files (tap, pyrallis), and simultaneous parsing of multiple dataclasses (simple-parsing).
