# dcargs

*Strongly typed, zero-effort CLIs*

## Overview

```bash
pip install dcargs
```

`dcargs` is a library for typed CLI interfaces and configuration objects.
Our core interface generates an argument parser from a type-annotated callable `f`, which may be a function, class, or dataclass:

```python
dcargs.cli(
    f: Callable[..., T],
    *,
    description: Optional[str] = None,
    args: Optional[Sequence[str]] = None,
    default_instance: Optional[T] = None,
    avoid_subparsers: bool = False,
) -> T
```
**Docstring:**
Call `f(...)`, with arguments populated from an automatically generated CLI
interface.
`f` should have type-annotated inputs, and can be a function or class. Note that if
`f` is a class, `dcargs.cli()` returns an instance.
The parser is generated by populating helptext from docstrings and types from
annotations; a broad range of core type annotations are supported...
- Types natively accepted by `argparse`: str, int, float, pathlib.Path, etc.
- Default values for optional parameters.
- Booleans, which are automatically converted to flags when provided a default value.
- Enums (via `enum.Enum`).
- Various annotations from the standard typing library. Some examples:
  - `typing.ClassVar[T]`.
  - `typing.Optional[T]`.
  - `typing.Literal[T]`.
  - `typing.Sequence[T]`.
  - `typing.List[T]`.
  - `typing.Dict[K, V]`.
  - `typing.Tuple`, such as `typing.Tuple[T1, T2, T3]` or `typing.Tuple[T, ...]`.
  - `typing.Set[T]`.
  - `typing.Final[T]` and `typing.Annotated[T]`.
  - Various nested combinations of the above: `Optional[Literal[T]]`, `Final[Optional[Sequence[T]]]`, etc.
- Hierarchical structures via nested dataclasses, TypedDict, NamedTuple, classes.
  - Simple nesting.
  - Unions over nested structures (subparsers).
  - Optional unions over nested structures (optional subparsers).
- Generics (including nested generics).
Args:

- `f`: Callable.

Keyword Args:

- `description`: Description text for the parser, displayed when the `--help` flag is passed in. If not specified, `f`'s docstring is used. Mirrors the argument from `argparse.ArgumentParser()`.
- `args`: If set, parse arguments from a sequence of strings instead of the commandline. Mirrors the argument from `argparse.ArgumentParser.parse_args()`.
- `default_instance`: An instance of `T` to use for default values; only supported if `T` is a dataclass, TypedDict, or NamedTuple. Helpful for merging CLI arguments with values loaded from elsewhere (for example, a config object loaded from a YAML file).
- `avoid_subparsers`: Avoid creating a subparser when defaults are provided for unions over nested types. Generates cleaner but less expressive CLIs.

Returns:

- The output of `f(...)`.
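The `args` keyword makes it easy to exercise a CLI without touching `sys.argv`, for example from tests or notebooks. A minimal sketch, where the `add` function is a hypothetical stand-in and not part of the library:

```python
import dcargs


def add(a: int, b: int = 3) -> int:
    return a + b


# Parse from an explicit list of strings instead of the commandline.
assert dcargs.cli(add, args=["--a", "5"]) == 8
assert dcargs.cli(add, args=["--a", "5", "--b", "7"]) == 12
```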
The goal is a tool that's lightweight enough for simple interactive scripts, but flexible enough to replace heavier configuration frameworks like hydra and ml_collections. Notably, `dcargs.cli()` supports nested classes and dataclasses, which enable expressive hierarchical configuration objects built on standard Python features.
Ultimately, we aim to enable configuration interfaces that are:

- **Low-effort.** Type annotations, docstrings, and default values can be used to automatically generate argument parsers with informative helptext. This includes bells and whistles like enums, containers, etc.
- **Strongly typed.** Unlike the dynamic configuration namespaces produced by libraries like argparse, YACS, abseil, hydra, or ml_collections, typed outputs mean that IDE-assisted autocomplete, rename, refactor, and go-to-definition operations work out-of-the-box, as do static checking tools like mypy and pyright.
- **Modular.** Most approaches to configuration objects require a centralized definition of all configurable fields. Supporting hierarchically nested configuration structures, however, makes it easy to distribute definitions, defaults, and documentation of configurable fields across modules or source files. A model configuration dataclass, for example, can be co-located in its entirety with the model implementation and dropped into any experiment configuration with an import; this eliminates redundancy and makes the entire module easy to port across codebases. (A short sketch of this pattern follows the list.)
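As a minimal sketch of that modular pattern, with hypothetical class and module names:

```python
import dataclasses

import dcargs


# In practice, this dataclass could live in model.py, co-located with the model
# implementation; an experiment script would pull it in with
# `from model import ModelConfig`.
@dataclasses.dataclass(frozen=True)
class ModelConfig:
    num_layers: int = 4
    units: int = 64


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    # Nested field: exposed on the CLI as --model.num-layers and --model.units.
    model: ModelConfig
    seed: int = 0


if __name__ == "__main__":
    print(dcargs.cli(ExperimentConfig))
```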
## Examples

### 1. Functions

In the simplest case, `dcargs.cli()` can be used to run a function with arguments populated from the CLI.

Code:
```python
import dcargs


def main(
    field1: str,
    field2: int = 3,
    flag: bool = False,
) -> None:
    """Function, whose arguments will be populated from a CLI interface.

    Args:
        field1: A string field.
        field2: A numeric field, with a default value.
        flag: A boolean flag.
    """
    print(field1, field2, flag)


if __name__ == "__main__":
    dcargs.cli(main)
```
Example usage:
```
$ python ./01_functions.py --help
usage: 01_functions.py [-h] --field1 STR [--field2 INT] [--flag]

Function, whose arguments will be populated from a CLI interface.

required arguments:
  --field1 STR  A string field.

optional arguments:
  -h, --help    show this help message and exit
  --field2 INT  A numeric field, with a default value. (default: 3)
  --flag        A boolean flag.

$ python ./01_functions.py --field1 hello
hello 3 False

$ python ./01_functions.py --field1 hello --flag
hello 3 True
```
### 2. Dataclasses

Common pattern: use `dcargs.cli()` to instantiate a dataclass.

Code:
```python
import dataclasses

import dcargs


@dataclasses.dataclass
class Args:
    """Description.
    This should show up in the helptext!"""

    field1: str  # A string field.
    field2: int = 3  # A numeric field, with a default value.
    flag: bool = False  # A boolean flag.


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
```
Example usage:
```
$ python ./02_dataclasses.py --help
usage: 02_dataclasses.py [-h] --field1 STR [--field2 INT] [--flag]

Description. This should show up in the helptext!

required arguments:
  --field1 STR  A string field.

optional arguments:
  -h, --help    show this help message and exit
  --field2 INT  A numeric field, with a default value. (default: 3)
  --flag        A boolean flag.

$ python ./02_dataclasses.py --field1 hello
Args(field1='hello', field2=3, flag=False)

$ python ./02_dataclasses.py --field1 hello --flag
Args(field1='hello', field2=3, flag=True)
```
### 3. Enums And Containers

We can generate argument parsers from more advanced type annotations, like enums and tuple types. For collections, we only showcase `Tuple` here, but `List`, `Sequence`, `Set`, `Dict`, etc. are all supported as well (a brief sketch follows the example usage below).

Code:
```python
import dataclasses
import enum
import pathlib
from typing import Optional, Tuple

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class TrainConfig:
    # Example of a variable-length tuple:
    dataset_sources: Tuple[pathlib.Path, ...]
    """Paths to load training data from. This can be multiple!"""

    # Fixed-length tuples are also okay:
    image_dimensions: Tuple[int, int] = (32, 32)
    """Height and width of some image data."""

    # Enums are handled seamlessly.
    optimizer_type: OptimizerType = OptimizerType.ADAM
    """Gradient-based optimizer to use."""

    # We can also explicitly mark arguments as optional.
    checkpoint_interval: Optional[int] = None
    """Interval to save checkpoints at."""


if __name__ == "__main__":
    config = dcargs.cli(TrainConfig)
    print(config)
```
Example usage:
```
$ python ./03_enums_and_containers.py --help
usage: 03_enums_and_containers.py [-h] --dataset-sources PATH [PATH ...]
                                  [--image-dimensions INT INT] [--optimizer-type {ADAM,SGD}]
                                  [--checkpoint-interval (INT | None)]

required arguments:
  --dataset-sources PATH [PATH ...]
                        Paths to load training data from. This can be multiple!

optional arguments:
  -h, --help            show this help message and exit
  --image-dimensions INT INT
                        Height and width of some image data. (default: 32 32)
  --optimizer-type {ADAM,SGD}
                        Gradient-based optimizer to use. (default: ADAM)
  --checkpoint-interval (INT | None)
                        Interval to save checkpoints at. (default: None)

$ python ./03_enums_and_containers.py --dataset-sources ./data --image-dimensions 16 16
TrainConfig(dataset_sources=(PosixPath('data'),), image_dimensions=(16, 16), optimizer_type=<OptimizerType.ADAM: 1>, checkpoint_interval=None)

$ python ./03_enums_and_containers.py --dataset-sources ./data --optimizer-type SGD
TrainConfig(dataset_sources=(PosixPath('data'),), image_dimensions=(32, 32), optimizer_type=<OptimizerType.SGD: 2>, checkpoint_interval=None)
```
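Though only tuples appear above, other containers follow the same pattern. A minimal sketch, where the `summarize` function is a hypothetical stand-in rather than one of the library's examples:

```python
from typing import List, Set

import dcargs


def summarize(values: List[int], tags: Set[str]) -> None:
    # e.g. `--values 1 2 3 --tags a b` would print something like: 6 {'a', 'b'}
    print(sum(values), tags)


if __name__ == "__main__":
    dcargs.cli(summarize)
```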
### 4. Flags

Booleans can either be expected to be explicitly passed in, or, if given a default value, automatically converted to flags.

Code:
```python
import dataclasses
from typing import Optional

import dcargs


@dataclasses.dataclass
class Args:
    # Boolean. This expects an explicit "True" or "False".
    boolean: bool

    # Optional boolean. Same as above, but can be omitted.
    optional_boolean: Optional[bool]

    # Pass --flag-a in to set this value to True.
    flag_a: bool = False

    # Pass --no-flag-b in to set this value to False.
    flag_b: bool = True


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
```
Example usage:
```
$ python ./04_flags.py --boolean True
Args(boolean=True, optional_boolean=None, flag_a=False, flag_b=True)

$ python ./04_flags.py --boolean False --flag-a
Args(boolean=False, optional_boolean=None, flag_a=True, flag_b=True)

$ python ./04_flags.py --boolean False --no-flag-b
Args(boolean=False, optional_boolean=None, flag_a=False, flag_b=False)
```
### 5. Hierarchical Configs

Parsing of nested types (in this case nested dataclasses) enables hierarchical configuration objects that are both modular and highly expressive.

Code:
```python
import dataclasses
import enum
import pathlib

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    # Gradient-based optimizer to use.
    algorithm: OptimizerType = OptimizerType.ADAM

    # Learning rate to use.
    learning_rate: float = 3e-4

    # Coefficient for L2 regularization.
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    # Various configurable options for our optimizer.
    optimizer: OptimizerConfig

    # Batch size.
    batch_size: int = 32

    # Total number of training steps.
    train_steps: int = 100_000

    # Random seed. This is helpful for making sure that our experiments are all
    # reproducible!
    seed: int = 0


def train(
    out_dir: pathlib.Path,
    /,
    config: ExperimentConfig,
    restore_checkpoint: bool = False,
    checkpoint_interval: int = 1000,
) -> None:
    """Train a model.

    Args:
        out_dir: Where to save logs and checkpoints.
        config: Experiment configuration.
        restore_checkpoint: Set to restore an existing checkpoint.
        checkpoint_interval: Training steps between each checkpoint save.
    """
    print(f"{out_dir=}, {restore_checkpoint=}, {checkpoint_interval=}")
    print(f"{config=}")
    print(dcargs.to_yaml(config))


if __name__ == "__main__":
    dcargs.cli(train)
```
Example usage:
```
$ python ./05_hierarchical_configs.py --help
usage: 05_hierarchical_configs.py [-h] [--config.optimizer.algorithm {ADAM,SGD}]
                                  [--config.optimizer.learning-rate FLOAT]
                                  [--config.optimizer.weight-decay FLOAT] [--config.batch-size INT]
                                  [--config.train-steps INT] [--config.seed INT]
                                  [--restore-checkpoint] [--checkpoint-interval INT]
                                  OUT_DIR

Train a model.

positional arguments:
  OUT_DIR               Where to save logs and checkpoints.

optional arguments:
  -h, --help            show this help message and exit
  --restore-checkpoint  Set to restore an existing checkpoint.
  --checkpoint-interval INT
                        Training steps between each checkpoint save. (default: 1000)

optional config.optimizer arguments:
  Various configurable options for our optimizer.

  --config.optimizer.algorithm {ADAM,SGD}
                        Gradient-based optimizer to use. (default: ADAM)
  --config.optimizer.learning-rate FLOAT
                        Learning rate to use. (default: 0.0003)
  --config.optimizer.weight-decay FLOAT
                        Coefficient for L2 regularization. (default: 0.01)

optional config arguments:
  Experiment configuration.

  --config.batch-size INT
                        Batch size. (default: 32)
  --config.train-steps INT
                        Total number of training steps. (default: 100000)
  --config.seed INT     Random seed. This is helpful for making sure that our experiments are all
                        reproducible! (default: 0)

$ python ./05_hierarchical_configs.py . --config.optimizer.algorithm SGD
out_dir=PosixPath('.'), restore_checkpoint=False, checkpoint_interval=1000
config=ExperimentConfig(optimizer=OptimizerConfig(algorithm=<OptimizerType.SGD: 2>, learning_rate=0.0003, weight_decay=0.01), batch_size=32, train_steps=100000, seed=0)
# dcargs YAML.
!dataclass:ExperimentConfig
batch_size: 32
optimizer: !dataclass:OptimizerConfig
  algorithm: !enum:OptimizerType 'SGD'
  learning_rate: 0.0003
  weight_decay: 0.01
seed: 0
train_steps: 100000

$ python ./05_hierarchical_configs.py . --restore-checkpoint
out_dir=PosixPath('.'), restore_checkpoint=True, checkpoint_interval=1000
config=ExperimentConfig(optimizer=OptimizerConfig(algorithm=<OptimizerType.ADAM: 1>, learning_rate=0.0003, weight_decay=0.01), batch_size=32, train_steps=100000, seed=0)
# dcargs YAML.
!dataclass:ExperimentConfig
batch_size: 32
optimizer: !dataclass:OptimizerConfig
  algorithm: !enum:OptimizerType 'ADAM'
  learning_rate: 0.0003
  weight_decay: 0.01
seed: 0
train_steps: 100000
```
### 6. Base Configs

We can integrate `dcargs.cli()` into common configuration patterns: here, we select one of multiple possible base configurations, and then use the CLI to either override (existing) or fill in (missing) values.

Code:
```python
import dataclasses
import os
from typing import Literal, Tuple, Union

import dcargs


@dataclasses.dataclass
class AdamOptimizer:
    # Adam learning rate.
    learning_rate: float = 1e-3

    # Moving average parameters.
    betas: Tuple[float, float] = (0.9, 0.999)


@dataclasses.dataclass
class SgdOptimizer:
    # SGD learning rate.
    learning_rate: float = 3e-4


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    # Dataset to run experiment on.
    dataset: Literal["mnist", "imagenet-50"]

    # Optimizer parameters.
    optimizer: Union[AdamOptimizer, SgdOptimizer]

    # Model size.
    num_layers: int
    units: int

    # Batch size.
    batch_size: int

    # Total number of training steps.
    train_steps: int

    # Random seed. This is helpful for making sure that our experiments are all
    # reproducible!
    seed: int


# Note that we could also define this library using separate YAML files (similar to
# `config_path`/`config_name` in Hydra), but staying in Python enables seamless type
# checking + IDE support.
base_config_library = {
    "small": ExperimentConfig(
        dataset="mnist",
        optimizer=SgdOptimizer(),
        batch_size=2048,
        num_layers=4,
        units=64,
        train_steps=30_000,
        # The dcargs.MISSING sentinel allows us to specify that the seed should have no
        # default, and needs to be populated from the CLI.
        seed=dcargs.MISSING,
    ),
    "big": ExperimentConfig(
        dataset="imagenet-50",
        optimizer=AdamOptimizer(),
        batch_size=32,
        num_layers=8,
        units=256,
        train_steps=100_000,
        seed=dcargs.MISSING,
    ),
}

if __name__ == "__main__":
    # Get base configuration name from environment.
    base_config_name = os.environ.get("BASE_CONFIG")
    if base_config_name is None or base_config_name not in base_config_library:
        raise SystemExit(
            f"BASE_CONFIG should be set to one of {tuple(base_config_library.keys())}"
        )

    # Get base configuration from our library, and use it for default CLI parameters.
    base_config = base_config_library[base_config_name]
    config = dcargs.cli(
        ExperimentConfig,
        default_instance=base_config,
        # `avoid_subparsers` will avoid making a subparser for unions when a default is
        # provided; in this case, it simplifies our CLI but makes it less expressive
        # (cannot switch away from the base optimizer types).
        avoid_subparsers=True,
    )
    print(config)
```
Example usage:
```
$ BASE_CONFIG=small python ./06_base_configs.py --help
usage: 06_base_configs.py [-h] [--dataset {mnist,imagenet-50}] [--optimizer.learning-rate FLOAT]
                          [--num-layers INT] [--units INT] [--batch-size INT] [--train-steps INT]
                          --seed INT

required arguments:
  --seed INT            Random seed. This is helpful for making sure that our experiments are all
                        reproducible!

optional arguments:
  -h, --help            show this help message and exit
  --dataset {mnist,imagenet-50}
                        Dataset to run experiment on. (default: mnist)
  --num-layers INT      Model size. (default: 4)
  --units INT           Model size. (default: 64)
  --batch-size INT      Batch size. (default: 2048)
  --train-steps INT     Total number of training steps. (default: 30000)

optional optimizer arguments:
  Optimizer parameters.

  --optimizer.learning-rate FLOAT
                        SGD learning rate. (default: 0.0003)

$ BASE_CONFIG=small python ./06_base_configs.py --seed 94720
ExperimentConfig(dataset='mnist', optimizer=SgdOptimizer(learning_rate=0.0003), num_layers=4, units=64, batch_size=2048, train_steps=30000, seed=94720)

$ BASE_CONFIG=big python ./06_base_configs.py --help
usage: 06_base_configs.py [-h] [--dataset {mnist,imagenet-50}] [--optimizer.learning-rate FLOAT]
                          [--optimizer.betas FLOAT FLOAT] [--num-layers INT] [--units INT]
                          [--batch-size INT] [--train-steps INT] --seed INT

required arguments:
  --seed INT            Random seed. This is helpful for making sure that our experiments are all
                        reproducible!

optional arguments:
  -h, --help            show this help message and exit
  --dataset {mnist,imagenet-50}
                        Dataset to run experiment on. (default: imagenet-50)
  --num-layers INT      Model size. (default: 8)
  --units INT           Model size. (default: 256)
  --batch-size INT      Batch size. (default: 32)
  --train-steps INT     Total number of training steps. (default: 100000)

optional optimizer arguments:
  Optimizer parameters.

  --optimizer.learning-rate FLOAT
                        Adam learning rate. (default: 0.001)
  --optimizer.betas FLOAT FLOAT
                        Moving average parameters. (default: 0.9 0.999)

$ BASE_CONFIG=big python ./06_base_configs.py --seed 94720
ExperimentConfig(dataset='imagenet-50', optimizer=AdamOptimizer(learning_rate=0.001, betas=(0.9, 0.999)), num_layers=8, units=256, batch_size=32, train_steps=100000, seed=94720)
```
### 7. Literals

`typing.Literal[]` can be used to restrict inputs to a fixed set of choices.

Code:
```python
import dataclasses
import enum
from typing import Literal

import dcargs


class Color(enum.Enum):
    RED = enum.auto()
    GREEN = enum.auto()
    BLUE = enum.auto()


@dataclasses.dataclass(frozen=True)
class Args:
    enum: Color
    restricted_enum: Literal[Color.RED, Color.GREEN]

    integer: Literal[0, 1, 2, 3]
    string: Literal["red", "green"]

    restricted_enum_with_default: Literal[Color.RED, Color.GREEN] = Color.GREEN
    integer_with_default: Literal[0, 1, 2, 3] = 3
    string_with_Default: Literal["red", "green"] = "red"


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
```
Example usage:
```
$ python ./07_literals.py --help
usage: 07_literals.py [-h] --enum {RED,GREEN,BLUE} --restricted-enum {RED,GREEN}
                      --integer {0,1,2,3} --string {red,green}
                      [--restricted-enum-with-default {RED,GREEN}]
                      [--integer-with-default {0,1,2,3}] [--string-with-Default {red,green}]

required arguments:
  --enum {RED,GREEN,BLUE}
  --restricted-enum {RED,GREEN}
  --integer {0,1,2,3}
  --string {red,green}

optional arguments:
  -h, --help            show this help message and exit
  --restricted-enum-with-default {RED,GREEN}
                        (default: GREEN)
  --integer-with-default {0,1,2,3}
                        (default: 3)
  --string-with-Default {red,green}
                        (default: red)

$ python ./07_literals.py --enum RED --restricted-enum GREEN --integer 3 --string green
Args(enum=<Color.RED: 1>, restricted_enum=<Color.GREEN: 2>, integer=3, string='green', restricted_enum_with_default=<Color.GREEN: 2>, integer_with_default=3, string_with_Default='red')
```
### 8. Positional Args

Positional-only arguments in functions are converted to positional CLI arguments.

Code:
```python
from __future__ import annotations

import dataclasses
import enum
import pathlib
from typing import Tuple

import dcargs


def main(
    source: pathlib.Path,
    dest: pathlib.Path,
    /,  # Mark the end of positional arguments.
    optimizer: OptimizerConfig,
    force: bool = False,
    verbose: bool = False,
    background_rgb: Tuple[float, float, float] = (1.0, 0.0, 0.0),
) -> None:
    """Command-line interface defined using a function signature. Note that this
    docstring is parsed to generate helptext.

    Args:
        source: Source path.
        dest: Destination path.
        optimizer: Configuration for our optimizer object.
        force: Do not prompt before overwriting.
        verbose: Explain what is being done.
        background_rgb: Background color. Red by default.
    """
    print(f"{source=}\n{dest=}\n{optimizer=}\n{force=}\n{verbose=}\n{background_rgb=}")


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    algorithm: OptimizerType = OptimizerType.ADAM
    """Gradient-based optimizer to use."""

    learning_rate: float = 3e-4
    """Learning rate to use."""

    weight_decay: float = 1e-2
    """Coefficient for L2 regularization."""


if __name__ == "__main__":
    dcargs.cli(main)
```
Example usage:
```
$ python ./08_positional_args.py --help
usage: 08_positional_args.py [-h] [--optimizer.algorithm {ADAM,SGD}]
                             [--optimizer.learning-rate FLOAT] [--optimizer.weight-decay FLOAT]
                             [--force] [--verbose] [--background-rgb FLOAT FLOAT FLOAT]
                             SOURCE DEST

Command-line interface defined using a function signature. Note that this docstring is parsed to
generate helptext.

positional arguments:
  SOURCE                Source path.
  DEST                  Destination path.

optional arguments:
  -h, --help            show this help message and exit
  --force               Do not prompt before overwriting.
  --verbose             Explain what is being done.
  --background-rgb FLOAT FLOAT FLOAT
                        Background color. Red by default. (default: 1.0 0.0 0.0)

optional optimizer arguments:
  Configuration for our optimizer object.

  --optimizer.algorithm {ADAM,SGD}
                        Gradient-based optimizer to use. (default: ADAM)
  --optimizer.learning-rate FLOAT
                        Learning rate to use. (default: 0.0003)
  --optimizer.weight-decay FLOAT
                        Coefficient for L2 regularization. (default: 0.01)

$ python ./08_positional_args.py ./a ./b --optimizer.learning-rate 1e-5
source=PosixPath('a')
dest=PosixPath('b')
optimizer=OptimizerConfig(algorithm=<OptimizerType.ADAM: 1>, learning_rate=1e-05, weight_decay=0.01)
force=False
verbose=False
background_rgb=(1.0, 0.0, 0.0)
```
### 9. Subparsers

Unions over nested types (classes or dataclasses) are populated using subparsers.

Code:
```python
from __future__ import annotations

import dataclasses
from typing import Union

import dcargs


@dataclasses.dataclass(frozen=True)
class Checkout:
    """Checkout a branch."""

    branch: str


@dataclasses.dataclass(frozen=True)
class Commit:
    """Commit changes."""

    message: str
    all: bool = False


def main(cmd: Union[Checkout, Commit] = Checkout("main")) -> None:
    print(cmd)


if __name__ == "__main__":
    dcargs.cli(main)
```
Example usage:
```
$ python ./09_subparsers.py --help
usage: 09_subparsers.py [-h] [{checkout,commit}] ...

optional arguments:
  -h, --help         show this help message and exit

optional subcommands:
  (default: checkout)

  [{checkout,commit}]

$ python ./09_subparsers.py commit --help
usage: 09_subparsers.py commit [-h] --cmd.message STR [--cmd.all]

Commit changes.

optional arguments:
  -h, --help         show this help message and exit

required cmd arguments:
  --cmd.message STR

optional cmd arguments:
  --cmd.all

$ python ./09_subparsers.py commit --cmd.message hello --cmd.all
Commit(message='hello', all=True)

$ python ./09_subparsers.py checkout --help
usage: 09_subparsers.py checkout [-h] [--cmd.branch STR]

Checkout a branch.

optional arguments:
  -h, --help         show this help message and exit

optional cmd arguments:
  --cmd.branch STR   (default: main)

$ python ./09_subparsers.py checkout --cmd.branch main
Checkout(branch='main')
```
### 10. Multiple Subparsers

Multiple unions over nested types are populated using a series of subparsers.

Code:
```python
from __future__ import annotations

import dataclasses
from typing import Literal, Tuple, Union

import dcargs


# Possible dataset configurations.


@dataclasses.dataclass
class MnistDataset:
    binary: bool = False
    """Set to load binary version of MNIST dataset."""


@dataclasses.dataclass
class ImageNetDataset:
    subset: Literal[50, 100, 1000]
    """Choose between ImageNet-50, ImageNet-100, ImageNet-1000, etc."""


# Possible optimizer configurations.


@dataclasses.dataclass
class AdamOptimizer:
    learning_rate: float = 1e-3
    betas: Tuple[float, float] = (0.9, 0.999)


@dataclasses.dataclass
class SgdOptimizer:
    learning_rate: float = 3e-4


# Train script.


def train(
    dataset: Union[MnistDataset, ImageNetDataset] = MnistDataset(),
    optimizer: Union[AdamOptimizer, SgdOptimizer] = AdamOptimizer(),
) -> None:
    """Example training script.

    Args:
        dataset: Dataset to train on.
        optimizer: Optimizer to train with.
    """
    print(dataset)
    print(optimizer)


if __name__ == "__main__":
    dcargs.cli(train)
```
Example usage:
```
$ python ./10_multiple_subparsers.py
MnistDataset(binary=False)
AdamOptimizer(learning_rate=0.001, betas=(0.9, 0.999))

$ python ./10_multiple_subparsers.py --help
usage: 10_multiple_subparsers.py [-h] [{mnist-dataset,image-net-dataset}] ...

Example training script.

optional arguments:
  -h, --help            show this help message and exit

optional subcommands:
  Dataset to train on. (default: mnist-dataset)

  [{mnist-dataset,image-net-dataset}]

$ python ./10_multiple_subparsers.py mnist-dataset --help
usage: 10_multiple_subparsers.py mnist-dataset [-h] [--dataset.binary]
                                               [{adam-optimizer,sgd-optimizer}] ...

optional arguments:
  -h, --help            show this help message and exit

optional dataset arguments:
  --dataset.binary      Set to load binary version of MNIST dataset.

optional subcommands:
  Optimizer to train with. (default: adam-optimizer)

  [{adam-optimizer,sgd-optimizer}]

$ python ./10_multiple_subparsers.py mnist-dataset adam-optimizer --optimizer.learning-rate 3e-4
MnistDataset(binary=False)
AdamOptimizer(learning_rate=0.0003, betas=(0.9, 0.999))
```
### 11. Dictionaries

Dictionary inputs can be specified using either a standard `Dict[K, V]` annotation, or a `TypedDict` type. Note that setting `total=False` for `TypedDict` is currently not (but reasonably could be) supported.

Code:
```python
from typing import Dict, TypedDict

import dcargs


class DictionarySchema(TypedDict):
    field1: str  # A string field.
    field2: int  # A numeric field.
    field3: bool  # A boolean field.


def main(
    standard_dict: Dict[str, bool],
    typed_dict: DictionarySchema = {
        "field1": "hey",
        "field2": 3,
        "field3": False,
    },
) -> None:
    assert isinstance(standard_dict, dict)
    assert isinstance(typed_dict, dict)
    print("Standard dict:", standard_dict)
    print("Typed dict:", typed_dict)


if __name__ == "__main__":
    dcargs.cli(main)
```
Example usage:
```
$ python ./11_dictionaries.py --help
usage: 11_dictionaries.py [-h] --standard-dict STR {True,False} [STR {True,False} ...]
                          [--typed-dict.field1 STR] [--typed-dict.field2 INT]
                          [--typed-dict.field3]

required arguments:
  --standard-dict STR {True,False} [STR {True,False} ...]

optional arguments:
  -h, --help            show this help message and exit

optional typed_dict arguments:
  --typed-dict.field1 STR
                        A string field. (default: hey)
  --typed-dict.field2 INT
                        A numeric field. (default: 3)
  --typed-dict.field3   A boolean field.

$ python ./11_dictionaries.py --standard-dict key1 True key2 False
Standard dict: {'key1': True, 'key2': False}
Typed dict: {'field1': 'hey', 'field2': 3, 'field3': False}
```
### 12. Named Tuples

Example using `dcargs.cli()` to instantiate a named tuple.

Code:
```python
from typing import NamedTuple

import dcargs


class TupleType(NamedTuple):
    """Description.
    This should show up in the helptext!"""

    field1: str  # A string field.
    field2: int = 3  # A numeric field, with a default value.
    flag: bool = False  # A boolean flag.


if __name__ == "__main__":
    x = dcargs.cli(TupleType)
    assert isinstance(x, tuple)
    print(x)
```
Example usage:
```
$ python ./12_named_tuples.py --help
usage: 12_named_tuples.py [-h] --field1 STR [--field2 INT] [--flag]

Description. This should show up in the helptext!

required arguments:
  --field1 STR  A string field.

optional arguments:
  -h, --help    show this help message and exit
  --field2 INT  A numeric field, with a default value. (default: 3)
  --flag        A boolean flag.

$ python ./12_named_tuples.py --field1 hello
TupleType(field1='hello', field2=3, flag=False)
```
### 13. Standard Classes

In addition to functions and dataclasses, we can also generate CLIs from (the constructors of) standard Python classes.

Code:
```python
import dcargs


class Args:
    def __init__(
        self,
        field1: str,
        field2: int,
        flag: bool = False,
    ):
        """Arguments.

        Args:
            field1: A string field.
            field2: A numeric field.
            flag: A boolean flag.
        """
        self.data = [field1, field2, flag]


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args.data)
```
Example usage:
```
$ python ./13_standard_classes.py --help
usage: 13_standard_classes.py [-h] --field1 STR --field2 INT [--flag]

Arguments.

required arguments:
  --field1 STR  A string field.
  --field2 INT  A numeric field.

optional arguments:
  -h, --help    show this help message and exit
  --flag        A boolean flag.

$ python ./13_standard_classes.py --field1 hello --field2 7
['hello', 7, False]
```
### 14. Generics

Example of parsing for generic dataclasses.

Code:
```python
import dataclasses
from typing import Generic, TypeVar

import dcargs

ScalarType = TypeVar("ScalarType")
ShapeType = TypeVar("ShapeType")


@dataclasses.dataclass(frozen=True)
class Point3(Generic[ScalarType]):
    x: ScalarType
    y: ScalarType
    z: ScalarType
    frame_id: str


@dataclasses.dataclass(frozen=True)
class Triangle:
    a: Point3[float]
    b: Point3[float]
    c: Point3[float]


@dataclasses.dataclass(frozen=True)
class Args(Generic[ShapeType]):
    point_continuous: Point3[float]
    point_discrete: Point3[int]
    shape: ShapeType


if __name__ == "__main__":
    args = dcargs.cli(Args[Triangle])
    print(args)
```
Example usage:
```
$ python ./14_generics.py --help
usage: 14_generics.py [-h] --point-continuous.x FLOAT --point-continuous.y FLOAT
                      --point-continuous.z FLOAT --point-continuous.frame-id STR
                      --point-discrete.x INT --point-discrete.y INT --point-discrete.z INT
                      --point-discrete.frame-id STR --shape.a.x FLOAT --shape.a.y FLOAT
                      --shape.a.z FLOAT --shape.a.frame-id STR --shape.b.x FLOAT
                      --shape.b.y FLOAT --shape.b.z FLOAT --shape.b.frame-id STR
                      --shape.c.x FLOAT --shape.c.y FLOAT --shape.c.z FLOAT
                      --shape.c.frame-id STR

optional arguments:
  -h, --help            show this help message and exit

required point_continuous arguments:
  --point-continuous.x FLOAT
  --point-continuous.y FLOAT
  --point-continuous.z FLOAT
  --point-continuous.frame-id STR

required point_discrete arguments:
  --point-discrete.x INT
  --point-discrete.y INT
  --point-discrete.z INT
  --point-discrete.frame-id STR

required shape.a arguments:
  --shape.a.x FLOAT
  --shape.a.y FLOAT
  --shape.a.z FLOAT
  --shape.a.frame-id STR

required shape.b arguments:
  --shape.b.x FLOAT
  --shape.b.y FLOAT
  --shape.b.z FLOAT
  --shape.b.frame-id STR

required shape.c arguments:
  --shape.c.x FLOAT
  --shape.c.y FLOAT
  --shape.c.z FLOAT
  --shape.c.frame-id STR
```
## Serialization

As a secondary feature aimed at enabling the use of `dcargs.cli()` for general configuration use cases, we also introduce functions for human-readable dataclass serialization: `dcargs.from_yaml(cls: Type[T], stream: Union[str, IO[str], bytes, IO[bytes]]) -> T` and `dcargs.to_yaml(instance: T) -> str` convert between YAML-style strings and dataclass instances.

The functions attempt to strike a balance between flexibility and robustness: in contrast to naively dumping or loading dataclass instances (via pickle, PyYAML, etc.), explicit type references enable custom tags that are robust against code reorganization and refactoring, while a PyYAML backend enables serialization of arbitrary Python objects.
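A minimal round-trip sketch, where the `Point` dataclass is a hypothetical example:

```python
import dataclasses

import dcargs


@dataclasses.dataclass
class Point:
    x: float = 0.0
    y: float = 0.0


point = Point(x=1.0, y=2.0)

# Serialize to a YAML-style string, then recover an equal instance.
serialized = dcargs.to_yaml(point)
assert dcargs.from_yaml(Point, serialized) == point
```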
## Alternative tools

The core functionality of `dcargs` (generating argument parsers from type annotations) can be found as a subset of the features offered by many other libraries. A summary of some distinguishing features:
|                      | Choices from literals | Generics | Docstrings as helptext | Nesting | Subparsers | Containers |
| -------------------- | :-------------------: | :------: | :--------------------: | :-----: | :--------: | :--------: |
| dcargs               |           ✓           |    ✓     |           ✓            |    ✓    |     ✓      |     ✓      |
| datargs              |           ✓           |          |                        |         |     ✓      |     ✓      |
| tap                  |           ✓           |          |           ✓            |         |     ✓      |     ✓      |
| simple-parsing       |         soon          |          |           ✓            |    ✓    |     ✓      |     ✓      |
| argparse-dataclass   |                       |          |                        |         |            |            |
| argparse-dataclasses |                       |          |                        |         |            |            |
| dataclass-cli        |                       |          |                        |         |            |            |
| clout                |                       |          |                        |    ✓    |            |            |
| hf_argparser         |                       |          |                        |         |            |     ✓      |
| pyrallis             |                       |          |           ✓            |    ✓    |            |     ✓      |
Note that most of these other libraries are generally aimed specifically at dataclasses rather than general typed callables, but offer other features that you might find useful, such as registration for custom types (pyrallis), different approaches for serialization and config files (tap, pyrallis), simultaneous parsing of multiple dataclasses (simple-parsing), etc.