Strongly typed, zero-effort CLIs
dcargs
Overview
pip install dcargs
dcargs is a library for typed command-line interfaces and configuration objects.
Our core interface generates an argument parser from type-annotated callables, which may be functions, classes, or dataclasses:
dcargs.cli(
    f: Callable[..., T],
    *,
    description: Optional[str] = None,
    args: Optional[Sequence[str]] = None,
    default_instance: Optional[T] = None,
) -> T
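To make the idea of generating a parser from an annotated callable concrete, here is a minimal standard-library sketch. It is not how dcargs is implemented: the helper name `cli_sketch` is hypothetical, and it only handles parameters whose annotation can be called on a string (`str`, `int`, `float`).

```python
import argparse
import inspect
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def cli_sketch(f: Callable[..., T], args: Optional[Sequence[str]] = None) -> T:
    """Hypothetical helper: build an argparse parser from f's signature, then call f."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(f))
    for name, param in inspect.signature(f).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        parser.add_argument(
            f"--{name.replace('_', '-')}",
            dest=name,
            type=param.annotation,  # Assumes the annotation is callable on a string.
            required=not has_default,
            default=param.default if has_default else None,
        )
    # With args=None, argparse falls back to sys.argv[1:].
    namespace = parser.parse_args(args)
    return f(**vars(namespace))


def scale(value: float, factor: int = 2) -> float:
    """Multiply a value by an integer factor."""
    return value * factor
```

Calling `cli_sketch(scale, ["--value", "1.5"])` parses the list in place of the real command line, which mirrors the role of the `args` parameter in the signature above.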
The goal is a tool that's lightweight enough for simple interactive scripts, but
flexible enough to replace heavier configuration frameworks. Notably,
dcargs.cli() supports nested classes and dataclasses, which enable
expressive hierarchical configuration objects built on standard Python features.
The ultimate goal is the ability to build interfaces that are:
- Low-effort. Type annotations, docstrings, and default values can be used to automatically generate argument parsers with informative helptext. This includes bells and whistles like enums, containers, etc.
- Strongly typed. Unlike dynamic configuration namespaces produced by libraries like argparse, YACS, abseil, hydra, or ml_collections, typed outputs mean that IDE-assisted autocomplete, rename, refactor, and go-to-definition operations work out-of-the-box, as do static checking tools like mypy and pyright.
- Modular. Most approaches to configuration objects require a centralized definition of all configurable fields. Supporting hierarchically nested configuration classes/dataclasses, however, makes it easy to distribute definitions, defaults, and documentation of configurable fields across modules or source files. A model configuration dataclass, for example, can be co-located in its entirety with the model implementation and dropped into any experiment configuration with an import. This eliminates redundancy and makes the entire module easy to port across codebases.
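As a sketch of the modular pattern in the last bullet, a model's configuration can be defined next to the model and nested into an experiment configuration. The class and field names here (`ModelConfig`, `ExperimentConfig`, `hidden_units`, `dropout`) are hypothetical, and both classes appear in one block only for brevity.

```python
import dataclasses


# Would live in the model's own module, next to its implementation.
@dataclasses.dataclass(frozen=True)
class ModelConfig:
    hidden_units: int = 128
    dropout: float = 0.1


# Would live with the experiment code and simply import ModelConfig.
@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    model: ModelConfig = dataclasses.field(default_factory=ModelConfig)
    batch_size: int = 32


config = ExperimentConfig()
```

Because `config.model.dropout` is an ordinary typed attribute, editors and static checkers can follow it without any knowledge of the CLI layer.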
Examples
A series of example scripts covering core features are included below.
1. Simple Functions
examples/01_simple_functions.py
"""CLI generation example from a simple annotated function. `dcargs.cli()` will call
`main()`, with arguments populated from the CLI."""

import dcargs


def main(
    field1: str,
    field2: int = 3,
    flag: bool = False,
) -> None:
    """Function, whose arguments will be populated from a CLI interface.

    Args:
        field1: A string field.
        field2: A numeric field, with a default value.
        flag: A boolean flag.
    """
    print(field1, field2, flag)


if __name__ == "__main__":
    dcargs.cli(main)

$ python examples/01_simple_functions.py --help
2. Simple Dataclasses
examples/02_simple_dataclasses.py
"""Example using dcargs.cli() to instantiate a dataclass."""

import dataclasses

import dcargs


@dataclasses.dataclass
class Args:
    """Description.
    This should show up in the helptext!"""

    field1: str  # A string field.
    field2: int = 3  # A numeric field, with a default value.
    flag: bool = False  # A boolean flag.


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
    print()
    print(dcargs.to_yaml(args))

$ python examples/02_simple_dataclasses.py --help
3. Enums And Containers
examples/03_enums_and_containers.py
"""Examples of more advanced type annotations: enums and container types.
For collections, we only showcase Tuple here, but List, Sequence, Set, etc. are all
supported as well."""

import dataclasses
import enum
import pathlib
from typing import Optional, Tuple

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class TrainConfig:
    # Example of a variable-length tuple:
    dataset_sources: Tuple[pathlib.Path, ...]
    """Paths to load training data from. This can be multiple!"""

    # Fixed-length tuples are also okay:
    image_dimensions: Tuple[int, int]
    """Height and width of some image data."""

    # Enums are handled seamlessly.
    optimizer_type: OptimizerType
    """Gradient-based optimizer to use."""

    # We can also explicitly mark arguments as optional.
    checkpoint_interval: Optional[int]
    """Interval to save checkpoints at."""


if __name__ == "__main__":
    config = dcargs.cli(TrainConfig)
    print(config)

$ python examples/03_enums_and_containers.py --help
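For comparison, a rough hand-written argparse counterpart of the `TrainConfig` fields is sketched below. The mapping it uses (`nargs` for tuples, enum member names on the command line) is an assumption made for illustration rather than a description of dcargs internals, and `build_parser` and `parse_optimizer` are hypothetical names.

```python
import argparse
import enum
import pathlib


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


def parse_optimizer(name: str) -> OptimizerType:
    """Convert an enum member name such as "SGD" into the enum member."""
    try:
        return OptimizerType[name]
    except KeyError:
        raise argparse.ArgumentTypeError(f"invalid optimizer: {name!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Tuple[pathlib.Path, ...]: one or more values (argparse returns a list).
    parser.add_argument("--dataset-sources", type=pathlib.Path, nargs="+", required=True)
    # Tuple[int, int]: exactly two values.
    parser.add_argument("--image-dimensions", type=int, nargs=2, required=True)
    # Enum: accept member names and convert them to members.
    parser.add_argument("--optimizer-type", type=parse_optimizer, required=True)
    # Optional[int]: may be omitted, in which case it stays None.
    parser.add_argument("--checkpoint-interval", type=int, default=None)
    return parser
```

Writing this by hand means keeping the parser and the dataclass in sync manually, which is the duplication that a type-driven generator avoids.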
4. Flags
"""Example of how booleans are handled and automatically converted to flags."""

import dataclasses
from typing import Optional

import dcargs


@dataclasses.dataclass
class Args:
    # Boolean. This expects an explicit "True" or "False".
    boolean: bool

    # Optional boolean. Same as above, but can be omitted.
    optional_boolean: Optional[bool]

    # Pass --flag-a in to set this value to True.
    flag_a: bool = False

    # Pass --no-flag-b in to set this value to False.
    flag_b: bool = True


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
    print()
    print(dcargs.to_yaml(args))

$ python examples/04_flags.py --help
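The flag behavior described in the comments of the example above can be approximated in plain argparse with `store_true` and `store_false` actions. This is a standard-library illustration of the convention, not dcargs's implementation.

```python
import argparse

parser = argparse.ArgumentParser()
# flag_a defaults to False; passing --flag-a sets it to True.
parser.add_argument("--flag-a", dest="flag_a", action="store_true")
# flag_b defaults to True; passing --no-flag-b sets it to False.
parser.add_argument("--no-flag-b", dest="flag_b", action="store_false")

defaults = parser.parse_args([])
flipped = parser.parse_args(["--flag-a", "--no-flag-b"])
```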
5. Hierarchical Configs
examples/05_hierarchical_configs.py
"""An example of how we can create hierarchical configuration interfaces by nesting
dataclasses."""

import dataclasses
import enum
import pathlib

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    # Gradient-based optimizer to use.
    algorithm: OptimizerType = OptimizerType.ADAM

    # Learning rate to use.
    learning_rate: float = 3e-4

    # Coefficient for L2 regularization.
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    # Various configurable options for our optimizer.
    optimizer: OptimizerConfig

    # Batch size.
    batch_size: int = 32

    # Total number of training steps.
    train_steps: int = 100_000

    # Random seed. This is helpful for making sure that our experiments are all
    # reproducible!
    seed: int = 0


def train(
    out_dir: pathlib.Path,
    /,
    config: ExperimentConfig,
    restore_checkpoint: bool = False,
    checkpoint_interval: int = 1000,
) -> None:
    """Train a model.

    Args:
        out_dir: Where to save logs and checkpoints.
        config: Experiment configuration.
        restore_checkpoint: Set to restore an existing checkpoint.
        checkpoint_interval: Training steps between each checkpoint save.
    """
    print(out_dir)
    print("---")
    print(dcargs.to_yaml(config))
    print("---")
    print(restore_checkpoint)
    print(checkpoint_interval)


if __name__ == "__main__":
    dcargs.cli(train)

$ python examples/05_hierarchical_configs.py --help
6. Literals
"""typing.Literal[] can be used to specify accepted input choices."""

import dataclasses
import enum
from typing import Literal

import dcargs


class Color(enum.Enum):
    RED = enum.auto()
    GREEN = enum.auto()
    BLUE = enum.auto()


@dataclasses.dataclass(frozen=True)
class Args:
    enum: Color
    restricted_enum: Literal[Color.RED, Color.GREEN]
    integer: Literal[0, 1, 2, 3]
    string: Literal["red", "green"]
    restricted_enum_with_default: Literal[Color.RED, Color.GREEN] = Color.GREEN
    integer_with_default: Literal[0, 1, 2, 3] = 3
    string_with_default: Literal["red", "green"] = "red"


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
    print()
    print(dcargs.to_yaml(args))

$ python examples/06_literals.py --help
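One way to see how a `Literal` annotation can become a set of accepted choices is `typing.get_args()`, which recovers the allowed values at runtime. The sketch below feeds them to argparse's `choices`; the aliases `Channel` and `Level` are hypothetical, and this is an illustration rather than dcargs's own code path.

```python
import argparse
from typing import Literal, get_args

Channel = Literal["red", "green"]
Level = Literal[0, 1, 2, 3]

parser = argparse.ArgumentParser()
# get_args() returns the values listed inside Literal[...] as a tuple.
parser.add_argument("--string", type=str, choices=get_args(Channel), required=True)
parser.add_argument("--integer", type=int, choices=get_args(Level), default=3)

ns = parser.parse_args(["--string", "green"])
```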
7. Positional Args
examples/07_positional_args.py
"""Positional-only arguments in functions are converted to positional CLI arguments."""

from __future__ import annotations

import dataclasses
import enum
import pathlib
from typing import Tuple

import dcargs


def main(
    source: pathlib.Path,
    dest: pathlib.Path,
    /,  # Mark the end of positional arguments.
    optimizer: OptimizerConfig,
    force: bool = False,
    verbose: bool = False,
    background_rgb: Tuple[float, float, float] = (1.0, 0.0, 0.0),
) -> None:
    """Command-line interface defined using a function signature. Note that this
    docstring is parsed to generate helptext.

    Args:
        optimizer: Configuration for our optimizer object.
        force: Do not prompt before overwriting.
        verbose: Explain what is being done.
        background_rgb: Background color. Red by default.
    """
    print(
        f"{source.absolute()=}"
        "\n"
        f"{dest.absolute()=}"
        "\n"
        f"{optimizer=}"
        "\n"
        f"{force=}"
        "\n"
        f"{verbose=}"
        "\n"
        f"{background_rgb=}"
    )


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    algorithm: OptimizerType = OptimizerType.ADAM
    """Gradient-based optimizer to use."""

    learning_rate: float = 3e-4
    """Learning rate to use."""

    weight_decay: float = 1e-2
    """Coefficient for L2 regularization."""


if __name__ == "__main__":
    dcargs.cli(main)

$ python examples/07_positional_args.py --help
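The `/` marker is visible at runtime through `inspect`, which is one way a tool could tell positional CLI arguments from keyword options. The helper `split_parameters` below is a hypothetical illustration using only the standard library.

```python
import inspect
import pathlib
from typing import Callable, List, Tuple


def main(source: pathlib.Path, dest: pathlib.Path, /, force: bool = False) -> None:
    """Stand-in for a CLI entry point; only its signature matters here."""


def split_parameters(f: Callable) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: split parameter names into positional-only and the rest."""
    positional, keyword = [], []
    for name, param in inspect.signature(f).parameters.items():
        if param.kind is inspect.Parameter.POSITIONAL_ONLY:
            positional.append(name)
        else:
            keyword.append(name)
    return positional, keyword
```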
8. Standard Classes
examples/08_standard_classes.py
"""In addition to functions and dataclasses, we can also generate CLIs from (the
constructors of) standard Python classes."""

import dcargs


class Args:
    def __init__(
        self,
        field1: str,
        field2: int,
        flag: bool = False,
    ):
        """Arguments.

        Args:
            field1: A string field.
            field2: A numeric field.
            flag: A boolean flag.
        """
        self.data = [field1, field2, flag]


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args.data)

$ python examples/08_standard_classes.py --help
9. Subparsers
"""Unions over nested types (classes or dataclasses) will result in subparsers."""

from __future__ import annotations

import dataclasses
from typing import Union

import dcargs


def main(command: Union[Checkout, Commit]) -> None:
    print(command)


@dataclasses.dataclass(frozen=True)
class Checkout:
    """Checkout a branch."""

    branch: str


@dataclasses.dataclass(frozen=True)
class Commit:
    """Commit changes."""

    message: str
    all: bool = False


if __name__ == "__main__":
    dcargs.cli(main)

$ python examples/09_subparsers.py --help
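For readers more familiar with argparse, the union above corresponds roughly to one subparser per dataclass. The sketch below writes that out by hand; the lowercase subcommand names and the helper `parse_command` are assumptions for illustration and may not match what dcargs generates.

```python
import argparse
import dataclasses
from typing import Optional, Sequence, Union


@dataclasses.dataclass(frozen=True)
class Checkout:
    branch: str


@dataclasses.dataclass(frozen=True)
class Commit:
    message: str
    all: bool = False


def parse_command(args: Optional[Sequence[str]] = None) -> Union[Checkout, Commit]:
    """Hypothetical helper: one subparser per dataclass in the union."""
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest="command", required=True)

    checkout = subparsers.add_parser("checkout")
    checkout.add_argument("--branch", required=True)

    commit = subparsers.add_parser("commit")
    commit.add_argument("--message", required=True)
    commit.add_argument("--all", action="store_true")

    values = vars(parser.parse_args(args))
    # The chosen subcommand selects which dataclass to instantiate.
    cls = {"checkout": Checkout, "commit": Commit}[values.pop("command")]
    return cls(**values)
```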
10. Generics
"""Example of parsing for generic (~templated) dataclasses."""

import dataclasses
from typing import Generic, TypeVar

import dcargs

ScalarType = TypeVar("ScalarType")
ShapeType = TypeVar("ShapeType")


@dataclasses.dataclass(frozen=True)
class Point3(Generic[ScalarType]):
    x: ScalarType
    y: ScalarType
    z: ScalarType
    frame_id: str


@dataclasses.dataclass(frozen=True)
class Triangle:
    a: Point3[float]
    b: Point3[float]
    c: Point3[float]


@dataclasses.dataclass(frozen=True)
class Args(Generic[ShapeType]):
    point_continuous: Point3[float]
    point_discrete: Point3[int]
    shape: ShapeType


if __name__ == "__main__":
    args = dcargs.cli(Args[Triangle])
    print(args)

$ python examples/10_generics.py --help
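Parsing a parameterized type such as `Point3[float]` requires substituting the type variables with concrete types. A minimal sketch of that substitution using `typing.get_origin()` and `typing.get_args()` follows; `resolve_field_types` is a hypothetical helper, and dcargs may resolve generics differently.

```python
import dataclasses
from typing import Any, Dict, Generic, TypeVar, get_args, get_origin, get_type_hints

ScalarType = TypeVar("ScalarType")


@dataclasses.dataclass(frozen=True)
class Point3(Generic[ScalarType]):
    x: ScalarType
    y: ScalarType
    z: ScalarType
    frame_id: str


def resolve_field_types(alias: Any) -> Dict[str, Any]:
    """Hypothetical helper: map each field of e.g. Point3[int] to its concrete type."""
    origin = get_origin(alias)  # Point3
    # Pair each TypeVar declared on the class with the argument supplied in the alias.
    mapping = dict(zip(origin.__parameters__, get_args(alias)))
    return {name: mapping.get(hint, hint) for name, hint in get_type_hints(origin).items()}
```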
Serialization
As a secondary feature aimed at enabling the use of dcargs.cli() for general
configuration use cases, we also introduce functions for human-readable
dataclass serialization:
dcargs.from_yaml(cls: Type[T], stream: Union[str, IO[str], bytes, IO[bytes]]) -> T
and dcargs.to_yaml(instance: T) -> str
convert between YAML-style strings and dataclass instances.
The functions attempt to strike a balance between flexibility and robustness. In contrast to naively dumping or loading dataclass instances (via pickle, PyYAML, etc.), explicit type references enable custom tags that are robust against code reorganization and refactoring, while a PyYAML backend enables serialization of arbitrary Python objects.
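The round-trip idea (serialize an instance to text, then rebuild it from an explicit class reference) can be sketched with the standard library alone. The example below deliberately swaps JSON in for YAML and does not reproduce the custom tags described above; `to_json_sketch` and `from_json_sketch` are hypothetical names, not part of dcargs.

```python
import dataclasses
import json
from typing import Any, Type, TypeVar, get_type_hints

T = TypeVar("T")


def to_json_sketch(instance: Any) -> str:
    """Dump a (possibly nested) dataclass instance as human-readable JSON."""
    return json.dumps(dataclasses.asdict(instance), indent=2)


def from_json_sketch(cls: Type[T], text: str) -> T:
    """Rebuild an instance of cls, using field annotations to restore nested dataclasses."""

    def build(target: Any, data: Any) -> Any:
        if dataclasses.is_dataclass(target):
            hints = get_type_hints(target)
            return target(**{key: build(hints[key], value) for key, value in data.items()})
        return data  # Plain values (int, float, str, ...) are used as loaded.

    return build(cls, json.loads(text))


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    learning_rate: float = 3e-4


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    optimizer: OptimizerConfig
    batch_size: int = 32
```

Passing the class explicitly to the loader, as `dcargs.from_yaml(cls, stream)` also does, is what lets nested fields be restored as dataclass instances rather than plain dictionaries.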
Alternative tools
The core functionality of dcargs, generating argument parsers from type
annotations, can be found as a subset of the features offered by many other
libraries. A summary of some distinguishing features:
Choices from literals | Generics | Docstrings as helptext | Nesting | Subparsers | Containers | |
---|---|---|---|---|---|---|
dcargs | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
datargs | ✓ | ✓ | ✓ | |||
tap | ✓ | ✓ | ✓ | ✓ | ||
simple-parsing | soon | ✓ | ✓ | ✓ | ✓ | |
argparse-dataclass | ||||||
argparse-dataclasses | ||||||
dataclass-cli | ||||||
clout | ✓ | |||||
hf_argparser | ✓ | |||||
pyrallis | ✓ | ✓ | ✓ |
Note that most of these other libraries are generally aimed specifically at
dataclasses rather than general typed callables, but offer other features that
you might find useful, such as registration for custom types (pyrallis),
different approaches for serialization and config files (tap, pyrallis),
simultaneous parsing of multiple dataclasses (simple-parsing), etc.