Strongly typed, zero effort CLIs
dcargs
Overview
dcargs is a library for strongly typed argument parsers and configuration objects.
pip install dcargs
Our core interface generates CLIs from type-annotated callables, which may be functions, classes, or dataclasses. The goal is a tool that's lightweight enough for simple interactive scripts, but powerful enough to replace the heavier frameworks typically used to build hierarchical configuration systems.
Importantly, dcargs.cli() supports nested classes and dataclasses, which enable expressive hierarchical configuration objects built on standard Python features. Our goal is an interface that's:
- Low-effort. Type annotations, docstrings, and default values can be used to automatically generate argument parsers with informative helptext. This includes bells and whistles like enums, containers, etc.
- Strongly typed. Unlike dynamic configuration namespaces produced by libraries like argparse, YACS, abseil, hydra, or ml_collections, statically typed outputs mean that IDE-assisted autocomplete, rename, refactor, and go-to-definition operations work out-of-the-box, as do static checking tools like mypy and pyright.
- Modular. Most approaches to configuration objects require a centralized definition of all configurable fields. Supporting hierarchically nested configuration classes/dataclasses, however, makes it easy to distribute definitions, defaults, and documentation of configurable fields across modules or source files. A model configuration dataclass, for example, can be co-located in its entirety with the model implementation and dropped into any experiment configuration with an import. This eliminates redundancy and makes the entire module easy to port across codebases.
- Noninvasive. Many popular approaches to argument parsing and configuration are treated as frameworks, with tentacles that squirm deep into project codebases.
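To make the "strongly typed" point concrete, the sketch below contrasts a dynamic argparse namespace with a dataclass holding the same values. It uses only the standard library and does not call dcargs; `TrainConfig` and its fields are hypothetical names chosen for illustration.

```python
import argparse
import dataclasses


@dataclasses.dataclass(frozen=True)
class TrainConfig:
    # Hypothetical fields, chosen only for illustration.
    learning_rate: float = 3e-4
    seed: int = 0


# Dynamic namespace: attributes exist only at runtime, so a static checker
# cannot know their names or types.
parser = argparse.ArgumentParser()
parser.add_argument("--learning-rate", type=float, default=3e-4)
parser.add_argument("--seed", type=int, default=0)
namespace = parser.parse_args(["--seed", "7"])

# Statically typed object: field names and types are declared once, so
# tools like mypy or pyright can check every access.
config = TrainConfig(learning_rate=namespace.learning_rate, seed=namespace.seed)

# A misspelled attribute on the namespace only fails when the line runs.
try:
    namespace.learning_rat
except AttributeError as error:
    print("runtime-only failure:", error)

print(config)
```

With the dataclass, the same misspelling (`config.learning_rat`) would be flagged by a static checker before the script ever runs.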
Examples
A series of example scripts can be found in ./examples.
Functions
# examples/0_simple_function.py
import dcargs


def main(
    field1: str,
    field2: int,
    flag: bool = False,
) -> None:
    """Function, whose arguments will be populated from a CLI interface.

    Args:
        field1: First field.
        field2: Second field.
        flag: Boolean flag that we can set to true.
    """
    print(field1, field2, flag)


if __name__ == "__main__":
    dcargs.cli(main)
$ python 0_simple_function.py --help
usage: 0_simple_function.py [-h] --field1 STR --field2 INT [--flag]

Function, whose arguments will be populated from a CLI interface.

required arguments:
  --field1 STR  First field.
  --field2 INT  Second field.

optional arguments:
  -h, --help    show this help message and exit
  --flag        Boolean flag that we can set to true.
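For intuition about how a parser can be derived from a signature, here is a minimal standard-library sketch. It is not how dcargs is implemented: `cli_from_function` is a hypothetical helper, it only handles `str`, `int`, `float`, and `bool` parameters, and it ignores per-argument docstring parsing.

```python
import argparse
import inspect
from typing import Any, Callable, Optional, Sequence, Tuple, get_type_hints


def cli_from_function(
    func: Callable[..., Any], args: Optional[Sequence[str]] = None
) -> Any:
    """Hypothetical helper: build an argparse parser from func's signature."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    # Use the first docstring line as the parser description, if there is one.
    parser = argparse.ArgumentParser(description=doc.split("\n")[0] if doc else None)
    for name, param in inspect.signature(func).parameters.items():
        flag = "--" + name.replace("_", "-")
        annotation = hints.get(name, str)
        if annotation is bool:
            # Booleans become on/off switches; assumes the default is False.
            parser.add_argument(flag, dest=name, action="store_true")
        elif param.default is inspect.Parameter.empty:
            # No default value: the argument is required.
            parser.add_argument(flag, dest=name, type=annotation, required=True)
        else:
            parser.add_argument(flag, dest=name, type=annotation, default=param.default)
    return func(**vars(parser.parse_args(args)))


def main(field1: str, field2: int, flag: bool = False) -> Tuple[str, int, bool]:
    """Function whose arguments are populated from the command line."""
    return (field1, field2, flag)


result = cli_from_function(main, ["--field1", "hello", "--field2", "3", "--flag"])
print(result)
```

Passing an explicit argument list keeps the sketch runnable outside a shell; leaving `args` as `None` makes argparse read `sys.argv` instead.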
Dataclasses
# examples/1_simple_dataclass.py
import dataclasses

import dcargs


@dataclasses.dataclass
class Args:
    """Description.
    This should show up in the helptext!"""

    field1: str  # A string field.
    field2: int  # A numeric field.
    flag: bool = False  # A boolean flag.


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
$ python 1_simple_dataclass.py --help
usage: 1_simple_dataclass.py [-h] --field1 STR --field2 INT [--flag]

Description.
This should show up in the helptext!

required arguments:
  --field1 STR  A string field.
  --field2 INT  A numeric field.

optional arguments:
  -h, --help    show this help message and exit
  --flag        A boolean flag.
Nested dataclasses
# examples/6_nested_dataclasses.py
import dataclasses
import enum

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    # Gradient-based optimizer to use.
    algorithm: OptimizerType = OptimizerType.ADAM

    # Learning rate to use.
    learning_rate: float = 3e-4

    # Coefficient for L2 regularization.
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    """A nested experiment configuration. Note that the argument parser description is
    pulled from this docstring by default, but can also be overridden with
    `dcargs.cli()`'s `description=` flag."""

    # Experiment name to use.
    experiment_name: str

    # Various configurable options for our optimizer.
    optimizer: OptimizerConfig

    # Random seed. This is helpful for making sure that our experiments are all
    # reproducible!
    seed: int = 0


if __name__ == "__main__":
    config = dcargs.cli(ExperimentConfig)
    print(config)
    print(dcargs.to_yaml(config))
$ python 6_nested_dataclasses.py --help
usage: 6_nested_dataclasses.py [-h] --experiment-name STR [--optimizer.algorithm {ADAM,SGD}]
                               [--optimizer.learning-rate FLOAT] [--optimizer.weight-decay FLOAT]
                               [--seed INT]

A nested experiment configuration. Note that the argument parser description is
pulled from this docstring by default, but can also be overridden with
`dcargs.cli()`'s `description=` flag.

required arguments:
  --experiment-name STR
                        Experiment name to use.

optional arguments:
  -h, --help            show this help message and exit
  --seed INT            Random seed. This is helpful for making sure that our experiments are all
                        reproducible! (default: 0)

optional optimizer arguments:
  Various configurable options for our optimizer.

  --optimizer.algorithm {ADAM,SGD}
                        Gradient-based optimizer to use. (default: ADAM)
  --optimizer.learning-rate FLOAT
                        Learning rate to use. (default: 0.0003)
  --optimizer.weight-decay FLOAT
                        Coefficient for L2 regularization. (default: 0.01)
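The dotted flag names in the help text above suggest a recursive walk over nested dataclass fields. The sketch below shows one way such names could be derived with the standard library alone. `flatten_flags` is a hypothetical helper, not a dcargs function, and the dataclasses are trimmed copies of the example (the enum field is omitted).

```python
import dataclasses
from typing import Any, Dict, Type, get_type_hints


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    learning_rate: float = 3e-4
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    experiment_name: str
    optimizer: OptimizerConfig
    seed: int = 0


def flatten_flags(cls: Type[Any], prefix: str = "") -> Dict[str, type]:
    """Hypothetical helper: map dotted flag names to leaf field types."""
    flags: Dict[str, type] = {}
    hints = get_type_hints(cls)
    for field in dataclasses.fields(cls):
        field_type = hints[field.name]
        name = prefix + field.name.replace("_", "-")
        if dataclasses.is_dataclass(field_type):
            # Recurse into nested dataclasses, extending the dotted prefix.
            flags.update(flatten_flags(field_type, prefix=name + "."))
        else:
            flags["--" + name] = field_type
    return flags


for flag, flag_type in flatten_flags(ExperimentConfig).items():
    print(flag, flag_type.__name__)
```

Because each nested dataclass contributes its own fields, a configuration class defined in another module can be dropped in as a field without touching the code that builds the flags.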
Serialization
As a secondary feature aimed at enabling the use of dcargs.cli() for general configuration use cases, we also introduce functions for human-readable dataclass serialization:
- dcargs.from_yaml(cls: Type[T], stream: Union[str, IO[str], bytes, IO[bytes]]) -> T and dcargs.to_yaml(instance: T) -> str convert between YAML-style strings and dataclass instances.
The functions attempt to strike a balance between flexibility and robustness. In contrast to naively dumping or loading dataclass instances (via pickle, PyYAML, etc.), explicit type references enable custom tags that are robust against code reorganization and refactoring, while a PyYAML backend enables serialization of arbitrary Python objects.
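The round trip described above can be sketched with the standard library. Note the substitution: this example uses JSON rather than YAML so that it runs without third-party packages, and it is not the dcargs implementation. `to_json`, `from_json`, and `_build` are hypothetical names, and the sketch handles only plain values and nested dataclasses (no enums, containers, or custom tags).

```python
import dataclasses
import json
from typing import Any, Dict, Type, TypeVar, get_type_hints

T = TypeVar("T")


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    learning_rate: float = 3e-4


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    experiment_name: str
    optimizer: OptimizerConfig
    seed: int = 0


def to_json(instance: Any) -> str:
    """Hypothetical stand-in for a serializer: dataclass -> JSON string."""
    return json.dumps(dataclasses.asdict(instance), indent=2)


def _build(cls: Type[T], data: Dict[str, Any]) -> T:
    hints = get_type_hints(cls)
    kwargs = {}
    for key, value in data.items():
        field_type = hints[key]
        # Nested dataclasses are rebuilt from the declared field type, so the
        # serialized text carries no module or import paths.
        if dataclasses.is_dataclass(field_type):
            kwargs[key] = _build(field_type, value)
        else:
            kwargs[key] = value
    return cls(**kwargs)


def from_json(cls: Type[T], text: str) -> T:
    """Hypothetical stand-in for a loader: JSON string -> instance of cls."""
    return _build(cls, json.loads(text))


config = ExperimentConfig("demo", OptimizerConfig(learning_rate=1e-3), seed=1)
restored = from_json(ExperimentConfig, to_json(config))
print(restored == config)
```

Passing the target class explicitly to the loader, as dcargs.from_yaml does with its `cls` argument, is what lets the serialized text stay free of pickled class references.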
Alternative tools
The core functionality of dcargs, generating argument parsers from type annotations, can be found as a subset of the features offered by many other libraries. A summary of some distinguishing features:
| | Choices from literals | Generics | Docstrings as helptext | Nesting | Subparsers | Containers |
---|---|---|---|---|---|---|
dcargs | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
datargs | ✓ | ✓ | ✓ | |||
tap | ✓ | ✓ | ✓ | ✓ | ||
simple-parsing | soon | ✓ | ✓ | ✓ | ✓ | |
argparse-dataclass | ||||||
argparse-dataclasses | ||||||
dataclass-cli | ||||||
clout | ✓ | |||||
hf_argparser | ✓ | |||||
pyrallis | ✓ | ✓ | ✓ |
Note that most of these other libraries are generally aimed specifically at dataclasses rather than general typed callables, but offer other features that you might find useful, such as clout), registration for custom types (pyrallis), different approaches for serialization and config files (tap, pyrallis), simultaneous parsing of multiple dataclasses (simple-parsing), etc.