A Python module designed for RL logging, monitoring, and experiment management.

UtilsRL

A utility Python module designed for reinforcement learning. Bug reports are welcome.

Installation

You can install this package directly from PyPI:

pip install UtilsRL

After installation, you may still need to configure some other dependencies based on your platform, such as PyTorch.

Features & Usage

Argument Parsing

The argument parsing utilities in this package provide three features:

Support for multiple types of config files. parse_args can parse JSON, YAML, or even a Python config module that has been imported beforehand.

from UtilsRL.argparse import parse_args
json_config = "/path/to/json"
yaml_config = "/path/to/yaml"
import config_module

json_args = parse_args(json_config)
yaml_args = parse_args(yaml_config)
module_args = parse_args(config_module)
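
As a rough illustration of what reading a Python config module could involve, the sketch below collects a module's public attributes into a plain dict. The helper `module_to_dict` and the in-memory `demo_config` module are hypothetical names used only for this example; they are not part of UtilsRL, and `parse_args` may work differently internally.

```python
import types


def module_to_dict(module):
    """Collect public, non-module attributes of a module into a dict (hypothetical helper)."""
    return {
        name: value
        for name, value in vars(module).items()
        if not name.startswith("_") and not isinstance(value, types.ModuleType)
    }


# Build a stand-in config module in memory instead of importing a real file.
demo_config = types.ModuleType("demo_config")
demo_config.batch_size = 256
demo_config.num_epochs = 10

config = module_to_dict(demo_config)
print(config)
```

Filtering out underscore-prefixed names keeps module internals such as `__name__` out of the resulting config.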

Nested argument parsing. We do this by introducing the NameSpace class. Specifically, if you pass convert=True to parse_args, all of the dicts in the config file (including the argument dict itself) are converted to subclasses of NameSpace. The contents wrapped in a NameSpace can be accessed both dict-style and attribute-style, and they are pretty-printed for readability. For example, if we define a config module as follows:

# in config module: config_module
from UtilsRL.misc.namespace import NameSpace

batch_size = 256
num_epochs = 10
class TrainerArgs(NameSpace):
    learning_rate = 1e-3
    weight_decay = 1e-5
    momentum = 0.9
    
class ActorArgs(NameSpace):
    epsilon = 0.05
    class NetArgs(NameSpace):
        layer_num = 2
        layer_nodes = 256

Then we import and parse it in main.py:

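from UtilsRL.argparse import parse_args
import config_module

args = parse_args(config_module)
print(args)
print(">>>>>>>>>>>>>>>>>>>>>>")
# nested values are reached through the NameSpace name, as shown in the output
print(args.TrainerArgs.learning_rate)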

The output is:

<NameSpace: args>
|ActorArgs:     <NameSpace: ActorArgs>
                |epsilon: 0.05
                |NetArgs:       <NameSpace: NetArgs>
                                |layer_num: 2
                                |layer_nodes: 256
|TrainerArgs:   <NameSpace: TrainerArgs>
                |learning_rate: 0.001
                |weight_decay: 1e-05
                |momentum: 0.9
|batch_size: 256
|num_epochs: 10
>>>>>>>>>>>>>>>>>>>>>>
0.001
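
To make the idea of combined dict-style and attribute-style access concrete, here is a minimal sketch of a dict subclass that supports both. The `AttrDict` class and `to_attr_dict` helper are hypothetical stand-ins written for this example; they are not UtilsRL's NameSpace and do not reproduce its printing format.

```python
class AttrDict(dict):
    """Dict whose keys can also be read and written as attributes (hypothetical)."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc

    def __setattr__(self, name, value):
        self[name] = value


def to_attr_dict(obj):
    """Recursively convert nested dicts into AttrDict instances."""
    if isinstance(obj, dict):
        return AttrDict({key: to_attr_dict(value) for key, value in obj.items()})
    return obj


args = to_attr_dict({
    "batch_size": 256,
    "TrainerArgs": {"learning_rate": 1e-3, "momentum": 0.9},
})

print(args.TrainerArgs.learning_rate)    # attribute access
print(args["TrainerArgs"]["momentum"])   # dict access
```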

Argument updating. We can update the parsed args with command line arguments. If the argument is nested, use a slash (/) to separate each NameSpace level, e.g. python main.py --TrainerArgs/momentum 0.8.

from UtilsRL.argparse import parse_args, update_args
import argparse

# get command line arguments
parser = argparse.ArgumentParser()
_, unknown = parser.parse_known_args()

# get arguments from file/config module
args = parse_args("/path/to/file")

# update with command line arguments
args = update_args(args, unknown)
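
The slash-separated override can be pictured with the following sketch, which walks a nested dict along the path given in the flag. The helper `apply_overrides` is hypothetical and assumes that the unknown arguments arrive as alternating flag/value pairs; the real `update_args` may handle more cases.

```python
import ast


def apply_overrides(config, unknown):
    """Apply ['--A/b', 'value', ...] pairs to a nested dict in place (hypothetical helper)."""
    for flag, raw in zip(unknown[::2], unknown[1::2]):
        *parents, leaf = flag.lstrip("-").split("/")
        node = config
        for key in parents:
            node = node.setdefault(key, {})
        try:
            node[leaf] = ast.literal_eval(raw)   # "0.8" -> 0.8, "True" -> True
        except (ValueError, SyntaxError):
            node[leaf] = raw                     # keep plain strings as they are
    return config


config = {"batch_size": 256, "TrainerArgs": {"momentum": 0.9}}
apply_overrides(config, ["--TrainerArgs/momentum", "0.8", "--batch_size", "128"])
print(config)
```

Using `ast.literal_eval` is one way to recover numbers and booleans from command line strings without evaluating arbitrary code.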

Monitor

Monitor listens to the main loop of the training process and displays its progress with a tqdm meter.

import time
from UtilsRL.monitor import Monitor

monitor = Monitor(desc="test_monitor")
for i in monitor.listen(range(5)):
    time.sleep(0.1)

You can register callback functions that are triggered at certain stages of training. For example, we can register a callback that emails us when training is done:

monitor = Monitor(desc="test_monitor")
monitor.register_callback(
    name= "email me at the end of training", 
    on = "exit", 
    callback = Monitor.email, 
    ...
)
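
The callback mechanism can be sketched as a small registry keyed by stage name. The `MiniMonitor` class below is a hypothetical toy written for this example, not UtilsRL's Monitor; it only supports an "exit" stage that fires when the loop finishes normally.

```python
class MiniMonitor:
    """Toy loop monitor with stage callbacks (hypothetical; not UtilsRL's Monitor)."""

    def __init__(self):
        self._callbacks = {}

    def register_callback(self, name, on, callback):
        # Store callbacks per stage so that several can share one stage.
        self._callbacks.setdefault(on, []).append((name, callback))

    def listen(self, iterable):
        for item in iterable:
            yield item
        # Reached only when the iterable is exhausted, i.e. the loop ended normally.
        for _name, callback in self._callbacks.get("exit", []):
            callback()


events = []
monitor = MiniMonitor()
monitor.register_callback(name="notify", on="exit", callback=lambda: events.append("done"))
for i in monitor.listen(range(3)):
    events.append(i)
print(events)
```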

You can also register context variables for training, which are then managed automatically by the monitor. In the example below, the registered context variable (local_var) is saved every 100 iterations.

monitor = Monitor(desc="test_monitor", out_dir="./out")
def train():
    local_var = ...
    local_var = monitor.register_context("local_var", save_every=100)
    for i_epoch in monitor.listen(range(1000)):
        pass  # do training
train()
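
For intuition, periodic saving of loop state can be sketched with the standard library alone. The `train` function, the state dict, and the file naming below are assumptions made for this example; they do not describe the checkpoint format that Monitor actually writes.

```python
import pickle
import tempfile
from pathlib import Path


def train(num_iters, save_every, out_dir):
    """Toy loop that pickles its state every `save_every` iterations."""
    state = {"step": 0, "total": 0}
    saved = []
    for i in range(1, num_iters + 1):
        state["step"] = i
        state["total"] += i            # stand-in for a training update
        if i % save_every == 0:
            path = Path(out_dir) / f"checkpoint_{i}.pkl"
            with open(path, "wb") as f:
                pickle.dump(state, f)
            saved.append(path.name)
    return saved


with tempfile.TemporaryDirectory() as out_dir:
    saved = train(num_iters=300, save_every=100, out_dir=out_dir)
print(saved)
```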

As a more complex example, we can use the Monitor to resume training from a certain iteration, and restore the context variables from checkpoints:

class Trainer():
    def __init__(self):
        self.actor = ...
    
    def train(self):
        local_var = ...
        
        # load previously saved checkpoints specified by `load_path`
        self.actor, local_var = \
            monitor.register_context(["self.actor", "local_var"], load_path="/path/to/checkpoint/dir").values()
        # use `initial` to designate the start point
        for i_epoch in monitor.listen(range(1000), initial=100):
            pass  # continue training
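
One possible reading of resuming is: restore the saved variables, then skip the iterations that were already completed. The sketch below follows that reading using only the standard library. The `listen` function here is a hypothetical stand-in, and treating `initial` as "number of iterations to skip" is an assumption, not a statement about Monitor's behavior.

```python
import itertools
import pickle
import tempfile
from pathlib import Path


def listen(iterable, initial=0):
    """Yield items of `iterable`, skipping the first `initial` ones."""
    return itertools.islice(iterable, initial, None)


with tempfile.TemporaryDirectory() as ckpt_dir:
    ckpt = Path(ckpt_dir) / "checkpoint_100.pkl"
    # Pretend an earlier run saved its state after 100 iterations.
    ckpt.write_bytes(pickle.dumps({"local_var": 42}))

    # Resume: restore the saved variable, then continue from iteration 100.
    local_var = pickle.loads(ckpt.read_bytes())["local_var"]
    remaining = list(listen(range(1000), initial=100))

print(local_var, remaining[0], len(remaining))
```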

Logger

Logger provides a thin wrapper around torch.utils.tensorboard.SummaryWriter.

from UtilsRL.logger import TensorboardLogger

# create a logger, with terminal output enabled and file logging disabled
logger = TensorboardLogger(log_dir="./logs", name="debug", terminal=True, txt=False) 

# log a sentence in color blue.
logger.log_str("This is a sentence", type="LOG")
# log sentence in color red. 
logger.log_str("Here occurs an error", type="ERROR") 

# log a scalar and a dict of scalars, respectively
logger.log_scala(tag="var_name", value=1.0, step=1)
logger.log_scalas(main_tag="group_name", tag_scalar_dict={
    "var1": 1.0, 
    "var2": 2.0
}, step=1)
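
Colored terminal output of the kind described above is commonly produced with ANSI escape codes. The following sketch shows the idea; the `format_log` function and the color table are hypothetical and are not taken from UtilsRL's logger.

```python
# ANSI escape codes: 34 is blue, 31 is red, 0 resets the style.
COLORS = {"LOG": "\033[34m", "ERROR": "\033[31m"}
RESET = "\033[0m"


def format_log(message, level="LOG"):
    """Wrap a message in the color associated with its level (hypothetical helper)."""
    return f"{COLORS.get(level, '')}[{level}] {message}{RESET}"


print(format_log("This is a sentence", level="LOG"))
print(format_log("Here occurs an error", level="ERROR"))
```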

Device and Seed Management

We provide a set of utility functions for selecting a device and setting the random seed in UtilsRL.misc.device and UtilsRL.misc.seed. Please take a moment to look through these files.

A setup function is also available in UtilsRL.misc.__init__; it sets up the arguments with the logger, device, and seed that you provide.

from UtilsRL.misc import setup

setup(args, logger=None, device="cuda:0", seed=None)  # the seed will be initialized randomly
setup(args, logger=None, device=None, seed="4234")  # the least busy GPU will be selected as the device
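
The "random seed when none is given" behavior can be sketched as follows. The `resolve_seed` helper is hypothetical and only seeds Python's built-in `random` module; a real setup would presumably also seed NumPy and PyTorch, which is omitted here to keep the example dependency-free.

```python
import random


def resolve_seed(seed=None):
    """Return an int seed, drawing a random one when `seed` is None, and seed `random`."""
    if seed is None:
        seed = random.SystemRandom().randrange(2**31)
    seed = int(seed)          # accept strings such as "4234"
    random.seed(seed)
    return seed


seed = resolve_seed("4234")
first = random.random()
resolve_seed(seed)            # re-seeding reproduces the same random stream
second = random.random()
print(seed, first == second)
```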
