Utility functions that are often useful

pelutils

Utility functions that we commonly use, including a flexible parser, a logger, and a time taker.

Parsing

The parser combines command-line (CLI) and config file arguments, which allows for a powerful, easy-to-use workflow. It is useful for parametric methods such as machine learning.

A file main.py could contain:

options = {
    "location": { "default": "local_train", "help": "save_location", "type": str },
    "learning-rate": { "default": 1.5e-3, "help": "Controls size of parameter update", "type": float },
    "gamma": { "default": 1, "help": "Use of generator network in updating", "type": float },
    "initialize-zeros": { "help": "Whether to initialize all parameters to 0", "action": "store_true" },
}
parser = Parser(options)
experiments = parser.parse()

This could then be run by python main.py data/my-big-experiment --learning-rate 1e-5 or by python main.py data/my-big-experiment --config cfg.ini, where cfg.ini could contain:

[DEFAULT]
gamma = 0.95
[RUN1]
learning-rate = 1e-4
initialize-zeros
[RUN2]
learning-rate = 1e-5
gamma = 0.9
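
To illustrate how a config file like this can be read, here is a minimal sketch using only the standard library's configparser. It is not the pelutils Parser implementation; the helper names read_runs and to_options are hypothetical. It shows the two ideas at play: values in [DEFAULT] are inherited by every run, and a bare key such as initialize-zeros can be treated as a boolean flag.

```python
import configparser

CFG_TEXT = """
[DEFAULT]
gamma = 0.95
[RUN1]
learning-rate = 1e-4
initialize-zeros
[RUN2]
learning-rate = 1e-5
gamma = 0.9
"""


def read_runs(text):
    """Return {section name: {option: raw value}} with DEFAULT values merged in."""
    # allow_no_value=True accepts bare keys such as "initialize-zeros" (value is None)
    cfg = configparser.ConfigParser(allow_no_value=True)
    cfg.read_string(text)
    # sections() excludes DEFAULT, but each section still inherits its values
    return {name: dict(cfg[name].items()) for name in cfg.sections()}


def to_options(raw):
    """Hypothetical conversion: bare keys become True, everything else a float."""
    return {key: True if value is None else float(value) for key, value in raw.items()}


runs = read_runs(CFG_TEXT)
run1 = to_options(runs["RUN1"])
run2 = to_options(runs["RUN2"])
```

In this sketch, RUN1 inherits gamma from [DEFAULT] while RUN2 overrides it, and only RUN1 has the initialize-zeros flag set.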

Logging

An easy-to-use logger that fits common needs.

# Configure logger for the script
log.configure("path/to/save/log.log", "Title of log")

# Start logging
for i in range(70):  # Nice
    log("Execution %i" % i)

# Sections
log.section("New section in the logfile")

# Verbose logging for less important things
log.verbose("Will be logged")
with log.unverbose:
    log.verbose("Will not be logged")

# Error handling
# This explicitly logs a ValueError and then raises it
log.throw(ValueError("Your value is bad, and you should feel bad"))
# The zero-division error is logged
with log.log_errors:
    0 / 0

# User input
inp = log.input("WHAT... is your favourite colour? ")

# Log all logs from a function at the same time
# This is especially useful when using multiple threads so logging does not get mixed up
import multiprocessing as mp

def fun(arg):  # p.map passes one element of args per call
    log("Hello there")
    log("General Kenobi!")
with mp.Pool() as p:
    p.map(collect_logs(fun), args)  # args: any iterable of inputs
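
The idea of collecting a function's log messages and emitting them together can be sketched with the standard library alone. This is only an illustration, not how pelutils implements it: sketch_log and sketch_collect are hypothetical stand-ins, and the module-level buffer is per process and not thread-safe.

```python
import functools

_buffer = None  # While this is a list, messages are collected instead of printed


def sketch_log(message):
    """Print immediately, or store the message if collection is active."""
    if _buffer is None:
        print(message)
    else:
        _buffer.append(message)


def sketch_collect(fun):
    """Decorator: emit everything fun logs as one block when it returns."""
    @functools.wraps(fun)
    def wrapper(*args, **kwargs):
        global _buffer
        _buffer = []
        try:
            return fun(*args, **kwargs)
        finally:
            # Reset the buffer first, then write all messages in a single call
            collected, _buffer = _buffer, None
            if collected:
                print("\n".join(collected))
    return wrapper


@sketch_collect
def greet(name):
    sketch_log("Hello there")
    sketch_log(f"General {name}!")
    return len(name)


greet("Kenobi")
```

Because all messages from one call are written in a single print, output from concurrent workers is less likely to interleave line by line.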

Time taking

A simple time taker inspired by MATLAB's tic and toc, which also includes profiling tools.

tt = TickTock()
tt.tick()
...  # <some task>
seconds_used = tt.tock()

for i in range(100):
    tt.profile("Repeated code")
    ...  # <some task>
    tt.profile("Subtask")
    ...  # <some subtask>
    tt.end_profile()  # Ends "Subtask"
    tt.end_profile()  # Ends "Repeated code"
print(tt)  # Prints a table view of profiled code sections
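
For readers curious how nested profiling of this kind can work, the following is a minimal sketch built on time.perf_counter and a stack of open sections. It is an assumption-laden illustration, not the TickTock implementation; the class SimpleTimer and its attributes totals and hits are hypothetical.

```python
import time
from collections import defaultdict


class SimpleTimer:
    """Hypothetical tic/toc timer with nested, named profiles."""

    def __init__(self):
        self._start = None
        self._stack = []                  # Open profiles as (name, start time)
        self.totals = defaultdict(float)  # Name -> accumulated seconds
        self.hits = defaultdict(int)      # Name -> number of completed profiles

    def tick(self):
        self._start = time.perf_counter()

    def tock(self):
        return time.perf_counter() - self._start

    def profile(self, name):
        self._stack.append((name, time.perf_counter()))

    def end_profile(self):
        # The most recently opened profile is the one that ends
        name, start = self._stack.pop()
        elapsed = time.perf_counter() - start
        self.totals[name] += elapsed
        self.hits[name] += 1
        return elapsed


timer = SimpleTimer()
timer.tick()
for _ in range(3):
    timer.profile("Repeated code")
    timer.profile("Subtask")
    sum(range(1000))     # Placeholder work
    timer.end_profile()  # Ends "Subtask"
    timer.end_profile()  # Ends "Repeated code"
seconds_used = timer.tock()
```

The stack is what makes two consecutive end_profile calls close first the inner and then the outer section.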

Data Storage

A data class that saves/loads its fields from disk. Anything that can be saved to a json file will be. Other data types will be saved to relevant file formats. Currently, numpy arrays are the only supported data type that is not saved to the json file.

from dataclasses import dataclass
import numpy as np

@dataclass
class Person(DataStorage):
    name: str
    age: int
    numbers: np.ndarray
    subfolder = "older"
    json_name = "yoda.json"

yoda = Person(name="Yoda", age=900, numbers=np.array([69, 420]))
yoda.save("old")
# Saved data at old/older/yoda.json
# {
#     "name": "Yoda",
#     "age": 900
# }
# There will also be a file named numbers.npy
yoda = Person.load("old")
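
The split between json fields and other fields can be sketched with the standard library. This is not the DataStorage implementation: the Record class is hypothetical, a bytes field stands in for a numpy array, and pickle stands in for the .npy format.

```python
import json
import pickle
import tempfile
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass
class Record:
    name: str
    age: int
    payload: bytes  # Not JSON serializable; stands in for a numpy array

    def save(self, folder):
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        json_fields = {}
        for f in fields(self):
            value = getattr(self, f.name)
            try:
                json.dumps(value)  # Raises TypeError for non-JSON types
                json_fields[f.name] = value
            except TypeError:
                # Fall back to one binary file per non-JSON field
                (folder / f"{f.name}.pkl").write_bytes(pickle.dumps(value))
        (folder / "record.json").write_text(json.dumps(json_fields, indent=4))

    @classmethod
    def load(cls, folder):
        folder = Path(folder)
        values = json.loads((folder / "record.json").read_text())
        for f in fields(cls):
            if f.name not in values:
                values[f.name] = pickle.loads((folder / f"{f.name}.pkl").read_bytes())
        return cls(**values)


with tempfile.TemporaryDirectory() as tmp:
    original = Record(name="Yoda", age=900, payload=b"\x45\x01")
    original.save(tmp)
    saved_files = sorted(p.name for p in Path(tmp).iterdir())
    restored = Record.load(tmp)
```

Note that a JSON round trip does not preserve every Python type (tuples come back as lists, for example), which is one reason to keep such a scheme to simple field types.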

0.3.2

  • log.input now also accepts iterables as input

    For such inputs, it will return a generator of user inputs

0.3.1 - Breaking changes

  • Add functionality to logger for logging repository commit

  • Remove function get_commit

  • Add function get_repo which returns repository path and commit

    It attempts to find a repository by searching from working directory and upwards

  • Updates to examples in README and other minor documentation changes

  • set_seeds no longer returns seed, as this is already given as input to the function

0.3.0 - Breaking changes

  • Only works for Python 3.7+

  • If logger has not been configured, it now does no logging instead of crashing

    This prevents dependencies that use the logger from crashing the program if it is not used

  • log.throw now also logs the actual error rather than just the stack trace

  • log now has public property is_verbose

  • Fixed with log.log_errors always throwing errors

  • Added code samples to README

  • Parser no longer automatically determines if experiments should be placed in subfolders

    Instead, this is given explicitly as an argument to __init__

    It also supports boolean flags in the config file

0.2.13

  • Re-add clean method to logger

0.2.12 - Breaking changes

  • The logger is now solely a global variable

    Different loggers are handled internally in the global _Logger instance

0.2.11

  • Add catch property to logger to allow automatically logging errors using a with statement
  • All code is now indented using spaces

0.2.10

  • Allow finer verbosity control in logger
  • Allow multiple log commands to be collected and logged at the same time
  • Add decorator for aforementioned feature
  • Change thousand_seps from TickTock method to stand-alone function in __init__
  • Verbose logging now has same signature as normal logging

0.2.8

  • Add code to execute code with specific environment variables

0.2.7

  • Fix error where the full stacktrace was not printed by log.throw

  • set_seeds now checks if torch is available

    This means torch seeds are still set without needing it as a dependency

0.2.6 - Breaking changes

  • Make Unverbose class private and update documentation
  • Update formatting when using .input

0.2.5

  • Add input method to logger

0.2.4

  • Better logging of errors

0.2.1 - Breaking changes

  • Removed torch as dependency

0.2.0 - Breaking changes

  • Logger is now a global variable

    Logging should happen by importing the log variable and calling .configure to set it up

    To reset the logger, .clean can be called

  • It is still possible to just import Logger and use it in the traditional way, though .configure should be called first

  • Changed timestamp function to give a cleaner output

  • get_commit now returns None if gitpython is not installed

0.1.2

  • Update documentation for logger and ticktock
  • Fix bug where separator was not an argument to Logger.call

0.1.0

  • Include DataStorage
  • Logger can throw errors and handle separators
  • TickTock includes time handling and units
  • Minor parser path changes

0.0.1

  • Logger, Parser, TickTock added from previous projects
