
Basic Python tools


cdxbasics

Collection of basic tools for Python development.

dynaplot

Tools for dynamic (animated) plotting in Jupyter/IPython. The aim of the toolkit is to make it easy to develop matplotlib visualizations which update dynamically, for example during training with machine learning kits such as tensorflow. This has been tested with Anaconda's JupyterHub and %matplotlib inline.

Some users reported that the package does not work in some versions of Jupyter. In this case, please try fig.render( "canvas" ). I would appreciate it if you let me know whether this resolves the problem.

Animation

See the Jupyter notebook notebooks/DynamicPlot.ipynb for some applications.

# example
%matplotlib inline
import numpy as np
x = np.linspace(-5,5,21)
y = np.random.normal(size=(21,5))

# create figure
from cdxbasics.dynaplot import figure
fig = figure()                  # equivalent to matplotlib.figure
ax  = fig.add_subplot()         # no need to specify row,col,num
l   = ax.plot( x, y[:,0] )[0] 
fig.render()                     # construct figure & draw graph

# animate
import time
for i in range(1,5):
    time.sleep(1) 
    l.set_ydata( y[:,i] )          # update data; 'l' already is the line object
    fig.render()
    
fig.close()                   # clear figure to avoid duplication

See the example notebook for how to use the package for lines, confidence intervals, and 3D graphs.

Simpler sub_plot

The package lets you create sub plots without having to know their number in advance: you do not need to specify row, col, num when calling add_subplot.

# create figure
from cdxbasics.dynaplot import figure
fig = figure(col_size=4, row_size=4, col_nums=3) 
                                # equivalent to matplotlib.figure
ax  = fig.add_subplot()         # no need to specify row,col,num
ax.plot( x, y )
ax  = fig.add_subplot()         # no need to specify row,col,num
ax.plot( x, y )
...
fig.next_row()                  # another row
ax  = fig.add_subplot()         # no need to specify row,col,num
ax.plot( x, y )
...

fig.render()                    # draws the plots
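
The idea of deferring the layout until render() can be pictured with a small stand-alone sketch. It uses neither cdxbasics nor matplotlib, and the class DeferredGrid is a hypothetical name chosen for illustration, not the package's implementation. It records subplot requests and only computes the (rows, cols, index) triple that matplotlib's add_subplot normally expects once the full layout is known.

```python
class DeferredGrid:
    """Illustrative stand-in: collect subplot requests, resolve the grid later."""

    def __init__(self, col_nums=3):
        self.col_nums = col_nums      # maximum number of plots per row
        self.rows = [[]]              # each row holds the labels of its plots

    def add_subplot(self, label):
        # Start a new row automatically when the current one is full.
        if len(self.rows[-1]) >= self.col_nums:
            self.rows.append([])
        self.rows[-1].append(label)

    def next_row(self):
        # Only open a new row if the current one has content.
        if self.rows[-1]:
            self.rows.append([])

    def render(self):
        # The layout is known now: compute (nrows, ncols, index) per plot,
        # using matplotlib's 1-based, row-major index convention.
        rows = [r for r in self.rows if r]
        nrows = len(rows)
        ncols = max((len(r) for r in rows), default=0)
        layout = {}
        for i, row in enumerate(rows):
            for j, label in enumerate(row):
                layout[label] = (nrows, ncols, i * ncols + j + 1)
        return layout


grid = DeferredGrid(col_nums=3)
grid.add_subplot("a")
grid.add_subplot("b")
grid.next_row()
grid.add_subplot("c")
layout = grid.render()
```

Because nothing is created before render(), the grid dimensions can depend on how many plots were actually added.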

Other features

There are a number of other functions to aid plotting:

  • figure():
    Function to replace matplotlib.figure which will defer creation of the figure until the first call of render(). This way we do not have to specify row, col, num when adding subplots.
    Instead of figsize specify row_size, col_size and col_nums to dynamically generate an appropriate figure size.
    Key member functions are:
    • add_subplot to add a new plot. No arguments needed.
    • next_row() to skip to the next row.
    • render() to draw the figure. The first call creates all the underlying matplotlib objects. Subsequent calls re-draw the canvas if the figure was modified. See examples in https://github.com/hansbuehler/cdxbasics/blob/master/cdxbasics/notebooks/DynamicPlot.ipynb
    • close() to close the figure. If not called, Jupyter creates an unseemly second copy of the graph when the current cell is finished running.
  • color_css4, color_base, color_tableau, color_xkcd:
    Each function returns the ith element of the respective matplotlib color table. The purpose is to simplify using consistent colors across different plots.
    Example:
        fig = dynaplot.figure()
        ax = fig.add_subplot()
        # draw 10 lines in the first sub plot, and add a legend
        for i in range(10):
            ax.plot( x, y[i], color=color_css4(i), label=labels[i] )
        ax.legend()
        # draw 10 lines in the second sub plot. No legend needed as colors are shared with first plot
        ax = fig.add_subplot()
        for i in range(10):
            ax.plot( x, z[i], color=color_css4(i) )
        fig.render()
    
  • colors_css4, colors_base, colors_tableau, colors_xkcd:
    Generator versions of the color_ functions.
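
As a rough illustration of the index-based and generator-based color access described above, here is a self-contained sketch. The function names color_at and colors are hypothetical, the color names are matplotlib's ten Tableau colors hard-coded to avoid a matplotlib dependency, and wrapping around at the end of the table is an assumption of this sketch rather than documented behavior of the package.

```python
from itertools import cycle, islice

# Names of matplotlib's ten Tableau colors, hard-coded to keep the sketch self-contained.
TABLEAU = ["tab:blue", "tab:orange", "tab:green", "tab:red", "tab:purple",
           "tab:brown", "tab:pink", "tab:gray", "tab:olive", "tab:cyan"]

def color_at(i, table=TABLEAU):
    """Return the i-th color; wrapping around is an assumption of this sketch."""
    return table[i % len(table)]

def colors(table=TABLEAU):
    """Generator variant: yields the colors in order, repeating forever."""
    return cycle(table)

first_three = list(islice(colors(), 3))
```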

prettydict

A number of simple extensions to standard dictionaries which allow accessing any element of the dictionary with "." notation:

from cdxbasics.prettydict import PrettyDict
pdct = PrettyDict(z=1)
pdct['a'] = 1       # standard dictionary write access
pdct.b = 2          # pretty write access
_ = pdct.b          # read access
_ = pdct("c",3)     # short cut for pdct.get("c",3)

There are three versions:

  • PrettyDict:
    Pretty version of standard dictionary.
  • PrettyOrderedDict:
    Pretty version of ordered dictionary.
  • PrettySortedDict:
    Pretty version of sorted dictionary.
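
The "." access can be pictured with a minimal sketch built on a plain dict subclass. AttrDict is a hypothetical name and this is not the package's implementation; it only illustrates how attribute access and the () shortcut can be routed to the dictionary items.

```python
class AttrDict(dict):
    """Hypothetical stand-in for a 'pretty' dictionary."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails,
        # so regular dict methods such as get() keep working.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        # Attribute writes become item writes.
        self[key] = value

    def __call__(self, key, default=None):
        # Shortcut for get(key, default).
        return self.get(key, default)


d = AttrDict(z=1)
d["a"] = 1      # standard dictionary write access
d.b = 2         # attribute-style write access
```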

Functions

The classes also allow assigning bona fide member functions by a simple semantic of the form:

def mult_b( self, x ):
    return self.b * x
pdct.mult_b = mult_b

Calling pdct.mult_b(3) with the above config returns 6, as expected. This only works when the member syntax is used to assign values to a pretty dictionary; if the standard [] operator is used, functions are stored in the dictionary as usual and therefore behave like static members of the object.

The reason for this is as follows: consider

def mult( a, b ):
    return a*b
pdct.mult = mult
pdct.mult(3,4)      # raises an error: three arguments are passed once 'self' is counted

In this case, use:

pdct['mult'] = mult
pdct.mult(3,4)      # --> 12
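
One way to obtain this behavior is to bind plain functions on attribute assignment with types.MethodType. The sketch below is an assumption about how such a class could work, with BindingDict as a hypothetical name; it is not taken from the package's source.

```python
import types

class BindingDict(dict):
    """Hypothetical sketch: '.' assignment binds functions, [] does not."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        # Plain functions assigned via '.' become bound methods of this object.
        if isinstance(value, types.FunctionType):
            value = types.MethodType(value, self)
        self[key] = value


def mult_b(self, x):
    return self.b * x

def mult(a, b):
    return a * b

d = BindingDict(b=2)
d.mult_b = mult_b     # bound: 'self' is supplied automatically
d["mult"] = mult      # stored as-is: behaves like a static function
```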

config

Tooling for setting up program-wide configuration. Aimed at machine learning programs to ensure consistency of code across experiments.

from cdxbasics.config import Config
config = Config()

Key features

  • Detect misspelled parameters by checking that all parameters of a config have been read.
  • Provide summary of all values read, including summary help for what they were for.
  • Nicer syntax than dictionary notation.

Creating configs
Set data with both dictionary and member notation:

        config = Config()
        config['features']           = [ 'time', 'spot' ]
        config.weights               = [ 1, 2, 3 ]

Create sub configurations with member notation

        config.network.depth         = 10
        config.network.activation    = 'relu'
        config.network.widht         = 100   # (intentional typo)

This is equivalent to

        config.network               = Config()
        config.network.depth         = 10
        config.network.activation    = 'relu'
        config.network.widht         = 100   # (intentional typo)

Reading a config
Reading a config lets you specify the expected type of a parameter and a help text describing what it is used for. See the usage_report() member.

    def __init__( self, config ):
        # read top level parameters
        self.features = config("features", [], list, "Features for the agent" )
        self.weights  = config("weights", [], np.asarray, "Weights for the agent", help_default="no initial weights")

When a parameter is read with (), we can specify not only its name, but also its default value and a cast operator. For example, in the case of weights we provide the numpy function asarray.

Further parameters of () are the help text, plus the ability to provide text versions of the default with help_default (e.g. if the default value is complex) and of the cast operator with help_cast (again if the respective operation is complex).

Important: the () operator will not default 'default' to None as dict.get does. If no default is specified, then () raises an error if the respective value was not provided. Therefore, config(key) behaves like config[key].


Accessing children directly with member notation

        self.activation = config.network("activation", "relu", str, "Activation function for the network")

Accessing via the child node

        network  = config.network 
        self.depth = network('depth', 10000, int, "Depth for the network") 

We can impose simple restrictions

        self.width = network('width', 100, Int>3, "Width for the network")

Restrictions on both sides of a scalar:

        self.percentage = network('percentage', 0.5, ( Float >= 0. ) & ( Float <= 1.), "A percentage")

Enforce being a member of a list

        self.ntype = network('ntype', 'fastforward', ['fastforward','recurrent','lstm'], "Type of network")
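
Expressions such as Int>3 or ( Float >= 0. ) & ( Float <= 1.) can be built with operator overloading. The following sketch shows one possible construction; the classes Condition and TypeCondition are hypothetical and the Int and Float objects defined here are local stand-ins, not the objects exported by cdxbasics.

```python
class Condition:
    """A cast function plus a predicate; two conditions combine with '&'."""

    def __init__(self, cast, test, text):
        self.cast, self.test, self.text = cast, test, text

    def __and__(self, other):
        return Condition(self.cast,
                         lambda v: self.test(v) and other.test(v),
                         f"{self.text} and {other.text}")

    def __call__(self, value):
        # Cast first, then check the restriction.
        value = self.cast(value)
        if not self.test(value):
            raise ValueError(f"{value!r} violates {self.text}")
        return value


class TypeCondition:
    """Comparison operators on this object build Condition instances."""

    def __init__(self, cast):
        self.cast = cast
        self.name = cast.__name__

    def __gt__(self, bound):
        return Condition(self.cast, lambda v: v > bound, f"{self.name}>{bound}")

    def __ge__(self, bound):
        return Condition(self.cast, lambda v: v >= bound, f"{self.name}>={bound}")

    def __lt__(self, bound):
        return Condition(self.cast, lambda v: v < bound, f"{self.name}<{bound}")

    def __le__(self, bound):
        return Condition(self.cast, lambda v: v <= bound, f"{self.name}<={bound}")


Int = TypeCondition(int)        # local stand-ins for illustration only
Float = TypeCondition(float)

width_check = Int > 3
pct_check = (Float >= 0.) & (Float <= 1.)

def accepts(check, value):
    """Return True if 'value' passes 'check', False if it is rejected."""
    try:
        check(value)
        return True
    except ValueError:
        return False
```

The resulting objects are ordinary callables, so they can be passed wherever a cast operator such as int is accepted.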

Do not forget to call done() once done with this config.

        config.done()    # checks that we have read all keywords.

It will alert you if there are keywords or children which haven't been read. Most likely, those will be typos. In our example above, width was misspelled when setting up the config, so you will get an error to this effect:

    *** LogException: Error closing config 'config.network': the following config arguments were not read: ['widht']
    Record of this object:
    config.network['activation'] = relu # Activation function for the network; default: relu
    config.network['depth'] = 10 # Depth for the network; default: 10000
    config.network['width'] = 100 # Width for the network; default: 100
    #
    config['features'] = ['time', 'spot'] # Features for the agent; default: []
    config['weights'] = [1 2 3] # Weights for the agent; default: []
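
The read-tracking behind done() can be sketched in a few lines. TrackedConfig below is a hypothetical, heavily simplified stand-in (no sub-configs, no usage report) and uses ValueError where the package raises its own exception type; it only illustrates how recording every read makes misspelled keys detectable.

```python
class TrackedConfig(dict):
    """Hypothetical stand-in: a dict that remembers which keys were read."""

    _MISSING = object()     # sentinel: distinguishes 'no default' from None

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.read = {}      # key -> (value, help text)

    def __call__(self, key, default=_MISSING, cast=None, help=""):
        if key in self:
            value = self[key]
        elif default is not TrackedConfig._MISSING:
            value = default
        else:
            raise KeyError(key)      # no value and no default provided
        if cast is not None:
            value = cast(value)
        self.read[key] = (value, help)
        return value

    def done(self):
        # Any key that was set but never read is most likely a typo.
        unread = sorted(k for k in self if k not in self.read)
        if unread:
            raise ValueError(f"config arguments not read: {unread}")


network = TrackedConfig(depth=10, widht=100)    # 'widht' is the intentional typo
depth = network("depth", 10000, int, "Depth for the network")
width = network("width", 100, int, "Width for the network")
try:
    network.done()
    error = None
except ValueError as exc:
    error = str(exc)
```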

Detaching child configs
You can also detach a child config, which allows you to store it for later use without triggering done() errors for its parent.

    def read_config( self, config ):
        ...
        self.config_training = config.training.detach()
        config.done()

detach() will mark the original child as 'done'. Therefore, we need to call done() again once we have finished processing the detached child:

    def training(self):
        epochs     = self.config_training("epochs", 100, int, "Epochs for training")
        batch_size = self.config_training("batch_size", None, help="Batch size. Use None for default of 32" )

        self.config_training.done()

Use copy() to make a bona fide copy of a child, without marking the source child as 'done'.

Self-recording all available configs
Once your program has run, you can read the summary of all values, their defaults, and their help texts.

    print( config.usage_report( with_cast=True ) )

Prints:

    config.network['activation'] = relu # (str) Activation function for the network; default: relu
    config.network['depth'] = 10 # (int) Depth for the network; default: 10000
    config.network['width'] = 100 # (int>3) Width for the network; default: 100
    config.network['percentage'] = 0.5 # (float>=0. and float<=1.) A percentage; default: 0.5
    config.network['ntype'] = 'fastforward' # (['fastforward','recurrent','lstm']) Type of network; default: 'fastforward'
    config.training['batch_size'] = None # () Batch size. Use None for default of 32; default: None
    config.training['epochs'] = 100 # (int) Epochs for training; default: 100
    config['features'] = ['time', 'spot'] # (list) Features for the agent; default: []
    config['weights'] = [1 2 3] # (asarray) Weights for the agent; default: no initial weights

Calling functions with named parameters:

    def create_network( depth=20, activation="relu", width=4 ):
        ...

We may use

    create_network( **config.network )

However, there is no magic - this function will mark all direct members (not children) as 'done' and will not record the default values of the function create_network. Therefore usage_report will be somewhat useless. This method will still catch unused variables as "unexpected keyword arguments".

Advanced **kwargs Handling

The Config class can be used to improve kwargs handling. Assume we have

    def f(**kwargs):
        a = kwargs.get("difficult_name", 10)
        b = kwargs.get("b", 20)

We run the usual risk of somebody misspelling a parameter name, which we would never know about. Therefore we may improve upon the above with

    def f(**kwargs):
        kwargs = Config(kwargs)
        a = kwargs("difficult_name", 10)
        b = kwargs("b", 20)
        kwargs.done()

If a user now calls f with a misspelled keyword, e.g. f(difficlt_name=5), an error will be raised.

Another pattern is to allow both config and kwargs:

    def f( config=Config(), **kwargs):
        config = config.detach()      # detach() is a method and must be called
        config.update(kwargs)         # merge the keyword arguments into the config
        a = config("difficult_name", 10)
        b = config("b", 20)
        config.done()

logger

Tools for defensive programming à la the C++ ASSERT/VERIFY macros. The aim is to provide one-line validation of function inputs with intelligible error messages:

from cdxbasics.logger import Logger
_log = Logger(__file__)
...
def some_function( a, ...):
    _log.verify( a==1, "'a' is not one but %s", a)
    _log.warn_if( a!=1, "'a' was not one but %s", a)

Functions available, mostly self-explanatory:

Exceptions independent of logging level

    verify( cond, text, *args, **kwargs )
        If cond is not met, raise an exception with util.fmt( text, *args, **kwargs ). This is the Python version of C++ VERIFY
    
    throw_if(cond, text, *args, **kwargs )
        If cond is met, raise an exception with util.fmt( text, *args, **kwargs )

    throw( text, *args, **kwargs )
        Just throw an exception with util.fmt( text, *args, **kwargs )

Unconditional logging

    debug( text, *args, **kwargs )
    info( text, *args, **kwargs )
    warning( text, *args, **kwargs )
    error( text, *args, **kwargs )
    critical( text, *args, **kwargs )

    throw( text, *args, **kwargs )

Verify-conditional functions

    # raise an exception if 'cond' is not True        
    verify( cond, text, *args, **kwargs )

    # print log message of respective level if 'cond' is not True
    verify_debug( cond, text, *args, **kwargs )
    verify_info( cond, text, *args, **kwargs )
    verify_warning( cond, text, *args, **kwargs )

If-conditional functions

    # raise an exception if 'cond' is True
    throw_if( cond, text, *args, **kwargs )

    # write log message if 'cond' is True
    debug_if( cond, text, *args, **kwargs )
    info_if( cond, text, *args, **kwargs )
    warning_if( cond, text, *args, **kwargs )

    # print message if 'cond' is True
    prnt_if( cond, text, *args, **kwargs )      # with EOL
    write_if( cond, text, *args, **kwargs )     # without EOL
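
The verify/throw_if pattern itself is simple enough to sketch without the package. The functions below are stand-alone illustrations, not the Logger methods: fmt here is a minimal '%'-style formatter written for this sketch, and RuntimeError stands in for whatever exception type the package raises.

```python
def fmt(text, *args, **kwargs):
    """Minimal '%'-style formatting; positional and named arguments are not mixed."""
    if args:
        return text % args
    if kwargs:
        return text % kwargs
    return text

def verify(cond, text, *args, **kwargs):
    # The message is only formatted when the check fails.
    if not cond:
        raise RuntimeError(fmt(text, *args, **kwargs))

def throw_if(cond, text, *args, **kwargs):
    # Mirror image of verify(): raise when the condition holds.
    if cond:
        raise RuntimeError(fmt(text, *args, **kwargs))

def describe(check, *args, **kwargs):
    """Run a check and return 'ok' or the error message it produced."""
    try:
        check(*args, **kwargs)
        return "ok"
    except RuntimeError as exc:
        return str(exc)


a = 2
msg_verify = describe(verify, a == 1, "'a' is not one but %s", a)
msg_throw = describe(throw_if, a != 1, "'a' is %(value)s", value=a)
msg_ok = describe(verify, True, "never formatted")
```

Deferring the formatting until a check fails keeps the cost of passing checks low, which matters when validation sits in frequently called functions.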

verbose

Utility class for printing 'verbose' information, with indentation.

from cdxbasics.verbose import Context, quiet

def f_sub( num=10, context = quiet ):
    context.report(0, "Entering loop")
    for i in range(num):
        context.report(1, "Number %ld", i)

def f_main( context = quiet ):
    context.write( "First step" )
    # ... do something
    context.report( 1, "Intermediate step 1" )
    context.report( 1, "Intermediate step 2\nwith newlines" )
    # ... do something
    f_sub( context=context(1) )
    # ... do something
    context.write( "Final step" )

print("Verbose=1")
context = Context(verbose=1)
f_main(context)

print("\nVerbose=2")
context = Context(verbose=2)
f_main(context)

print("\nVerbose='all'")
context = Context(verbose='all')
f_main(context)

print("\nVerbose='quiet'")
context = Context(verbose='quiet')
f_main(context)

Returns

Verbose=1
01:   First step
01:   Final step

Verbose=2
01:   First step
02:     Intermediate step 1
02:     Intermediate step 2
02:     with newlines
02:     Entering loop
01:   Final step

Verbose='all'
01:   First step
02:     Intermediate step 1
02:     Intermediate step 2
02:     with newlines
02:     Entering loop
03:       Number 0  
03:       Number 1
03:       Number 2
03:       Number 3
03:       Number 4
03:       Number 5
03:       Number 6
03:       Number 7
03:       Number 8
03:       Number 9
01:   Final step

Verbose='quiet'

The purpose of defaulting the context argument of functions to quiet is that they can be used across different contexts without printing anything by default.
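
To make the level logic in the output above concrete, here is a small stand-alone sketch. MiniContext is a hypothetical class that mimics the behavior seen in the example (a message is shown if its level is below the verbosity, and each level adds indentation); it collects lines in a list instead of printing and is not the package's Context class.

```python
class MiniContext:
    """Hypothetical stand-in for an indenting verbosity context."""

    def __init__(self, verbose="all", level=0, lines=None):
        self.verbose = verbose                          # int, "all" or "quiet"
        self.level = level                              # current nesting depth
        self.lines = [] if lines is None else lines     # shared output buffer

    def __call__(self, sub_level):
        # Child context for a nested function; it shares the output buffer.
        return MiniContext(self.verbose, self.level + sub_level, self.lines)

    def shall_report(self, sub_level=0):
        if self.verbose == "quiet":
            return False
        if self.verbose == "all":
            return True
        return self.level + sub_level < self.verbose

    def report(self, sub_level, text, *args):
        if not self.shall_report(sub_level):
            return
        depth = self.level + sub_level
        message = text % args if args else text
        # Every line of a multi-line message gets the same prefix.
        for line in message.split("\n"):
            self.lines.append("%02d: %s%s" % (depth + 1, "  " * (depth + 1), line))

    def write(self, text, *args):
        self.report(0, text, *args)


def f_sub(context, num=2):
    context.report(0, "Entering loop")
    for i in range(num):
        context.report(1, "Number %d", i)

ctx = MiniContext(verbose=2)
ctx.write("First step")
f_sub(ctx(1))
ctx.write("Final step")

quiet_ctx = MiniContext(verbose="quiet")
f_sub(quiet_ctx)
```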

util

Some basic utilities to make life easier.

  • fmt(): C++ style format function.
  • uniqueHash(): runs a standard hash over most combinations of standard elements or objects.
  • plain(): converts most combinations of standard elements or objects into plain list/dict structures.
  • isAtomic(): whether something is string, float, int, bool or date.
  • isFloat(): whether something is a float, including a numpy float.
  • isFunction(): whether something is some function.
  • bind(): simple shortcut to bind arguments to a function, e.g.
        def f(a, b, c):
            pass
    
        f_a = bind(f, a=1)
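
For comparison, the standard library offers the same kind of argument binding through functools.partial. The sketch below uses only the standard library and makes no claim about how bind() is implemented; the body of f is invented so that the result can be checked.

```python
from functools import partial

def f(a, b, c):
    return a + 10 * b + 100 * c

f_a = partial(f, a=1)       # fixes a=1; b and c must now be passed by keyword
result = f_a(b=2, c=3)      # 1 + 20 + 300
```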
    

subdir

A few tools to handle file i/o in a transparent way in the new subdir module. For the time being this is experimental. Please share any bugs with the author in case you do end up using them.

Project details



This version

0.2.1
