Basic Python tools
cdxbasics
Collection of basic tools for Python development.
dynaplot
Tools for dynamic (animated) plotting in Jupyter/IPython. The aim of the toolkit is to make it easy to develop matplotlib visualizations which update dynamically, for example during training with machine learning kits such as tensorflow. This has been tested with Anaconda's JupyterHub and %matplotlib inline.
Some users have reported that the package does not work in some versions of Jupyter. In this case, please try fig.render( "canvas" ). I would appreciate it if you let me know whether this resolves the problem.
Animation
See the jupyter notebook notebooks/DynamicPlot.ipynb for some applications.
# example
%matplotlib inline
import numpy as np
x = np.linspace(-5,5,21)
y = np.random.normal(size=(21,5))
# create figure
from cdxbasics.dynaplot import figure
fig = figure() # equivalent to matplotlib.figure
ax = fig.add_subplot() # no need to specify row,col,num
l = ax.plot( x, y[:,0] )[0]
fig.render() # construct figure & draw graph
# animate
import time
for i in range(1,5):
    time.sleep(1)
    l.set_ydata( y[:,i] ) # update data; 'l' is already the line object
    fig.render()
fig.close() # clear figure to avoid duplication
See example notebook for how to use the package for lines, confidence intervals, and 3D graphs.
Simpler sub_plot
The package lets you create sub plots without having to know their number in advance: you do not need to specify row, col, num when calling add_subplot.
# create figure
from cdxbasics.dynaplot import figure
fig = figure(col_size=4, row_size=4, col_num=3)
# equivalent to matplotlib.figure
ax = fig.add_subplot() # no need to specify row,col,num
ax.plot( x, y )
ax = fig.add_subplot() # no need to specify row,col,num
ax.plot( x, y )
...
fig.next_row() # another row
ax = fig.add_subplot() # no need to specify row,col,num
ax.plot( x, y )
...
fig.render() # draws the plots
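To illustrate why deferring figure creation removes the need for row, col, num, here is a minimal sketch of the bookkeeping only. It is not the package's implementation: the class DeferredGrid and its methods are hypothetical names, and no matplotlib objects are created. Subplot requests are merely recorded, so the grid shape is known by the time render() is called.

```python
class DeferredGrid:
    """Hypothetical sketch: record subplot requests, resolve positions later."""

    def __init__(self, col_num=3):
        self.col_num = col_num      # maximum number of plots per row
        self.rows = [[]]            # each row holds the labels of its plots

    def add_subplot(self, label):
        # start a new row automatically once the current one is full
        if len(self.rows[-1]) >= self.col_num:
            self.rows.append([])
        self.rows[-1].append(label)

    def next_row(self):
        # only open a new row if the current one has content
        if self.rows[-1]:
            self.rows.append([])

    def render(self):
        """Return (n_rows, n_cols, {label: 1-based subplot index})."""
        rows = [r for r in self.rows if r]
        n_rows = len(rows)
        n_cols = max((len(r) for r in rows), default=0)
        positions = {}
        for r, row in enumerate(rows):
            for c, label in enumerate(row):
                positions[label] = r * n_cols + c + 1
        return n_rows, n_cols, positions


grid = DeferredGrid(col_num=3)
grid.add_subplot("a")
grid.add_subplot("b")
grid.next_row()
grid.add_subplot("c")
layout = grid.render()
print(layout)
```

The returned triple corresponds to the row, col, num arguments that matplotlib's own add_subplot would otherwise require up front.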
Other features
There are a number of other functions to aid plotting
- figure():
Function to replace matplotlib.figure which will defer creation of the figure until the first call of render(). This way we do not have to specify row, col, num when adding subplots.
Instead of figsize specify row_size, col_size and col_nums to dynamically generate an appropriate figure size.
Key member functions are:
- add_subplot() to add a new plot. No arguments needed.
- next_row() to skip to the next row.
- render() to draw the figure. When called the first time will create all the underlying matplotlib objects. Subsequent calls will re-draw the canvas if the figure was modified. See examples in https://github.com/hansbuehler/cdxbasics/blob/master/cdxbasics/notebooks/DynamicPlot.ipynb
- close() to close the figure. If not called, Jupyter creates an unseemly second copy of the graph when the current cell is finished running.
- color_css4, color_base, color_tableau, color_xkcd:
Each function returns the ith element of the respective matplotlib color table. The purpose is to simplify using consistent colors across different plots.
Example:
fig = dynaplot.figure()
ax = fig.add_subplot()
# draw 10 lines in the first sub plot, and add a legend
for i in range(10):
    ax.plot( x, y[i], color=color_css4(i), label=labels[i] )
ax.legend()
# draw 10 lines in the second sub plot. No legend needed as colors are shared with first plot
ax = fig.add_subplot()
for i in range(10):
    ax.plot( x, z[i], color=color_css4(i) )
fig.render()
- colors_css4, colors_base, colors_tableau, colors_xkcd:
Generator versions of the color_ functions.
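The relationship between the indexed and the generator versions can be sketched as follows. This is an illustration under assumptions, not the package's code: PALETTE is a hypothetical three-entry stand-in for a matplotlib color table, the names color_at and colors are made up, and wrapping around at the end of the table is an assumption.

```python
from itertools import count

# Hypothetical stand-in for a matplotlib color table
PALETTE = ["tab:blue", "tab:orange", "tab:green"]


def color_at(i, table=PALETTE):
    """Return the i-th color; wrapping around is an assumption."""
    return table[i % len(table)]


def colors(table=PALETTE):
    """Generator version: yields colors in table order, indefinitely."""
    for i in count():
        yield color_at(i, table)


gen = colors()
first_four = [next(gen) for _ in range(4)]
print(first_four)
```

Indexing by i keeps the color of the i-th line identical across plots, while the generator is convenient when lines are added in a loop without an explicit counter.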
prettydict
A number of simple extensions to standard dictionaries which allow accessing any element of the dictionary with "." notation:
from cdxbasics.prettydict import PrettyDict
pdct = PrettyDict(z=1)
pdct['a'] = 1 # standard dictionary write access
pdct.b = 2 # pretty write access
_ = pdct.b # read access
_ = pdct("c",3) # short cut for pdct.get("c",3)
There are three versions:
- PrettyDict:
Pretty version of standard dictionary.
- PrettyOrderedDict:
Pretty version of ordered dictionary.
- PrettySortedDict:
Pretty version of sorted dictionary.
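For readers curious how "." access on a dictionary can work in principle, here is a minimal sketch using only standard Python. The class AttrDict is a hypothetical name and is not the PrettyDict implementation; it only mirrors the access patterns shown above.

```python
class AttrDict(dict):
    """Hypothetical sketch of a dictionary with '.' access."""

    def __getattr__(self, key):
        # only called when normal attribute lookup fails
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        # attribute writes are stored as dictionary entries
        self[key] = value

    def __call__(self, key, default=None):
        # d("c", 3) as a short cut for d.get("c", 3)
        return self.get(key, default)


d = AttrDict(z=1)
d["a"] = 1          # standard dictionary write access
d.b = 2             # attribute-style write access
print(d.b, d["b"], d("c", 3))
```

Raising AttributeError rather than KeyError in __getattr__ keeps built-ins such as hasattr() working as expected.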
Functions
The classes also allow assigning bona fide member functions by a simple semantic of the form:
def mult_a( self, x ):
    return self.b * x
pdct.mult_a = mult_a
Calling pdct.mult_a(3) with the above config will return 6 as expected. This only works when using the member syntax for assigning values to a pretty dictionary; if the standard [] operator is used, functions are assigned to the dictionary as usual, and hence behave as static members of the object.
The reason for this is as follows: consider
def mult( a, b ):
    return a*b
pdct.mult = mult
pdct.mult(3,4) # raises an error: three arguments are passed if we count 'self'
In this case, use:
pdct['mult'] = mult
pdct.mult(3,4) --> 12
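The difference between the two assignment styles can be reproduced with a small sketch. It assumes, without confirmation from the package source, that functions assigned with "." are turned into bound methods; BindingDict is a hypothetical class written for this illustration.

```python
import types


class BindingDict(dict):
    """Hypothetical sketch: '.' assignment binds functions, [] does not."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        # plain functions assigned with '.' become bound methods of this object
        if isinstance(value, types.FunctionType):
            value = types.MethodType(value, self)
        self[key] = value


def mult_a(self, x):
    return self.b * x


def mult(a, b):
    return a * b


d = BindingDict()
d.b = 2
d.mult_a = mult_a        # bound: 'self' is supplied automatically
d["mult"] = mult         # stored as-is: behaves like a static function

bound_result = d.mult_a(3)
static_result = d.mult(3, 4)
print(bound_result, static_result)
```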
config
Tooling for setting up program-wide configuration. Aimed at machine learning programs to ensure consistency of code across experimentation.
from cdxbasics.config import Config
config = Config()
Key features
- Detect misspelled parameters by checking that all parameters of a config have been read.
- Provide summary of all values read, including summary help for what they were for.
- Nicer syntax than dictionary notation.
Creating configs
Set data with both dictionary and member notation:
config = Config()
config['features'] = [ 'time', 'spot' ]
config.weights = [ 1, 2, 3 ]
Create sub configurations with member notation
config.network.depth = 10
config.network.activation = 'relu'
config.network.widht = 100 # (intentional typo)
This is equivalent to
config.network = Config()
config.network.depth = 10
config.network.activation = 'relu'
config.network.widht = 100 # (intentional typo)
Reading a config
Reading a config provides notation for type handling and also for specifying help on what the respective feature is used for. See the usage_report() member.
def __init__( self, config ):
    # read top level parameters
    self.features = config("features", [], list, "Features for the agent" )
    self.weights = config("weights", [], np.asarray, "Weigths for the agent", help_default="no initial weights")
When a parameter is read with (), we are able to specify not only the name, but also its default value, and a cast operator. For example, in the case of weights we provide the numpy function asarray.
Further parameters of () are the help text, plus ability to provide text versions of the default with help_default (e.g. if the default value is complex), and the cast operator with help_cast (again if the respective operation is complex).
Important: the () operator will not default 'default' to None as dict.get does. If no default is specified, then () will raise an error if the respective value was not provided. Therefore, config(key) behaves like config[key].
Accessing children directly with member notation
self.activation = config.network("activation", "relu", str, "Activation function for the network")
Accessing via the child node
network = config.network
self.depth = network('depth', 10000, int, "Depth for the network")
We can impose simple restrictions
self.width = network('width', 100, Int>3, "Width for the network")
Restrictions on both sides of a scalar:
self.percentage = network('percentage', 0.5, ( Float >= 0. ) & ( Float <= 1.), "A percentage")
Enforce being a member of a list
self.ntype = network('ntype', 'fastforward', ['fastforward','recurrent','lstm'], "Type of network")
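Expressions such as Int>3 or ( Float >= 0. ) & ( Float <= 1.) can be built with ordinary operator overloading. The sketch below shows one possible way to do this; it is an assumption about the general technique, not the package's code. Condition and TypeSpec are hypothetical classes, and the Int and Float objects defined here are local stand-ins rather than imports from cdxbasics.

```python
class Condition:
    """Hypothetical validator: cast a value, then test it."""

    def __init__(self, cast, test, text):
        self.cast = cast    # e.g. int or float
        self.test = test    # predicate applied after casting
        self.text = text    # human-readable description

    def __call__(self, value):
        value = self.cast(value)
        if not self.test(value):
            raise ValueError(f"{value!r} violates '{self.text}'")
        return value

    def __and__(self, other):
        # combine two conditions; both must hold
        return Condition(self.cast,
                         lambda v: self.test(v) and other.test(v),
                         f"{self.text} and {other.text}")


class TypeSpec:
    """Hypothetical type marker whose comparisons build Condition objects."""

    def __init__(self, cast, name):
        self.cast = cast
        self.name = name

    def __gt__(self, bound):
        return Condition(self.cast, lambda v: v > bound, f"{self.name}>{bound}")

    def __ge__(self, bound):
        return Condition(self.cast, lambda v: v >= bound, f"{self.name}>={bound}")

    def __le__(self, bound):
        return Condition(self.cast, lambda v: v <= bound, f"{self.name}<={bound}")


# local stand-ins for illustration, not imported from cdxbasics
Int = TypeSpec(int, "int")
Float = TypeSpec(float, "float")

width_check = Int > 3
percentage_check = (Float >= 0.) & (Float <= 1.)

print(width_check(100))          # passes and returns the cast value
print(percentage_check(0.5))
try:
    width_check(2)
except ValueError as e:
    print("rejected:", e)
```

Because each comparison returns a callable, the result can be passed wherever a plain cast such as int would be accepted.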
Do not forget to call done() once done with this config.
config.done() # checks that we have read all keywords.
It will alert you if there are keywords or children which haven't been read. Most likely, those will be typos. In our example above, width was misspelled in setting up the config, so you will get a warning to this end:
*** LogException: Error closing config 'config.network': the following config arguments were not read: ['widht']
Record of this object:
config.network['activation'] = relu # Activation function for the network; default: relu
config.network['depth'] = 10 # Depth for the network; default: 10000
config.network['width'] = 100 # Width for the network; default: 100
#
config['features'] = ['time', 'spot'] # Features for the agent; default: []
config['weights'] = [1 2 3] # Weigths for the agent; default: []
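The mechanism behind done() can be sketched in a few lines: remember which keys were read, and compare them with the keys that were set. The following is a simplified illustration under assumptions, not the Config implementation. MiniConfig is a hypothetical class without child configs or usage reports, and applying the cast to default values is an assumption.

```python
_MISSING = object()   # sentinel: distinguishes "no default" from default=None


class MiniConfig:
    """Hypothetical sketch of read-tracking configuration."""

    def __init__(self, **values):
        self._values = dict(values)
        self._read = set()

    def __call__(self, key, default=_MISSING, cast=None, help=""):
        if key in self._values:
            value = self._values[key]
        elif default is _MISSING:
            # no default given: behave like config[key]
            raise KeyError(key)
        else:
            value = default
        if cast is not None and value is not None:
            value = cast(value)
        self._read.add(key)
        return value

    def done(self):
        # any key that was set but never read is most likely a typo
        unread = sorted(set(self._values) - self._read)
        if unread:
            raise RuntimeError(f"config arguments were not read: {unread}")


cfg = MiniConfig(depth=10, widht=100)              # 'widht' is misspelled
depth = cfg("depth", 10000, int, "Depth for the network")
width = cfg("width", 100, int, "Width for the network")   # falls back to default

try:
    cfg.done()
    message = ""
except RuntimeError as e:
    message = str(e)
print(message)
```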
Detaching child configs
You can also detach a child config, which allows you to store it for later
use without triggering done() errors for its parent.
def read_config( self, config ):
    ...
    self.config_training = config.training.detach()
    config.done()
detach() will mark the original child as 'done'. Therefore, we need to call done() again once we have finished processing the detached child:
def training(self):
    epochs = self.config_training("epochs", 100, int, "Epochs for training")
    batch_size = self.config_training("batch_size", None, help="Batch size. Use None for default of 32" )
    self.config_training.done()
Use copy() to make a bona fide copy of a child, without marking the source child as 'done'.
Self-recording all available configs
Once your program has run, you can read the summary of all values, their defaults, and their help texts.
print( config.usage_report( with_cast=True ) )
Prints:
config.network['activation'] = relu # (str) Activation function for the network; default: relu
config.network['depth'] = 10 # (int) Depth for the network; default: 10000
config.network['width'] = 100 # (int>3) Width for the network; default: 100
config.network['percentage'] = 0.5 # (float>=0. and float<=1.) A percentage; default: 0.5
config.network['ntype'] = 'fastforward' # (['fastforward','recurrent','lstm']) Type of network; default: 'fastforward'
config.training['batch_size'] = None # () Batch size. Use None for default of 32; default: None
config.training['epochs'] = 100 # (int) Epochs for training; default: 100
config['features'] = ['time', 'spot'] # (list) Features for the agent; default: []
config['weights'] = [1 2 3] # (asarray) Weigths for the agent; default: no initial weights
Calling functions with named parameters:
def create_network( depth=20, activation="relu", width=4 ):
    ...
We may use
create_network( **config.network )
However, there is no magic - this function will mark all direct members (not children) as 'done' and will not record the default values of the function create_network. Therefore usage_report will be somewhat useless. This method will still catch unused variables as "unexpected keyword arguments".
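The reason ** unpacking marks direct members as 'done' but records no defaults can be seen from Python's unpacking protocol: ** only asks the object for keys() and then reads each key with []. The sketch below demonstrates this with a hypothetical TrackedMapping class; it is not the Config class.

```python
class TrackedMapping:
    """Hypothetical minimal object usable with ** that records reads."""

    def __init__(self, **values):
        self._values = dict(values)
        self.read = set()

    def keys(self):
        # ** unpacking calls keys() first ...
        return self._values.keys()

    def __getitem__(self, key):
        # ... and then reads every key with [], which we record here
        self.read.add(key)
        return self._values[key]


def create_network(depth=20, activation="relu", width=4):
    return depth, activation, width


network = TrackedMapping(depth=10, activation="relu")
result = create_network(**network)
print(result, sorted(network.read))

# A misspelled key is still caught, but by Python itself
try:
    create_network(**TrackedMapping(widht=100))
    error = ""
except TypeError as e:
    error = str(e)
print(error)
```

The function's own defaults (here width=4) never pass through the mapping, which is why they cannot appear in a usage report.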
Advanced **kwargs Handling
The Config class can be used to improve kwargs handling. Assume we have
def f(**kwargs):
    a = kwargs.get("difficult_name", 10)
    b = kwargs.get("b", 20)
We run the usual risk of somebody misspelling the parameter name, which we would never notice. Therefore we may improve upon the above with
def f(**kwargs):
    kwargs = Config(kwargs)
    a = kwargs("difficult_name", 10)
    b = kwargs("b", 20)
    kwargs.done()
If a user now calls f with a misspelled parameter, for example f(difficlt_name=5), an error will be raised.
Another pattern is to allow both config and kwargs:
def f( config=Config(), **kwargs):
    kwargs = config.detach().update(kwargs)
    a = kwargs("difficult_name", 10)
    b = kwargs("b", 20)
    kwargs.done()
logger
Tools for defensive programming à la the C++ ASSERT/VERIFY macros. The aim is to provide one-line validation of inputs to functions with intelligible error messages:
from cdxbasics.logger import Logger
_log = Logger(__file__)
...
def some_function( a, ...):
    _log.verify( a==1, "'a' is not one but %s", a)
    _log.warn_if( a!=1, "'a' was not one but %s", a)
Functions available, mostly self-explanatory:
Exceptions independent of logging level
verify( cond, text, *args, **kwargs )
If cond is not met, raise an exception with util.fmt( text, *args, **kwargs ). This is the Python version of C++ VERIFY
throw_if(cond, text, *args, **kwargs )
If cond is met, raise an exception with util.fmt( text, *args, **kwargs )
throw( text, *args, **kwargs )
Just throw an exception with util.fmt( text, *args, **kwargs )
Unconditional logging
debug( text, *args, **kwargs )
info( text, *args, **kwargs )
warning( text, *args, **kwargs )
error( text, *args, **kwargs )
critical( text, *args, **kwargs )
throw( text, *args, **kwargs )
Verify-conditional functions
# raise an exception if 'cond' is not True
verify( cond, text, *args, **kwargs )
# print log message of respective level if 'cond' is not True
verify_debug( cond, text, *args, **kwargs )
verify_info( cond, text, *args, **kwargs )
verify_warning( cond, text, *args, **kwargs )
If-conditional functions
# raise an exception if 'cond' is True
throw_if( cond, text, *args, **kwargs )
# write log message if 'cond' is True
debug_if( cond, text, *args, **kwargs )
info_if( cond, text, *args, **kwargs )
warning_if( cond, text, *args, **kwargs )
# print message if 'cond' is True
prnt_if( cond, text, *args, **kwargs ) # with EOL
write_if( cond, text, *args, **kwargs ) # without EOL
verbose
Utility class for printing 'verbose' information, with indentation.
from cdxbasics.verbose import Context, quiet
def f_sub( num=10, context = quiet ):
    context.report(0, "Entering loop")
    for i in range(num):
        context.report(1, "Number %ld", i)

def f_main( context = quiet ):
    context.write( "First step" )
    # ... do something
    context.report( 1, "Intermediate step 1" )
    context.report( 1, "Intermediate step 2\nwith newlines" )
    # ... do something
    f_sub( context=context(1) )
    # ... do something
    context.write( "Final step" )

print("Verbose=1")
context = Context(verbose=1)
f_main(context)
print("\nVerbose=2")
context = Context(verbose=2)
f_main(context)
print("\nVerbose='all'")
context = Context(verbose='all')
f_main(context)
print("\nVerbose='quiet'")
context = Context(verbose='quiet')
f_main(context)
Returns
Verbose=1
01: First step
01: Final step
Verbose=2
01: First step
02: Intermediate step 1
02: Intermediate step 2
02: with newlines
02: Entering loop
01: Final step
Verbose='all'
01: First step
02: Intermediate step 1
02: Intermediate step 2
02: with newlines
02: Entering loop
03: Number 0
03: Number 1
03: Number 2
03: Number 3
03: Number 4
03: Number 5
03: Number 6
03: Number 7
03: Number 8
03: Number 9
01: Final step
Verbose='quiet'
The purpose of defaulting the context argument of functions to quiet is that they can be used across different contexts without printing anything by default.
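The level logic visible in the output above can be sketched as follows. This is an illustration consistent with that output, not the package's Context class: MiniContext is a hypothetical name, verbose=None plays the role of 'all', verbose=0 plays the role of 'quiet', and the two-space indentation per level is an assumption.

```python
class MiniContext:
    """Hypothetical sketch of level-based verbose reporting."""

    def __init__(self, verbose=0, level=0, out=print):
        self.verbose = verbose   # number of levels printed; None prints all
        self.level = level       # current nesting depth
        self.out = out           # output function, 'print' by default

    def report(self, sub_level, text, *args):
        depth = self.level + sub_level
        if self.verbose is not None and depth >= self.verbose:
            return               # too deep for the requested verbosity
        message = text % args if args else text
        for line in message.split("\n"):
            self.out(f"{depth + 1:02d}: {'  ' * depth}{line}")

    def write(self, text, *args):
        # report at the current level
        self.report(0, text, *args)

    def __call__(self, sub_level):
        # context for a sub-routine, nested 'sub_level' deeper
        return MiniContext(self.verbose, self.level + sub_level, self.out)


lines = []
ctx = MiniContext(verbose=2, out=lines.append)
ctx.write("First step")
ctx.report(1, "Intermediate step")
ctx(1).report(1, "Number %d", 0)      # depth 2: suppressed at verbose=2
print("\n".join(lines))
```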
util
Some basic utilities to make life easier.
- fmt(): C++ style format function.
- uniqueHash(): runs a standard hash over most combinations of standard elements or objects.
- plain(): converts most combinations of standard elements or objects into plain list/dict structures.
- isAtomic(): whether something is string, float, int, bool or date.
- isFloat(): whether something is a float, including a numpy float.
- isFunction(): whether something is some function.
- bind(): simple shortcut to bind function parameters, e.g.
def f(a, b, c):
    pass
f_a = bind(f, a=1)
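For comparison, two of the ideas in this list have close standard-library analogues. The sketch below uses functools.partial for parameter binding and a JSON-based digest for hashing plain structures. Whether bind() and uniqueHash() behave the same way is an assumption; unique_hash is a hypothetical helper written for this example.

```python
import hashlib
import json
from functools import partial


def f(a, b, c):
    return a, b, c


# fix 'a'; 'b' and 'c' remain open (standard-library analogue of binding)
f_a = partial(f, a=1)
result = f_a(b=2, c=3)
print(result)


def unique_hash(obj):
    """Hypothetical helper: order-independent digest of plain list/dict data."""
    text = json.dumps(obj, sort_keys=True, default=str)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


same = unique_hash({"a": 1, "b": [1, 2]}) == unique_hash({"b": [1, 2], "a": 1})
print(same)
```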
subdir
A few tools to handle file i/o in a transparent way in the new subdir module. For the time being this is experimental. Please share any bugs with the author in case you do end up using them.