
Command liner with JSON based input file

CLISe: Command Line Input Setter

Command liner for Python routines using JSON-based input parameter files. It dives into a module from the CLI and executes a routine, just as herons dive into water and catch fish. It also uses JSON text to create task lists that automate multiple executions of the routine. Version 0.2.3 by Jaesub Hong (jhong@cfa.harvard.edu)

   Usage: clise JSON_input_file1 -options_for_file1 ... \
        [json_input_file2 -options_for_file2 ...] \
        [--with common_json_files -common_options ...] \
        [--WITH common_json_files -common_options ...]

   clise --help [keys]
   clise [json_files ...] --Help 
   clise --main module.routine --Help 

Quick overview of the basic concept

There are many command line interface tools for Python (e.g., pyCLI, clize, etc.). They usually provide decorators and other useful functions to deal with arguments, as well as routines to generate executable scripts for user routines. While these are great for developing a large code base or project, they can be a bit redundant for a quick test of a routine in code you cannot modify. In that case, you may have to write your own wrapper to get around it. CLISe takes a different approach, perhaps suitable for small or moderate-size projects.

CLISe enables the execution of any routine as a command script using only command line options and input files (in an extended JSON format). CLISe dynamically makes a decorator for the called routine, handling the input and output parameters of the routine, so the user doesn't need to modify the original routine. One of the input parameters required by CLISe is the module and routine name. So when the input parameters are stored in a file (recommended), the user doesn't need to remember the routine name in future runs. For more complex operations, CLISe allows sequential calls of multiple routines, and provides a simple mechanism to generate multiple calls of multiple routines.

The main objective of CLISe is an efficient separation and maintenance of the essential input parameters of a task or tasks from the programming part for the task(s). While CLISe enables a few simple variables and math operations in setting input parameters for convenience, these features are (subjectively) limited so as to leave out the programming part: i.e., "rabbit.py" provides the routines that enable automatic task repetition with variations in input parameters, but it doesn't provide, for instance, a range-based for-loop functionality, which belongs to the programming.

The input parameter files contain the name of the routine to call: e.g., "-main": "module.routine". Assume that a python script example.py has

  def my_sum(name, x, y):
       """ This is my sum. """
       print(name+':', x+y)

Then with a JSON file input.json,

  "-main": "example.my_sum",
      "x": 5,
      "y": 7,
   "name": "answer",

One can execute the routine 'my_sum' in a shell command prompt like

  % clise input.json
  answer: 12
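As a rough mental model of the example above, the call can be thought of as a dictionary lookup followed by a dynamic import. The sketch below is not CLISe's implementation: `run_block` is a hypothetical helper, and the standard-library routine `fnmatch.fnmatchcase` stands in for `example.my_sum` so that the snippet runs without extra files. It only handles a plain "module.routine" name.

```python
import importlib
import json


def run_block(block):
    """Call the routine named by "-main", feeding keys that start with a letter."""
    # Split "module.routine" at the last dot (assumes a module-level routine).
    module_name, _, routine_name = block["-main"].rpartition(".")
    routine = getattr(importlib.import_module(module_name), routine_name)
    # Keys such as "-main" are settings for the runner, not for the routine.
    kwargs = {key: value for key, value in block.items() if key[:1].isalpha()}
    return routine(**kwargs)


# A block written the same way as input.json, but for a stdlib routine.
block = json.loads(
    '{"-main": "fnmatch.fnmatchcase", "name": "input.json", "pat": "*.json"}'
)
result = run_block(block)
print(result)  # prints True
```

Keeping the routine name inside the parameter block is what lets a single file describe the whole call.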

Keys starting with a letter (x, y, and name in the above example) are assumed to be fed into the main routine set by the "-main" key. In principle, all the contents of the JSON files can be fed as a long string in the command line or as optional parameters for individual keys with "-". So the above example is equivalent to the following even without the JSON file input.json.

  %  clise --main example.my_sum -#x 5 -#y 7 -name "answer"
  %  clise '{"-main":"example.my_sum","x":5,"y":7,"name":"answer"}'

or some combination of all three examples:

  %  clise '{"-main":"example.my_sum","name":"answer"}' -#x 5 -#y 7
  %  clise input.json '{"name":"answer"}' -#x 15 -#y 27

When both JSON files and command line input options are available for the same key, the command line options take priority. In the last example, 'input.json' has x of 5, which is replaced by the command line option x=15. Note that # in -#x ensures the value is a number and not a string. See more details with 'clise --help cli'. Note that --Help (capital H) prints out the doc string of the routine.

  % clise input.json --Help
  This is my sum.
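The priority rule and the "#" prefix described above can be illustrated with a small sketch. This is an assumption-laden simplification, not CLISe's parser: `parse_cli_pairs` is a hypothetical helper that only understands "-key value" pairs.

```python
def parse_cli_pairs(tokens, numeric_prefix="#"):
    """Turn ["-#x", "15", "-name", "answer"] into {"x": 15, "name": "answer"}."""
    overrides = {}
    for flag, value in zip(tokens[0::2], tokens[1::2]):
        key = flag[1:]  # drop the leading "-"
        if key.startswith(numeric_prefix):
            key = key[len(numeric_prefix):]
            # The prefix marks the value as a number rather than a string.
            try:
                value = int(value)
            except ValueError:
                value = float(value)
        overrides[key] = value
    return overrides


file_params = {"-main": "example.my_sum", "x": 5, "y": 7, "name": "answer"}
cli_params = parse_cli_pairs(["-#x", "15", "-#y", "27"])

# Later entries win, so command line values replace the values from the file.
merged = {**file_params, **cli_params}
print(merged["x"] + merged["y"])  # prints 42
```

Without the prefix, the value stays a string: `parse_cli_pairs(["-x", "15"])` gives `{"x": "15"}`.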

Calling multiple JSON files executes them in sequence.

  % clise input.json input.json
  answer: 12
  answer: 12

  % clise input.json -#x 7 input.json -#x 6
  answer: 14
  answer: 13

As you may have guessed by now, in the command line, "--" is reserved for options for clise itself, and "-" is reserved for options and keys of the user routine. In JSON files, the parameters lose one "-", so "-var" is for clise, and "var" without "-" is for the user routine.

Find out what kind of parameters are needed to call the routine using --show func option.

  % clise --main os.path.isfile --show func
   main: os.path.isfile
   path

The above example shows that isfile expects a parameter called path.

  % clise --main os.path.isfile -path clise.py --show output
  True

One can check how the parameters get fed to the routine.

  % clise --main os.path.isfile -path clise.py --show feed
   main: os.path.isfile
   path << str .py

  % clise input.json --show feed
   main: example.my_sum
   name << str answer
      x << int 5
      y << int 7
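For a routine written in Python, the kind of information shown by --show func and --show feed can be recovered with the standard `inspect` module. The sketch below is only an illustration of that idea and makes no claim about how CLISe does it; `parameter_names` and `feed_lines` are hypothetical helpers.

```python
import inspect


def my_sum(name, x, y):
    """ This is my sum. """
    print(name + ':', x + y)


def parameter_names(routine):
    """List the parameter names in the order the routine declares them."""
    return list(inspect.signature(routine).parameters)


def feed_lines(routine, params):
    """Describe which value, and which type, would go to each parameter."""
    lines = []
    for par in parameter_names(routine):
        if par in params:
            value = params[par]
            lines.append(f"{par} << {type(value).__name__} {value}")
    return lines


names = parameter_names(my_sum)
lines = feed_lines(my_sum, {"x": 5, "y": 7, "name": "answer"})
print(names)
print("\n".join(lines))
```

Signature inspection is not available for every callable, which is one situation where naming the parameters by hand is useful.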

One can call a routine needing no input parameters.

  % clise --main datetime.datetime.now --show output
  2022-04-27 22:11:52.983532

One can force the parameters to a function with --pars option.

  % clise --main math.sin --pars x --show output -#x 1.0
  0.8414709848078965

In the case of the built-in functions: e.g.,

  % clise --main eval --pars x --show output -x 3+3
  6

  % clise --main pow -*-pars x,y --show output -#x 1.5 -#y 3
  3.375

  % clise --main eval --pars x --show output -x 'pow(1.5,3)'
  3.375

Parameter list:

For command line parameters, add the additional prefix "-": i.e., "-main" in a JSON file is equivalent to --main as a command line option. Parameters starting with a letter can potentially be fed into the routine called.

For clise (heron.py)

  -main       str   None  the main module and routine to call
  -load       str*  None  load other JSON file: same as --with option in the
                          command line
  -loop       dict  None  routine for looping to define the task list
  -id         str   auto/manual   short term ID for the run
                          auto: routine name
                          manual: set by a user for loop. See multiply_manual_by_id
  -nid        int   auto  number ID for the run
  -fid        str   auto  full ID for the run
  -block      str   None  indicates the routine being for task generation
                          ''                : regular routine, 
                          rabbit            : task generator
                          rabbit:release    : task generator, and the next block is the
                                              starting point of the searched
                          rabbit:capture    : the last block for the seed

  -include    str   None  only choose input files with matching string: regex
  -exclude    str   None  do not choose input files with matching string: regex
  -after      str   None  run only for outfile (default) modified after this time
  -before     str   None  run only for outfile (default) modified before this time
  -clobber    bool  false overwrite the existing results

  -hisfile    str   none  copy of input JSON file and command line option,
                          automatically generated if set to be "_auto_".
  -logfile    str   none  log file
  -save_cli_ony_run bool  false save log for cmdline tools

  -roudnup    str   _heron_ input keyword name to pass all the parameters to a user function

  -global     str*  none  global variables
  -pars       str*  none  input parameters of the function called; must set
                          this for invisible non-keyword parameters: e.g.,
                          decorated functions
  -kwpars     str*  none  input keyword parameters of the function called;
                          set this for invisible keywords or simply set
                          -collect true
  -collect    bool  false set this true to feed all the unassigned
                          parameters starting with alphabets as keyword
                          parameters when the routine's keyword parameters
                          are not visible. Use --show feed to check how the
                          parameters are passed.
  -discard    str*  none  discard these parameters before calling the routine
  -nonify     str*  none  set these parameters None and feed to the routine

  -init       dict  none  define parameters needed to initialize the class
  -object     str   none  object name to define and reuse when multiple objects under the same
                          class name are required. Each class will load an object and the routines
                          under the same class share the object unless the object name is given under
                          the -object parameter.

  -return     list  None  name of outputs for a routine, which can be used
                          for the input for the later routines
                          e.g., "-return": ["x", "y"]
                          each variable will be OrderedDict with key being
                          the routine's "-id".

  -inherit    dict  None  pairs of receiving and input parameters for routines that
                          inherit outputs of earlier routines. The input parameters
                          can be an expression, evaluated by 'eval'.
                          e.g., "-inherit" : {
                                      "input1" : "x", 
                                      "input2": "x*y",
                                      "scalar": ["x","y"],
                                }
                          This will grab the first element of output variables "x" and "y",
                          and feed the "x" value to "input1", and the "x*y" value to "input2".

  -enforce_numeric str #  a prefix to specify cmdline par being a number
                          instead of string; "" to disable it
  -enforce_floats  str ## a prefix to specify cmdline par being a float
                          instead of string; "" to disable it
  -enforce_format  dict   a full dict for data format enforcer, int,
                          float, complex, bool, str
  -expand_nested   str ,  a separator to indicate the nested key or
                          parameter; "" to disable it

  -dict_pre   str   **    a prefix to indicate a dict in string variable; ""
                          to disable it
  -list_pre   str   *     a prefix to indicate a list in string variable; ""
                          to disable it
  -list_sep   str   ,     a separator to indicate a list in string variable;
                          "" to disable it
  -pop_modifier bool true remove the modifier keys like -enforce_numeric afterwards

  -onlyfor module.routine dict none settings that only apply to a
                          particular module.routine

  -apply to key (op)      assign parameters in a "-main"less block to other blocks,
                          use to define parameters common for multiple tasks.
                          "key" can be "-id", "-main", or others. Available operator arguments:

                          (None), except, contains, without

                          e.g., "-apply to -id" : ["Task1", "Task2"]

                          will apply the parameters in this block to blocks that have
                          the "-id" values of "Task1" or "Task2". A short form can be used

                          "-apply to key"          is equivalent to ">> key"  or ">> key =="
                          "-apply to key except"   is equivalent to ">> key !="
                          "-apply to key contains" is equivalent to ">> key ~="
                          "-apply to key without"  is equivalent to ">> key ~!="

                          "==" and "!=" operators can take either string or string list.
                          "~=" and "~!=" operators can take a regex string.

  -apply until -block     To indicate the end of the seed for task generator block.
                          The matching end block should have the same value for the "-block" key.

                          "-apply until -block"   is equivalent to ">> -block"
                          
  -skip key (op)          assign blocks to skip: the same mechanics as "-apply to ..."

                          "-skip -id"  is equivalent to "<< -id"

  -verbose    int  0      chatter level for clise (or heron.py)

  -accept
  -reject                 set the accept and reject list for CLI to a block
                          regex string list

In rabbit.py, for multiply_by_id(_multi): obsolete under the new task management

  -id         str*        Setting "-id" to a string list will make the block into
                          task generator, calling "rabbit.multiply_by_id".
                          This routine can be called explicitly, and the
                          routine and its parameters can be fed as the
                          "seed" variable. This routine generates the repeat listing
                          for a single routine. The parameter change can be implemented
                          by conditional dict: e.g.,

                                "-id" : ["first", "second"],
                                "-id==first" : { "x":3, "y":4 },
                                "-id==second": { "x":5, "y":6 },

                          "==","!=","~=","~!=" are available for conditioning.

                          To repeat the multiple routines in a set, use
                          rabbit.multiply_by_id_multi
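The expansion described for a list-valued "-id" can be pictured with the following sketch. It is a simplified illustration under stated assumptions, not the code in rabbit.py: `expand_by_id` is a hypothetical helper, and only the "==" condition is handled.

```python
def expand_by_id(block):
    """Make one task per entry of "-id", applying the matching "-id==..." dicts."""
    # Everything except "-id" and the "-id==..." conditions is shared by all tasks.
    base = {key: value for key, value in block.items() if not key.startswith("-id")}
    tasks = []
    for run_id in block["-id"]:
        task = dict(base)
        task["-id"] = run_id
        for key, extra in block.items():
            if key.startswith("-id==") and key[len("-id=="):] == run_id:
                task.update(extra)
        tasks.append(task)
    return tasks


block = {
    "-main": "example.my_sum",
    "name": "answer",
    "-id": ["first", "second"],
    "-id==first": {"x": 3, "y": 4},
    "-id==second": {"x": 5, "y": 6},
}
tasks = expand_by_id(block)
for task in tasks:
    print(task["-id"], task["x"], task["y"])
```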


  For multiply_by_file(_multi):
                          To repeat the multiple routines in a set, use
                          rabbit.multiply_by_file_multi

  infile      str*  None  regex of input files, any regex should be in (),
                          which can be tagged as {1}, {2},...
  outfile     str   None  output file, one can use tags in input file

  indir       str   None  input directory root, needed for recursive search,
                          otherwise optional
  outdir      str   None  output directory root, needed for recursive output
                          matching input, otherwise optional
  outsubdir   str   None  when an additional subdirectory name is needed to
                          be added
  mkoutdir    list  None  make output dir for these output files if not exists

  sort        str   None  Sort by name, basename, modtime, or size. The
                          default changes to modtime if -after/-before is
                          set, or to size if -larger/-smaller is set.

  recursive   bool  true  recursive input file search, requires indir
  mirror      bool  true  when input files are searched recursively, does
                          output follow the same directory structure?
                          requires outdir
  swapsub     dict  None  when mirroring indir, if certain subdirectory
                          names needed to be changed

  tagkeys     str*  None  list of file parameters to grab tags
  appkeys     str*  None  list of parameters to apply tags

  include     str   None  only include cases with inkey having this phrase
                          this is different from -include key for clise (or heron.py)
  exclude     str   None  exclude cases with inkey having this phrase
                          this is different from -exclude key for clise (or heron.py)

  sortby      str         sorting key for the files and the method: mtime,
                          size, name, fullname: e.g., -sortby infile:mtime
                          will sort the task list by modified time of infile

  after       str   None this automatically set the sortby key to infile:mtime, 
                         in the ascending order
  before      str   None this automatically set the sortby key to infile:mtime, 
                         in the descending order
  larger      str   None this automatically set the sortby key to outfile:size, 
                         in the ascending order
  smaller     str   None this automatically set the sortby key to outfile:size,
                         in the descending order

  inkey      str infile  the main input file key name
  outkey     str outfile the main output file key name
  checkin    str* infile the input files to check if exists before run
  checkout   str* infile the output files to check if exists before run;
                         the "-clobber" setting in heron.py
                         decides whether to skip the run or not

  rel2indir   str*  None list of file variables whose path is set relative
                         to the input file
  rel2outdir  str*  None list of file variables whose path is set relative
                         to the output file
  rel2nodir   str*  None list of file variables whose path is not set relative
                         to either the input or the output file

  verbose     int   0     chatter level for multiply_by_file(_multi)

Logging

The automatic logging of each run and a copy of the input parameters are enabled through -logfile and -hisfile = "auto". The log file contains the SHA1 hash of the input parameters, so that one can tell whether the same run has already been executed.

22-04-28 11:27:58 0:00:00.000100
SHA1:81e37f0fee49f9729aaebe684e9853544664972b example.show_new_name meco
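The idea of recognizing a repeated run from a hash of its input parameters can be sketched with the standard `hashlib` module. How CLISe serializes the parameters before hashing is not stated here, so the canonical JSON form below is an assumption, and `params_digest` is a hypothetical helper.

```python
import hashlib
import json


def params_digest(params):
    """Return a SHA1 hex digest that does not depend on key order."""
    # Assumption: sort the keys so that equal parameter sets hash equally.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


first = params_digest({"-main": "example.my_sum", "x": 5, "y": 7})
same = params_digest({"y": 7, "x": 5, "-main": "example.my_sum"})
changed = params_digest({"-main": "example.my_sum", "x": 6, "y": 7})

print(first == same)     # prints True
print(first == changed)  # prints False
```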

When some of the routines are used frequently as a command line tool instead of tasks, the logging and copying can be disabled. In general, key --^parse sets the auto log off, which can be set by key -save_cmdline_log.

More to explain

  • Regex for infile and other files

    "infile==..." and "-infile.key==..." conditioning 
    
    "infile" : ["file1","file2","file1",...]
    
    how to refer the same file more than once in repetition of the same task(s)
    with different parameters: use "-infile.key..." option
    
  • Priority order in parameter value assignment (the latter supersedes the former):

    --with < STARTUP < --load inside json_files 
          < input_json_files < input cli pars 
          < pars set by task generators < conditional_input_parameters
          < --WITH
    
          difference between using cmdline.json5 vs cmdline_noswap.json5
          when 
                alias search1='clise --^parse cmdline.json5 -infile'
                alias search2='clise --^parse cmdline_noswap.json5 -infile'
    
                A) search1 '(.*).py' -after somefile
                B) search2 '(.*).py' -after somefile
                C) search1 '(.*).py' --with -after somefile
                D) search2 '(.*).py' --WITH -after somefile
    
                A = C = D, but in B, the -after option is ignored.
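The priority chain listed above amounts to merging layers of parameters from the lowest priority to the highest. The sketch below shows that merge while keeping track of which layer supplied each final value. It is an illustration only: `resolve` is a hypothetical helper, the layer contents are made up, and the swap behavior of cmdline.json5 is not modeled.

```python
def resolve(layers):
    """Merge (source, params) layers in order; later layers replace earlier ones."""
    resolved = {}
    for source, params in layers:
        for key, value in params.items():
            resolved[key] = (value, source)
    return resolved


# Layers listed from the lowest to the highest priority.
layers = [
    ("--with", {"x": 1, "y": 1, "name": "default"}),
    ("input_json_files", {"x": 5}),
    ("input cli pars", {"x": 15}),
    ("--WITH", {"name": "forced"}),
]
resolved = resolve(layers)
for key, (value, source) in resolved.items():
    print(f"{key} = {value!r} (from {source})")
```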
    
  • How to define multiple calls in serial using JSON

    in a single file (file.json):
          {
                "-main" : "module1.routine1",
                ...
          },
          {
                "-main" : "module2.routine2",
                ...
          },
          ....
          {
                "-main" : "moduleN.routineN",
                ...
          },
    
    % clise file.json
    
    Note a single task doesn't require the outer {}.
    
    When having multiple files
    
    % clise file1.json file2.json ... fileN.json
    
    each file can have multiple calls.
    JSON file without "-main" is considered as additional options
    for other calls.
    
  • Common blocks

          {
                // pars in a block without "-main" will be fed to other blocks
                "common_par1" : "val1",
                "common_par2" : "val2",
                ...

                "-apply to -main" : ["moduleB.routineB"],
                // the parameters in this block applied to blocks with -main==moduleB.routineB
          },
          {
                "-main" : "moduleB.routineB", //  
                ...
    
          },
          {
                "-main" : "moduleB.routineB", //  
                ...
    
          },
          {
                "-main" : "moduleC.routineC", //  
                ...
    
          },
          .....
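The effect of a common block can be pictured as copying its parameters into every task block whose "-main" matches. The sketch below is a simplified illustration, not CLISe's code: `apply_common` is a hypothetical helper, it handles only "-apply to -main", and letting a task's own value win over the common value is an assumption.

```python
def apply_common(blocks):
    """Feed parameters of "-main"less blocks into the matching task blocks."""
    tasks = [dict(block) for block in blocks if "-main" in block]
    for common in (block for block in blocks if "-main" not in block):
        targets = common.get("-apply to -main")
        if isinstance(targets, str):
            targets = [targets]
        # Only keys without a leading "-" are parameters for the routines.
        shared = {k: v for k, v in common.items() if not k.startswith("-")}
        for task in tasks:
            if targets is None or task["-main"] in targets:
                for key, value in shared.items():
                    task.setdefault(key, value)  # assumption: own values win
    return tasks


blocks = [
    {"common_par1": "val1", "-apply to -main": ["moduleB.routineB"]},
    {"-main": "moduleB.routineB", "x": 1},
    {"-main": "moduleB.routineB", "common_par1": "own"},
    {"-main": "moduleC.routineC", "x": 3},
]
tasks = apply_common(blocks)
for task in tasks:
    print(task)
```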
    
  • Task generator features in rabbit.py: esp. for repeating a set of multiple calls

          {
                "-main" : "moduleA.taskgen", // example "rabbit.multiply_by_id_multi"
                ...

                "-apply until -block" : "end of taskgen",
                // this indicates this block is a task gen block
                // and which block is the end for the seed
          },
          {
                "-main" : "module1.routine1", // the first routine to multiply
                ...
          },
          {
                "-main" : "module2.routine2", // the 2nd routine to multiply
                ...
          },
          ....
          {
                "-main" : "moduleN.routineN", // the Nth routine to multiply
                ...
          },
          {
                // pars in a block without "-main" will be fed to all the routines above
                "common_par1" : "val1", 
                "common_par2" : "val2", 
                ...
    
                "-block" : "end of taskgen", // this indicates the ending block for task gen
          },
    

    The whole task gen block should be in a single file.

    how to write custom generator: 
    
          - basically return a list of OrderedDict array,
            where each contains all the parameters for a routine call.
    
          - The generator will receive the list of the OrderedDict from
            routine1 to routineN as an input parameter "seed".
    
          - One can redefine the name of the par "seed" with "-seedkey". 
            (not implemented yet)
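Following the description above, a custom generator can be sketched as a function that takes the "seed" list and returns a longer list of parameter sets. Only the "seed" parameter comes from the text; the function name `repeat_for_values` and its `key` and `values` parameters are hypothetical.

```python
import copy
from collections import OrderedDict


def repeat_for_values(seed, key, values):
    """Repeat every block of the seed once per value, setting key to that value."""
    tasks = []
    for value in values:
        for block in seed:
            task = copy.deepcopy(block)  # leave the seed itself untouched
            task[key] = value
            tasks.append(task)
    return tasks


seed = [
    OrderedDict([("-main", "module1.routine1"), ("x", 1)]),
    OrderedDict([("-main", "module2.routine2"), ("x", 1)]),
]
tasks = repeat_for_values(seed, "x", [10, 20])
for task in tasks:
    print(task["-main"], task["x"])
```

Each returned element holds all the parameters for one routine call, so the set of routines is repeated in order for every value.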
    
    4 task gen routines are provided in rabbit.py
          "multiply_by_id"
          "multiply_by_id_multi"
          "multiply_by_file"
          "multiply_by_file_multi"
    
          in the latter two, file wild card vs regex
                * <=> (.*)
    
          *.* in the shell prompt (e.g., % ls *.*) is equivalent to
    
                      "infile": "(.*).(.*)"
    
                the first and second (.*) can be referred to as {1}, {2}
                in other parameters
    
                      "outfile" : "{1}.png"
    
                "tagkeys" enables phrase grab from other keys
                      requires more explanation...
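The tag mechanism for file names can be illustrated with the standard `re` module: each parenthesized group of the input pattern becomes {1}, {2}, ... in the other parameters. The sketch below is not the code in rabbit.py; `fill_tags` is a hypothetical helper, and the dot in the pattern is escaped here so that it matches a literal dot.

```python
import re


def fill_tags(pattern, filename, template):
    """Match filename against pattern and fill {1}, {2}, ... in template."""
    match = re.fullmatch(pattern, filename)
    if match is None:
        return None
    # Position 0 is the whole match, so {1} is the first group, {2} the second.
    return template.format(match.group(0), *match.groups())


outfile = fill_tags(r"(.*)\.(.*)", "run1.dat", "{1}.png")
print(outfile)  # prints run1.png
```

A file name that does not match the pattern yields `None`, which corresponds to the file not being selected.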
    
  • JSON compatibility

    using hjson module: accepted extension:
    

    ext ='(json|json5|hjson|jsonc)'

    Normal key consisting of [a-zA-Z] doesn't require "".
          e.g.,
                x : 1,    <=>    "x": 1,
    
    For others and string values, it's recommended to use "":
          e.g., 
                "-main": "module.routine",
    
  • Block type

     -main       -apply ...          -block           type
    
     yes         no                  -                regular task block
     no          yes/no              -                common parameter block for regular tasks
    
     yes         yes w. end phrase   -                task generation block
     yes/no      no                  end phrase       the end of the task gen seed
     no          -                   "rabbit"         common parameter block for the task gen seed
    

Changes

v0.2.3 2023/10

  - clarified options for plotting the time data in plottool.py. e.g.,
    to read xdata as time with a user format (like 2023-10-01 05:06:07)
          -attr "xtime?%Y-%m-%d %H:%M:%S"
    to label the time ticks in a different format (like 23-10)
          -xformat "%y-%m"

  - enable setting tick parameters using -xticks_kw and -yticks_kw. e.g.,
    xticks_kw: { rotation: 45.0 } will rotate xticks' label by 45 degrees.
    The options will be executed as ax.tick_params(axis='x', **xticks_kw)

  - add ax2color and ay2color in plottool
  - change altx, alty in plottool to use_ax2, use_ay2 

v0.2.2 2023/09 - quick bug fix for: -mkoutdir [...]

v0.2.1 2023/09

  - recursively make output directories: -mkoutdir [...]
  - change the default file type for tabletool read to fits when filetype is not given

v0.2.0 2023/08 - enable late command line option driven parameter substitution by {<<}

v0.1.9 2023/08 - fix debug print out

v0.1.8 2023/08 - add convtool.py and fitstool.py

v0.1.7 2023/08 - add time axis on plottool

v0.1.6 2023/06

  - add wcs to dplot in plottool: e.g., -attr wcs range set by degrees
  - add time to plot1d in plottool: e.g., -attr xtime,ytime
  - add rebin.py and update plottool.py with rebin

v0.1.5 2023/03 - add read_mca in tabletool

v0.1.4 2023/03 - adhoc fix of a bug in loading clise's own modules

v0.1.3 2023/03 - add tabletool.py

v0.1.2 2023/03 - add xray_mirror.py

v0.1.1 2023/03 - Lower python requirement to 3.8 instead of 3.9

v0.1.0 2023/03

  - Initial version
  - Forked from cjpy
  - Implemented sequential executions of routines with JSON blocks {}
  - Implemented custom routine for iterative calls
