Automated Logical FRamework for Dynamic script execution (ALFRD)

ALFRD: Automated Logical FRamework for Dynamic script execution

This program was written for the SMILE project, supported by an ERC Starting Grant, with the following purposes in mind:

  • Communicate with Google Sheets and update progress periodically when required.
  • Run a pipeline based on the progress and requirements recorded in the spreadsheet.

Contents:

  1. Create API credentials on Google Cloud
  2. Installing
  3. Using ALFRD
  4. Attribution
  5. Acknowledgement

1. Create API credentials on Google Cloud

A similar guide is available at https://developers.google.com/workspace/guides/create-credentials

NOTE: Creating a Google Cloud console project usually requires billing details, but the Sheets API service itself is available free of charge.

  • Step 1: Go to https://console.cloud.google.com/

  • Step 2: Click the drop-down to create a new project. You can choose an organization or leave the default.

  • Step 3: Search in the top bar (or press /) and type "Google Sheets API".

  • Step 4: In the search results, under Marketplace, select the first result, which should be "Google Sheets API".

  • Step 5: Enable the service on the product details page.

  • Step 6: Select Create credentials > Application Data.

  • Step 7: Create an account name, then select Create and continue.

  • Step 8: Search for and select "Editor" under Role, then select Continue.

  • Step 9: Skip the next optional step and select Done.

  • Step 10: The credentials are now created. Select "Credentials" in the left menu.

  • Step 11: Select/edit the account that was just created, and copy the email address that is shown.

  • Step 12: Select the Keys tab > Add keys > Create New Keys > JSON, then save the file.

  • Step 13: Go to the Google spreadsheet and share it with the email address that was copied, as Editor.
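
Before moving on, it can help to confirm that the downloaded JSON file is a service-account key and to read back the email address the sheet must be shared with. The following is a minimal sketch using only the standard library. The helper name `service_account_email` and the file path are hypothetical, and the sketch assumes the key file carries the usual `type` and `client_email` fields of a Google service-account key.

```python
import json
from pathlib import Path


def service_account_email(key_path):
    """Return the client_email stored in a service-account JSON key file."""
    data = json.loads(Path(key_path).read_text(encoding="utf-8"))
    if data.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service-account key")
    return data["client_email"]


# Hypothetical location; replace with wherever the JSON key was saved.
key_file = Path("path/to/json/file")
if key_file.exists():
    print(service_account_email(key_file))
else:
    print(f"key file not found: {key_file}")
```

The printed address is the one to use when sharing the spreadsheet as Editor.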

2. Installing

  • Install using the pip package manager:
    pip install alfrd
    
  • Alternatively, download ALFRD and unzip it, or clone the repository, then install from the source folder:
    git clone https://github.com/avialxee/alfrd
    cd alfrd/
    pip install .
    

This should install alfrd and all of its dependencies automatically.
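
To check whether the installation succeeded, the installed version can be queried from Python. This is a small sketch based on the standard-library `importlib.metadata` module; the helper name `installed_version` is an illustrative choice and not part of ALFRD.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("alfrd")
print(f"alfrd version: {version}" if version else "alfrd is not installed")
```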

3. Using ALFRD

ALFRD is meant to be used as a tool for creating a pipeline/workflow. The pipeline can be visualised as tabular data, e.g. a pandas DataFrame or a Google spreadsheet. For example, each column may represent a pipeline step and each row may correspond to a different dataset. A count of successes and failures is kept throughout the code for housekeeping.
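
The table view of a pipeline can be illustrated without ALFRD or a spreadsheet at all. The sketch below uses a plain list of dictionaries as the table, one row per dataset and one column per step, and keeps the success and failure counters described above. All names here (`table`, `run_step`, `fake_convert`) are hypothetical and only illustrate the idea; they are not ALFRD functions.

```python
# Each row is one dataset; each non-key column is one pipeline step.
table = [
    {"FILE_NAME": "a.fits", "fits to ms": "", "Comment": ""},
    {"FILE_NAME": "b.fits", "fits to ms": "", "Comment": ""},
]


def run_step(row, step, func, count, failed):
    """Run one step for one row, store the outcome, and update the counters."""
    try:
        row[step] = func(row["FILE_NAME"])
        count += 1
    except Exception as exc:
        row[step] = f"failed: {exc}"
        failed += 1
    return count, failed


def fake_convert(name):
    """Stand-in for a real pipeline step; fails for one file on purpose."""
    if name == "b.fits":
        raise RuntimeError("conversion error")
    return "done"


count, failed = 0, 0
for row in table:
    count, failed = run_step(row, "fits to ms", fake_convert, count, failed)

print(count, failed)  # 1 1
```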

Example 1: Initializing and creating an instance
from alfrd.lib import GSC, LogFrame
from alfrd.util import timeinmin, read_inputfile

url='https://spreadsheet/link'
worksheet='worksheet-name'

gsc = GSC(url=url, wname=worksheet, key='path/to/json/file')      # default path for key = home/usr/.alfred
df_sheet = gsc.open()

Example 2: Initialize the framework
lf = LogFrame(gsc)
lf.primary_colname          =   'FILE_NAME'

The dataframe from the sheet can also be accessed through the instance lf using lf.df_sheet.

Example 3: Run - Iterate for each row
from pathlib import Path
import time, subprocess
import glob

count,failed=0,0
allfiles=[]
folder_for_fits ='path/to/allfits'

allfiles.extend(glob.glob(f"{folder_for_fits}*/*fits"))

for fitsfile in allfiles:
    fitsfile_name = Path(fitsfile).name
    lf.primary_value            =   fitsfile_name

Here the variables count and failed keep track of the success/failure of each function call. This helps keep API calls to a minimum: the sheet is contacted only when there is a new update.

Example 4: Using column logic and adding values to a cell on success/failure

Let's say we have a result that needs to be written to the cell in the column Comment and the row with Serial Number 3, where S.No. is the name of the serial-number column.

lf.primary_colname          =   'S.No.'
lf.primary_value            =   '3'
if lf.isvalue(value='True', colname='TSYS') and lf.isval_unique('Project'):
    
    lf.working_col          =   'Comment'                                          # working column
    print(lf.get_value())

    result                  =   'test'
    if result :
        count               =   lf.put_value(result, count=count)                   
    else:
        failed              =   lf.put_value('failed: logfile_path', count=failed)
    
    print(lf.get_value())

Example 5: Script execution
def run_picard(cmd):
    t0 = time.time()
    subprocess.run(cmd)
    t1 = time.time()
    td = timeinmin(t1-t0)
    return td

def scripted_picard_1(wd_ifolder, count, failed, col='fits to ms'):
    """
    converts fitsfile to ms file; checks if conversion was successful; skips if file exists; logs in the spreadsheet;
    """
    td=0
    cmd=["picard",'-n','10',"-l","e",'--input',wd_ifolder]
       
    params, files, input_folder = read_inputfile(wd_ifolder, "observation.inp")
    
    if not Path(f"{wd_ifolder}").exists():
        print('input folder missing..')
    elif not Path(f"{wd_ifolder}/../{params['ms_name']}").exists():
        td          =   run_picard(cmd)
        count       =   lf.put_value(colname=col, value=td, count=count)
    elif Path(f"{wd_ifolder}/../{params['ms_name']}").exists():
        print('skipped')

    if not Path(f"{wd_ifolder}/../{params['ms_name']}").exists():
        count-=1
        failed      =   lf.put_value(colname=col, value='failed',count=failed)
    return td, count, failed

if lf.isvalue(value='True', colname='TSYS') and lf.isval_unique('Project'):
    lf.working_col          =   'Comment1'
    ttaken, count, failed   =   scripted_picard_1(wd_ifolder='path/to/wd/input_template', count=count, failed=failed )

print(lf.get_value(colname='fits to ms'))
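
The timing pattern used by run_picard can be tried on its own with a harmless command. The sketch below is only an illustration: `format_minutes` is a hypothetical stand-in for `timeinmin` (its zero-padded 'XXmYYs' output is an assumption, not the library's documented format), and the command run is the current Python interpreter rather than picard.

```python
import subprocess
import sys
import time


def format_minutes(seconds):
    """Format a duration in seconds as 'XXmYYs' (hypothetical stand-in for timeinmin)."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes:02d}m{secs:02d}s"


def timed_run(cmd):
    """Run a command and return (elapsed seconds, return code)."""
    t0 = time.monotonic()
    completed = subprocess.run(cmd, check=False)
    return time.monotonic() - t0, completed.returncode


# Run a no-op Python command in place of a real pipeline script.
elapsed, returncode = timed_run([sys.executable, "-c", "pass"])
print(format_minutes(elapsed), returncode)
```

Returning the exit code alongside the duration makes it possible to record a failure without checking for output files afterwards.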

Example 6: Update the sheet

lf.update_sheet writes the accumulated changes to the spreadsheet.

lf.update_sheet(count=count, failed=failed, csvfile='df_sheet.csv')                     # if updating the sheet fails, a copy of the dataframe is saved locally at the csvfile path.

NOTE: The count and failed parameters correspond to success/failure events between calls to update_sheet, i.e. if the values of count and failed have not changed since the previous call, the sheet will not be updated and a "skipped" message will appear on the terminal.
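
The skip behaviour described in the note can be pictured as a comparison of the counters against their values at the previous call. The following is a sketch of that idea only, not of how update_sheet is implemented; the class name `ChangeTracker` and its method are hypothetical.

```python
class ChangeTracker:
    """Remember the last counter values and report whether they changed."""

    def __init__(self):
        self._last = (0, 0)

    def should_update(self, count, failed):
        """Return True only if count or failed differs from the previous call."""
        current = (count, failed)
        changed = current != self._last
        self._last = current
        return changed


tracker = ChangeTracker()
history = [(0, 0), (1, 0), (1, 0), (1, 1)]
decisions = [tracker.should_update(c, f) for c, f in history]
print(decisions)  # [False, True, False, True]
```

Only the iterations marked True would need to contact the spreadsheet, which keeps the number of API calls low.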

Example 7: Conditional Formatting

This needs to be run only once.

r1 = lf.create_rule(range='G2:G105', type='TEXT_CONTAINS' ,value='False', c='r')               # check if TSYS == False --> background color = red
r2 = lf.create_rule(range='G2:G105', type="TEXT_CONTAINS", value='True', custom_clr=lf.create_color(0.56, 0.77, 0.49))
r3 = lf.create_conditional_format(range='F2:F105', c='g', valtype='timeinmin')              # check if value in XXmYYs format --> background color = green

r4 = lf.create_conditional_format(range='F2:F105', c='r', valtype='fail')                   # check if value contains fail --> background color = red
r4 = lf.create_rule(range='F2:F105', type='TEXT_CONTAINS', c='r', value='fail')                # similar to the line above; note that this reassigns r4
print(r1,r2,r3,r4)
lf.add_conditional_format(r1, r2, r3, r4)
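
To preview which cells the duration and failure rules would colour, the same checks can be sketched locally. This is an approximation under stated assumptions: the regular expression for the 'XXmYYs' format and the case-insensitive match on "fail" are choices made for this sketch, and `background_for` is a hypothetical helper, not an ALFRD function.

```python
import re

# Assumed pattern for durations such as '12m05s'; the real format may differ.
DURATION = re.compile(r"^\d+m\d+s$")


def background_for(value):
    """Return 'r' for failures, 'g' for durations, or None for anything else."""
    text = str(value).strip()
    if "fail" in text.lower():
        return "r"
    if DURATION.match(text):
        return "g"
    return None


for cell in ["12m05s", "failed: logfile_path", "pending"]:
    print(cell, "->", background_for(cell))
```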

4. Attribution

When using ALFRD, please add a link to this repository in a footnote.

5. Acknowledgement

ALFRD was developed within the "Search for Milli-Lenses" (SMILE) project. SMILE has received funding from the European Research Council (ERC) under the HORIZON ERC Grants 2021 programme (grant agreement No. 101040021).
