Automated Logical FRamework for Dynamic script execution (ALFRD)
Project description
ALFRD: Automated Logical FRamework for Dynamic script execution
This program was written for the SMILE project, supported by an ERC Starting Grant, with the following purposes in mind:
- Communicate with Google Spreadsheets and update progress periodically when required.
- Run the pipeline based on the spreadsheet progress/requirements.
Contents:
1. Create API credentials on Google Cloud
A similar guide is available at https://developers.google.com/workspace/guides/create-credentials
NOTE: Creating a Google Cloud console project usually requires billing details, but the Sheets API service itself is available for free.
- Step 1: Go to https://console.cloud.google.com/
- Step 2: Click on the project drop-down to create a new project; you can choose an organization or leave it on the default.
- Step 3: Search in the top bar (or press /) for "Google Sheets API".
- Step 4: In the search results, under Marketplace, select the first result, which should be "Google Sheets API".
- Step 5: Enable the service on the product details page.
- Step 6: Now select Create credentials > Application data.
- Step 7: Enter a service account name, then select Create and continue.
- Step 8: Search for and select the "Editor" role > Continue.
- Step 9: Skip the next optional step and select Done.
- Step 10: The credentials are now created. Select "Credentials" in the left menu.
- Step 11: Select/edit the account that was just created and copy the email address that is shown.
- Step 12: Select the Keys tab > Add key > Create new key > JSON > save.
- Step 13: Go to the Google spreadsheet and "Share" the sheet with the copied email address, as Editor.
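The downloaded JSON key and the shared sheet can be checked before moving on. ALFRD's own GSC class is shown in Example 1 below; the short sketch here uses the gspread library directly (an assumption, not necessarily what ALFRD uses internally) purely as a sanity check:

import gspread                                                 # assumed helper library, used here only for verification

gc = gspread.service_account(filename='path/to/json/file')     # JSON key downloaded in step 12
sh = gc.open_by_url('https://spreadsheet/link')                # sheet shared with the service-account email in step 13
print(sh.sheet1.get_all_values()[:2])                          # print the first two rows as a quick check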
2. Installing
- Install using the pip package manager:
pip install alfrd
- Alternatively, download ALFRD and unzip it, or clone and install from source:
git clone https://github.com/avialxee/alfrd
cd alfrd/
pip install .
This should install ALFRD and all of its dependencies automatically.
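To confirm the installation, a quick import check (using the same module paths as the examples below) is sufficient:

python -c "from alfrd.lib import GSC, LogFrame; print('alfrd imported successfully')"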
3. Using ALFRD
ALFRD is meant to be used as a tool to create a pipeline/workflow. The pipeline can be visualised as tabular data, e.g. a pandas dataframe, a Google spreadsheet, etc. For example, each column may represent a pipeline step and each row corresponds to a different dataset. A count of successes and failures is kept throughout the code for housekeeping.
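As an illustration only (the column names below are hypothetical, chosen to echo the examples that follow), such a tracking table might look like this:

import pandas as pd

# Hypothetical layout: one row per dataset, one column per pipeline step
df = pd.DataFrame({
    'FILE_NAME':  ['obs_001.fits', 'obs_002.fits'],  # primary column identifying each dataset
    'TSYS':       ['True', 'False'],                 # a pre-condition checked before a step runs
    'fits to ms': ['12m30s', ''],                    # time taken by a step, empty if not yet run
    'Comment':    ['', ''],                          # free-form status written by the pipeline
})
print(df)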
Example 1: Initializing and creating an instance
from alfrd.lib import GSC, LogFrame
from alfrd.util import timeinmin, read_inputfile
url='https://spreadsheet/link'
worksheet='worksheet-name'
gsc = GSC(url=url, wname=worksheet, key='path/to/json/file') # default path for key = home/usr/.alfred
df_sheet = gsc.open()
Example 2: Initialize the framework
lf = LogFrame(gsc)
lf.primary_colname = 'FILE_NAME'
The dataframe from the sheet can also be accessed through the instance using lf.df_sheet.
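For instance, to inspect the first few rows (assuming the sheet is held as a pandas dataframe, as described above):

print(lf.df_sheet.head())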
Example 3: Run - Iterate for each row
from pathlib import Path
import time, subprocess
import glob

count, failed = 0, 0                                     # success/failure counters for housekeeping
allfiles = []
folder_for_fits = 'path/to/allfits'
allfiles.extend(glob.glob(f"{folder_for_fits}*/*fits"))  # collect all FITS files

for fitsfile in allfiles:
    fitsfile_name = Path(fitsfile).name
    lf.primary_value = fitsfile_name                     # select the row for this dataset
Here the variables count and failed are used for housekeeping the success/failure of each function call. This is useful for keeping API calls to a minimum, communicating with the sheet only when there is a new update.
Example 4: Using column logic and adding values to a cell on success/failure
Let's say we have a result that we need to write to the cell corresponding to the column Comment and the row corresponding to serial number 3, where S.No. is the column name for the serial number.
lf.primary_colname = 'S.No.'
lf.primary_value = '3'

if lf.isvalue(value='True', colname='TSYS') and lf.isval_unique('Project'):
    lf.working_col = 'Comment'                                         # working column
    print(lf.get_value())
    result = 'test'
    if result:
        count = lf.put_value(result, count=count)                      # write the result, bump the success counter
    else:
        failed = lf.put_value('failed: logfile_path', count=failed)    # record the failure instead
    print(lf.get_value())
Example 5: Script execution
def run_picard(cmd):
    # run the external picard command and return the elapsed time (formatted by timeinmin)
    t0 = time.time()
    subprocess.run(cmd)
    t1 = time.time()
    td = timeinmin(t1 - t0)
    return td

def scripted_picard_1(wd_ifolder, count, failed, col='fits to ms'):
    """
    converts fitsfile to ms file; checks if conversion was successful; skips if file exists; logs in the spreadsheet;
    """
    td = 0
    cmd = ["picard", '-n', '10', "-l", "e", '--input', wd_ifolder]
    params, files, input_folder = read_inputfile(wd_ifolder, "observation.inp")

    if not Path(f"{wd_ifolder}").exists():
        print('input folder missing..')
    elif not Path(f"{wd_ifolder}/../{params['ms_name']}").exists():
        # ms file not there yet: run picard and record the time taken in the working column
        td = run_picard(cmd)
        count = lf.put_value(colname=col, value=td, count=count)
    elif Path(f"{wd_ifolder}/../{params['ms_name']}").exists():
        print('skipped')

    # if the ms file still does not exist, the conversion failed: log it in the sheet
    if not Path(f"{wd_ifolder}/../{params['ms_name']}").exists():
        count -= 1
        failed = lf.put_value(colname=col, value='failed', count=failed)
    return td, count, failed

if lf.isvalue(value='True', colname='TSYS') and lf.isval_unique('Project'):
    lf.working_col = 'Comment1'
    ttaken, count, failed = scripted_picard_1(wd_ifolder='path/to/wd/input_template', count=count, failed=failed)
    print(lf.get_value(colname='fits to ms'))
Example 6: Update the sheet
lf.update_sheet pushes the accumulated changes to the spreadsheet.
lf.update_sheet(count=count, failed=failed, csvfile='df_sheet.csv') # if updating the sheet fails, a copy of the dataframe is saved locally at the csvfile path.
NOTE: The count and failed parameters correspond to a success/fail event on each call of update_sheet, i.e., if the values of count/failed have not changed since the previous call, the sheet will not be updated and a "skipped" message will appear on the terminal.
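Putting the pieces together, a typical loop (a sketch assembled from the examples above, not a verbatim recipe) calls update_sheet once per iteration and relies on the unchanged counters to skip redundant API calls:

for fitsfile in allfiles:
    lf.primary_value = Path(fitsfile).name

    if lf.isvalue(value='True', colname='TSYS') and lf.isval_unique('Project'):
        ttaken, count, failed = scripted_picard_1(wd_ifolder='path/to/wd/input_template',
                                                  count=count, failed=failed)

    # Only pushes when count or failed changed during this iteration;
    # otherwise a "skipped" message is printed and no API call is made.
    lf.update_sheet(count=count, failed=failed, csvfile='df_sheet.csv')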
Example 7: Conditional Formatting
This needs to be run only once.
r1 = lf.create_rule(range='G2:G105', type='TEXT_CONTAINS' ,value='False', c='r') # check if TSYS == False --> background color = red
r2 = lf.create_rule(range='G2:G105', type="TEXT_CONTAINS", value='True', custom_clr=lf.create_color(0.56, 0.77, 0.49))
r3 = lf.create_conditional_format(range='F2:F105', c='g', valtype='timeinmin') # check if value in XXmYYs format --> background color = green
r4 = lf.create_conditional_format(range='F2:F105', c='r', valtype='fail') # check if value contains fail --> background color = red
r4 = lf.create_rule(range='F2:F105', type='TEXT_CONTAINS', c='r', value='fail') # similar to above
print(r1,r2,r3,r4)
lf.add_conditional_format(r1, r2, r3, r4)
4. Attribution
When using ALFRD, please add a link to this repository in a footnote.
5. Acknowledgement
ALFRD was developed within the "Search for Milli-Lenses" (SMILE) project. SMILE has received funding from the European Research Council (ERC) under the HORIZON ERC Grants 2021 programme (grant agreement No. 101040021).