Package that can be used to perform RDLC analysis by determining the correlations for the different phases of the development life cycle.

DA4RDM_Vis_VectorBased

Description

DA4RDM_Vis_VectorBased is a Python package that extracts correlation data for the different phases of the Research Development Life Cycle (RDLC) process. The package can also produce visualizations of the correlation values for the different RDLC phases, as obtained for a given Project Id and other related arguments.

Installation

The package is written in Python and relies on common Python packages. Notably, it uses the visualization packages plotly express and kaleido to produce the visualizations. Other packages used include scipy and json. Please make sure the necessary packages are installed before execution. The package can be installed using the pip command provided below.

pip install DA4RDM_Vis_VectorBased

Importing the Modules

The package has two important modules, Evaluate and Vizualize. The first module invokes the functions that perform data extraction, model creation and fitness evaluation, and finally returns the correlation values. The second module uses the values retrieved from the Evaluate module to provide the final visualization. The modules can be imported using the commands below:

from DA4RDM_Vis_VectorBased import Evaluate
from DA4RDM_Vis_VectorBased import Vizualize

Usage

As mentioned above, the package has two major functionalities, as listed below:

1. Extracting correlation data for the project. The correlation data is returned as a dataframe containing one correlation value for each RDLC phase of the project. To retrieve the correlation data, use the function eval_corr within the module Evaluate. The function signature along with parameter information is shown below:

def eval_corr(data_path, project_id, start_date="", end_date="", operation_list_path="",
              eval_feature="pearson", eval_type="binary"):
    """
    :param data_path: filepath to the input data as a csv file, a string is expected
    :param project_id: the project for which correlations are to be evaluated
    :param start_date: the earliest timestamp to consider for filtering records, default is evaluated based on the first occurrence of the project id
    :param end_date: the latest timestamp to consider for filtering records, default is evaluated based on the occurrences of the project id
    :param operation_list_path: filepath to the json file containing the list of operations and the vectors for each RDLC phase (default is defined in the function body)
    :param eval_feature: the distance feature to be used for similarity evaluation, either pearson (default) or cosine
    :param eval_type: the type of distance feature to be used for similarity evaluation, either binary (default) or weighted
    """

The function eval_corr accepts two mandatory positional arguments: the path to the event log and the Project Id. The Project Id must be provided to uniquely identify the project. The optional arguments are the start and end dates, the path to the operational data json file that is used to build the operation list and the RDLC phase vectors, and the evaluation feature and type to be used for calculating the similarity. The operational data is read from a default json file if the user does not specify an external file path. To use a customized operational data file, pass its path at the corresponding argument position when invoking the function. Please refer to the file format shown below.

{
"Operation_List": ["Add Project", "Edit Project", "Open Resource(RCV)", "Add Resource", "Edit Resource", "Delete Resource", "Upload File",  "Upload MD", "Download File", "View MD", "Delete File", "Update File", "Update MD", "Open User Management", "View Users", "Add Member", "Change Role", "Remove User", "Open Search", "View Search Results", "PID Enquiry", "Create Application Profile",  "Admin Project Quota Change", "Owner Project Quota Change", "Owner Resource Quota Change", "Invite External User", "Archive Resource", "Unarchive Resource", "Merge Request" ],
"Planning": [
        1,        1,        0,        1,        1,        1,        0,        0,        0,
        0,        0,        0,        0,        1,        1,        1,        1,        1,
        0,        0,        0,        1,        1,        1,        1,        1,        0,
        0,        0    ],
"Production": [
        0,        0,        0,        0,        0,        0,        1,        1,        0,
        0,        0,        0,        0,        0,        0,        0,        0,        0,
        0,        0,        0,        0,        0,        0,        0,        0,        0,
        0,        1    ],
"Analysis": [
        0,        0,        0,        0,        0,        0,        0,        0,        1,
        1,        1,        1,        1,        0,        0,        0,        0,        0,
        0,        0,        0,        0,        0,        0,        0,        0,        0,
        1,        1    ],
"Archival": [
        0,        0,        0,        0,        1,        0,        0,        0,        0,
        0,        0,        0,        0,        0,        0,        0,        0,        0,
        0,        0,        0,        0,        0,        0,        0,        0,        1,
        0,        0
    ],
"Access": [
        0,        0,        1,        0,        0,        0,        0,        0,        1,
        1,        0,        0,        0,        1,        1,        1,        0,        0,
        1,        1,        0,        0,        0,        0,        0,        1,        0,
        0,        0    ],
"Reuse": [
        0,        0,        0,        0,        0,        0,        0,        0,        0,
        0,        0,        0,        0,        0,        0,        0,        0,        0,
        0,        0,        1,        0,        0,        0,        0,        0,        0,
        0,        1    ]
}
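Each phase vector in the file above has one entry per item in Operation_List (29 operations, 29 values per phase). When preparing a customized file, a quick consistency check can catch a mismatched vector before it is passed to eval_corr. The following is a minimal sketch; the helper name validate_operational_data is hypothetical and not part of the package, and the small dictionary stands in for the result of json.load on a real file.

```python
import json


def validate_operational_data(data):
    """Return a list of problems found in an operational data dictionary.

    Hypothetical helper, not part of DA4RDM_Vis_VectorBased.
    """
    problems = []
    operations = data.get("Operation_List", [])
    if not operations:
        problems.append("Operation_List is missing or empty")
    for phase, vector in data.items():
        if phase == "Operation_List":
            continue
        # Every phase vector needs exactly one value per operation.
        if len(vector) != len(operations):
            problems.append(
                f"{phase}: expected {len(operations)} values, got {len(vector)}"
            )
        # Values are expected to be numeric (0/1 in the default file).
        if not all(isinstance(v, (int, float)) for v in vector):
            problems.append(f"{phase}: non-numeric value found")
    return problems


# Shortened stand-in for json.load(open("my_operations.json")).
sample = json.loads(
    '{"Operation_List": ["Add Project", "Upload File", "Download File"],'
    ' "Planning": [1, 0, 0], "Production": [0, 1], "Analysis": [0, 0, 1]}'
)
for problem in validate_operational_data(sample):
    print(problem)
```

In this shortened sample the Production vector has only two values for three operations, so it is the one entry reported.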

Example

Below is an execution of the function with all parameters provided.

from DA4RDM_Vis_VectorBased import Evaluate

correlation = Evaluate.eval_corr("RDM_lifecycle_analysis_-_28-04-2022.csv", 'BA1FD94A-CC71-4D32-80AE-67DD2C3BF19A', '2021-04-28', '2023-04-28', "OperationalDatamodify.json", 'cosine', 'binary')
print(correlation)

Below is an execution of the function with only required parameters provided.

from DA4RDM_Vis_VectorBased import Evaluate

correlation = Evaluate.eval_corr("RDM_lifecycle_analysis_-_28-04-2022.csv", 'BA1FD94A-CC71-4D32-80AE-67DD2C3BF19A')
print(correlation)

Below is an example of function execution with no value passed for the path of the operational data json file. To skip an optional argument while still passing later arguments by position, pass an empty string at its position. Please refer to the example below.

from DA4RDM_Vis_VectorBased import Evaluate

correlation = Evaluate.eval_corr("RDM_lifecycle_analysis_-_28-04-2022.csv", 'BA1FD94A-CC71-4D32-80AE-67DD2C3BF19A', '2021-04-28', '2023-04-28', "", 'cosine', 'binary')
print(correlation)
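As an alternative to empty-string placeholders, the optional arguments can be passed by keyword, which is standard Python behaviour for a function with the signature shown earlier. The sketch below assumes the installed version matches that signature; the import guard and the file-existence check are only there so that the snippet does nothing when the package or the csv file is not available.

```python
import os

# Keyword names follow the eval_corr signature shown above; optional
# arguments that are left out keep their defaults.
eval_kwargs = {
    "data_path": "RDM_lifecycle_analysis_-_28-04-2022.csv",
    "project_id": "BA1FD94A-CC71-4D32-80AE-67DD2C3BF19A",
    "eval_feature": "cosine",
}

try:
    from DA4RDM_Vis_VectorBased import Evaluate
except ImportError:
    Evaluate = None  # package not installed; nothing is evaluated

if Evaluate is not None and os.path.exists(eval_kwargs["data_path"]):
    correlation = Evaluate.eval_corr(**eval_kwargs)
    print(correlation)
```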

Output

Each of the executions above invokes the function eval_corr with the passed parameter values. The function calculates and returns the correlation values, and the results are then printed as shown below.

   RDLC_phase  Correlation_value
0    Planning           0.966092
1  Production           0.000000
2    Analysis           0.000000
3    Archival           0.188982
4      Access           0.356348
5       Reuse           0.000000
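To give an intuition for the two eval_feature options, the sketch below computes a cosine similarity and a Pearson correlation between a project's operation vector and each phase vector. It is an illustration under stated assumptions, not the package's actual implementation: the three-phase, six-operation data is made up, and the project vector is assumed to hold 1 for every operation that occurred in the event log.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Made-up phase vectors over six hypothetical operations.
phase_vectors = {
    "Planning": [1, 1, 0, 0, 0, 0],
    "Production": [0, 0, 1, 1, 0, 0],
    "Analysis": [0, 0, 0, 0, 1, 1],
}
# Assumed binary project vector: 1 where the operation occurred at least once.
project_vector = [1, 1, 0, 1, 0, 0]

for phase, vector in phase_vectors.items():
    cos = cosine_similarity(project_vector, vector)
    pear = pearson_correlation(project_vector, vector)
    print(f"{phase:<12}cosine={cos:.4f}  pearson={pear:.4f}")
```

Cosine similarity stays between 0 and 1 for non-negative vectors, while Pearson correlation can become negative when a project's operations mostly fall outside a phase.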

2. Generating visualizations of, or transforming, the correlation data retrieved using the eval_corr function discussed in the sections above. To get a visualization of the results, use the visualize function within the module Vizualize. The function signature along with parameter information is shown below:

def visualize(corr_data, save_option):
    """
    :param corr_data: the correlation response data received as output from the function eval_corr
    :param save_option: the type of file to be saved (options are png, jpeg, pdf or json)
    """

This function accepts a dataframe with correlation data and an output format as required parameters and produces the corresponding visualization. The user can choose from the allowed formats: jpeg, png, pdf and json.

Example

Below is an execution of the function with all parameters provided.

from DA4RDM_Vis_VectorBased import Evaluate
from DA4RDM_Vis_VectorBased import Visualize

correlation = Evaluate.eval_corr("RDM_lifecycle_analysis_-_28-04-2022.csv", 'BA1FD94A-CC71-4D32-80AE-67DD2C3BF19A')
Visualize.visualize(correlation, 'jpeg')

Output

If the user selects jpeg, png or pdf as the format, the result is a radar chart visualization of the correlation data. If json is the selected format, the function outputs a json representation of the correlation values as shown below:

{"Similarity": {"corr_res": [0.9654746681256314, 0.3613249509436927, 0.2388835160664533, 0.5, 0.46176404435490637, 0.4037749551350624], "rdlc_phase": ["Planning", "Production", "Analysis", "Archival", "Access", "Reuse"]}}

The generated files are saved locally, in the directory of the program that uses the package.
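The json representation pairs the values in corr_res with the phase names in rdlc_phase by position. A minimal sketch of reading it back with the standard json module is given below; it assumes the json text has already been loaded into a string, since the name of the saved file is not stated here.

```python
import json

# The json output shown above, held in a string for illustration.
raw = (
    '{"Similarity": {"corr_res": [0.9654746681256314, 0.3613249509436927, '
    '0.2388835160664533, 0.5, 0.46176404435490637, 0.4037749551350624], '
    '"rdlc_phase": ["Planning", "Production", "Analysis", "Archival", '
    '"Access", "Reuse"]}}'
)

similarity = json.loads(raw)["Similarity"]

# Pair each phase name with the correlation value at the same index.
phase_scores = dict(zip(similarity["rdlc_phase"], similarity["corr_res"]))

# Phase with the highest correlation value.
best_phase = max(phase_scores, key=phase_scores.get)

for phase, score in phase_scores.items():
    print(f"{phase:<12}{score:.6f}")
print("Highest correlation:", best_phase)
```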
