Skip to main content

An implementation of DMN in Python. DMN rules are read from an Excel workbook

Project description

pyDMNrules

An implementation of DMN (Decision Model Notation) in Python, using the pySFeel, openpyxl and pandas modules.

DMN rules are read from an Excel workbook. Then data, matching the input variables in the DMN rules is passed to the decide() function. The returned data contains the decision.

The passed Excel workbook must contain a 'Decision' tab and may contain a 'Glossary' tab. Other tabs contain as many DMN rules tables as necessary.

Glossary - optional

The 'Glossary' tab must contain a table headed 'Glossary'. A 'Glossary' table lets the user add additional documentation/clarification to the input and output data. Data items can be grouped in to Business Concepts to better document where inputs come from and where outputs will be going to. The 'Glossary' table must have exactly three columns with the headings 'Variable', 'Business Concept' and 'Attribute'. The data in these three columns describe the inputs and outputs associated with the 'Decision'.

'Variable' can be any text and is used to pass data to and from pyDMNrules.
NOTE: the name 'Execute' is a reserved name and cannot be used as a Variable name. The name 'Execute' is reserved as the a heading for output columns, output rows or the output variable for a Cross Tabs decision table. The output value in an 'Execute' output column, 'Execute' output row or 'Execute' Cross Tabs decision table must be the name of a Decision Table which will be invoked when the matching rule is triggered. If the output value in an 'Execute' output column, an 'Execute' output row is missing or an output cell in an 'Execute' Cross Tabs decision table, then no Decision Table will be invoked whe the matching rule is triggered.

'Business Concept' and 'Attribute' must be valid FEEL names as the 'Variable' will be mapped to a Glossary item created by joining these two FEEL names with a period character, as in 'Order.orderSize'.
Vertically merged cells in the 'Business Concept' column are used to group multiple 'Variable'/'Attribute' pairs together into a single 'Businss Concept'.
The 'Glossary' can have multiple columns of Annotations to describe things like data types, valid values etc.

Items in the Glossary can be outputs of one decision table and inputs in the next decision table, which makes it easy to support complex business models.

Input decision cells and Output cells can refer to input and output values either by their 'Variable' name, or by their Glossary item (FeeL) name, of 'BusinessConcept.Attribute'

Glossary
Variable Business Concept Attribute
Customer Customer sector
OrderSize Order orderSize
Delivery delivery
Discount Discount discount

If a 'Glossary' tab is not present - if no Glossary is supplied, then pyDMNrules will create one using the input and output headings. There will be a single 'Business Concept' called 'Data'. Attributes will be created by replacing any space characters (' ') with an underscore character ('_') and then removing all invalid FEEL name characters. This can cause conflicts. For instance ':' is a valid character in a Variable, but not in a FEEL name. If there are two Varibles, being 'C' and 'C:' then both will be transformed to an Attribute of 'C' and a Glossary item of 'Data.C', which will be ambiguous. This ambiguity can be avoided by supplying a 'Glossary' and explicitely mapping 'C:' to a valid FEEL name, such as 'Ccolon'.

Decision

The 'Decision' tab must contain a table headed 'Decision' with the headings 'Decisions' and 'Execute Decision Table'. The 'Decisions' column contain a brief description of the DMN rules table which is named in 'Execute Decision Table' column. pyDMNrules will execute each decision table in this order. The 'Decision' table can have Annotations.

Decision
Decisions Execute Decision Table
Determine Discount Discount

The Decision table can have 'Input' columns before the 'Decision' heading.

  • The headings of each 'Input' column must be a Variable from the Glossary or be suitable to be ing added to the Glossary if there is no 'Glossary.
  • The cells under each input heading are input test cells containing an expression about the input Variable, and which must evalutate to True, or False, depending upon the value of the input Variable. These inputs can be used to control the flow, that is, which Decision tables are run and which ones are skipped.

The Decision table can be followed by columns of 'Annotations' after the 'Execute Decision Table' column

pyDMNrules will evalutate all the input test cells before running the associate DMN rules table. If all of the input test cells do not evaluate to True, then pyDMNrules will skip this DMN decision table and moves on to the next row in the Decision table. These input cell are evaluated on the fly, which means that a preceeding DMN rules table can set or clear values in the Glossary which can effect which DMN rules tables are included, and which DMN rules tables are excluded, in the final decision.

DMN rules tables

The DMN rules table(s), listed under Execute Decision Table, must exist on other spreadsheets in the Excel workbook and can be 'Decisions as Rows', 'Decisions as Columns' or 'Decision as Crosstab' rules tables.

Decision as Rows rules tables

Decisions as Rows DMN rules tables have columns of inputs and columns of outputs, with a double line vertical border delineating where input columns stop and output columns begin.

  • The headings for the input and output columns must be Variables from the Glossary or, for an output column, is can be the word 'Execute'.
  • The input test cells under the input heading must contain expressions about the input Variable, which will evaluate to True or False, depending upon the value of the input Variable.
  • The values in the output results cells, under the output headings, are the values that will be assigned to the output Variables, if all the input cells evaluate to True on the same row. If the output heading is the word 'Execute', then the output value must be the name of a Decision Table, which will be invoked immediately. Which means any output values already assigned can be overwritten by the invoked Decision Table, and any outputs created by the invoked Decision Table can be overwritten by subsequent outputs in the current Decision Table.

ExampleRows.xlsx is an example fo a Decision as Rows DMN table.

Decision as Columns rules tables

Decisions as Columns DMN rules tables have rows of inputs and rows of outputs, with a double line horizontal border delineating where input rows stop and output rows begin.

  • The headings on the left hand side of the decision table, for the input and output rows, must be Variables from the Glossary.
  • The input test cells across the row must contain expressions about the input Variable, which will evalutate to True or False, depending upon the value of the input Variable.
  • The values in the output cells, across the row, are the values that will be assigned to the output Variable, if all the input cells evaluate to True in the same column. If the output heading is the word 'Execute', then the output value must be the name of a Decision Table, which will be invoked immediately. Which means any output values already assigned can be overwritten by the invoked Decision Table, and any outputs created by the invoked Decision Table can be overwritten by subsequent outputs in the current Decision Table.

ExampleColumns.xlsx is an example fo a Decision as Columns DMN table.

Decision as Crosstab rules tables

Decision as Crosstab DMN rules tables have one output heading at the top left of the table.

  • The horizontal and vertical headings, which must be Variables from the Glossary, are the input headings.
  • Under the horizontal headings and besides the vertical headings are the input test cells.
  • The output results cells form the body of the table and are assigned to the output Variable when the input cells in the same row and same column, all evaluate to True. If the output Variable is the word 'Execute', then the output value must be the name of a Decision Table, which will be invoked as part of completing the decision.

ExampleCrosstab.xlsx is an example fo a Decision as Crosstab DMN table.

Strings:

Decision Model Notation specifies that strings must be enclosed in double quotes ("), as in "this is a string", or be italicized. pyDMNrules does not look for italicized text, but it does try to interpret unequivocal strings as string. So, HIGH,DECLINE are both strings; they are not numbers, nor dates, nor time periods.

pyDMNrules also recognizes entries from the Glossary, so Patient.age is not a string if 'Patient' is a 'Business Concept' in the Glossary and 'age' is an attribute associated with the 'Patient' Business Concept in the Glossary. When any expression containing Patient.age is evaluated, Patient.age will be replaced with the current value associated with that Glossary entry.

However, strings that may be ambiguous, must be enclosed in double quotes ("). Any string that contains spaces or a comma is ambiguous and must enclosed in double quotes ("). All strings can be enclosed in double quotes, so HIGH,DECLINE and "HIGH","DECLINE" are equivalent.

Ambigous strings:

If you don't enclose codes in double quotes the pyDMNrules will try and interpret them. So codes which are all digits will be interpreted as numbers. Codes that start with a digit, such as 9A42, will cause FEEL errors and must be enclosed in double quotes. FEEL has a syntax for durations. Any code that looks like a duration, such as P42D (42 days), will need to be enclosed in double quotes.

USAGE:

import pyDMNrules
dmnRules = pyDMNrules.DMN()
status = dmnRules.load('ExampleHPV.xlsx')
if 'errors' in status:
    print('ExampleHPV.xlsx has errors', status['errors'])
    sys.exit(0)
else:
    print('ExampleHPV.xlsx loaded')

data = {}
data['Participant Age'] = 36
data['In Test of Cure'] = True
data['Hysterectomy Flag'] = False
data['Cancer Flag'] = False
data['HPV-V'] = 'V0'
data['Current Participant Risk Category'] = 'low'
print('Testing',repr(data))
(status, newData) = dmnRules.decide(data)
print('Decision',repr(newData))
if 'errors' in status:
    print('With errors', status['errors'])

For a Single Hit Policy DMN rules table, newData will be a decision dictionary of the decision.
For a Multi Hit Policy DMN rules tables newData will be a list of decison dictionaries; one for each matched rule.
For a compound decision (a Decision table with multiple rows of DMN rules tables, where more than one Decision table is run) newData will be a list of decison dictionaries; one for each DMN rules table that is run. However, if only one of the DMN rules tables is run, and it is a Single Hit Policy DMN rules table, then newData will be a simple decision dictionary of the decision.
The keys to each decision dictionary are

  • 'Result' - for a Single Hit Policy DMN rules table this will be a dictionary where all the keys will be 'Variables' from the Glossary and the matching the value will be the value of that 'Variable' after the decision was made. For a Multi Hit Policy DMN rules table this will be a list of decision dictionaries, one for each matched rule.
  • 'Excuted Rule' - for a Single Hit Policy DMN rules table this will be a tuple of the Decision Table Description ('Decisions' from the 'Decision' table), the DMN rules table name ('Execute Decision Table' from the 'Decision' table) and the Rule number for the rule that matched in that DMN rules Table. For a Multi Hit Policy DMN rules table this will be the a list of tuples, being the Decision Table Description, DMN rules table name and matched Rule number for each matching rule.
  • 'DecisionAnnotations'(optional) - list of tuples (heading, value) of the annotations from the 'Decision' table named in the 'Executed Rule' tuple.
  • 'RuleAnnotations'(optional) - for a Single Hit Policy DMN rules table, this well be a list of tuples (heading, value) of the annotations for the matching rule, if there were any annotations for the matching rule. For a Multi Hit Policy DMN rules table this will be a list of the lists of any annotations for each matching rule, where an empty list means that the associated matching rule had no annotations.

If the Decision table contains multiple rows (multiple DMN rules tables run sequentially in order to make the decision) then the returned 'newData' will be a list of decision dictionaries, with each containing the keys 'Result', 'Excuted Rule', 'DecisionAnnotations'(optional) and 'RuleAnnotations'(optional), being one list entry for each DMN rules table used from the Decision table. The final enty in this list is the final decision. All other entries are the intermediate states involved in making the final decision.

$ python3 HPV.py
ExampleHPV.xlsx loaded
Testing {'Participant Age': 36, 'In Test of Cure': True, 'Hysterectomy Flag': False, 'Cancer Flag': False, 'HPV-V': 'V0', 'Current Participant Risk Category': 'low'}
Decision(newData) {'Result': {'Immune Deficient Flag': None, 'Hysterectomy Flag': False, 'Cancer Flag': False, 'In Test of Cure': True, 'Participant Age': 36.0, 'Current Participant Risk Category': 'low', 'HPV-V': 'V0', 'Cyto-S': None, 'Cyto-E': None, 'Cyto-O': None, 'Collection Method': None, 'Test Risk Code': 'L', 'New Participant Risk Category': 'low', 'Participant Care Pathway': 'toBeDetermined', 'Next Rule': 'CervicalRisk2'}, 'Executed Rule': ('Determine CervicalRisk', 'FirstTestOfCervicalRisk', '20'), 'DecisionAnnotations': [('Decides', 'Test risk')], 'RuleAnnotations': [('Test meaning', 'need more info')]}

Pandas functionality

pyDMNrules can pass a Pandas DataFrame of value through the decide() function and return a Pandas DataFrame of decisions. The status of each decision is returned as a Pandas Series and a Pandas DataFrame of information about each decision is also returned.

(dfStatus, dfResults, dfDecision) = dmnRules.decidePandas(dfInput)

dfInput is a Pandas DataFrame of rows of data about which decisions need to be made. Column names are mapped to decision Variables.

dfResults is a Pandas DataFrame with one row for each decision. Column names are mapped from all the Variables in the Glossary.

dfStatus is a Pandas Series of one value each decision. Values are the error message, or 'no errors' of there were no errors.

dfDecision is a Pandas DataFrame with columns of 'DecisionName', 'TableName', 'RuleId', 'DecisionAnnotation' and 'RuleAnnotation'

USAGE:

import pyDMNrules
import pandas as pd
import sys
dmnRules = pyDMNrules.DMN()
status = dmnRules.load('AN-SNAP rules (DMN).xlsx')
if 'errors' in status:
  print('AN-SNAP rules (DMN).xlsx has errors', status['errors'])
  sys.exit(0)
else:
    print('AN-SNAP rules (DMN).xlsx loaded')
dataTypes = {'Patient_Age':int, 'Episode_Length_of_stay':int,     'Phase_Length_of_stay':int,
    'Phase_FIM_Motor_Score':int, 'Phase_FIM_Cognition_Score':int, 'Phase_RUG_ADL_Score':int,
    'Phase_w01':float, 'Phase_NWAU21':float,
    'Indigenous_Status':str, 'Care_Type':str, 'Funding_Source':str, 'Phase_Impairment_Code':str,
    'AROC_code':str, 'Phase_AN_SNAP_V4_0_code':str,
    'Same_day_addmitted_care':bool, 'Delerium_or_Dimentia':bool, 'RadioTherapy_Flag':bool, 'Dialysis_Flag':bool}
dates = ['BirthDate', 'Admission_Date', 'Discharge_Date', 'Phase_Start_Date', 'Phase_End_Date']
dfInput = pd.read_csv('subAcuteExtract.csv', dtype=dataTypes, parse_dates=dates)
dfInput['Multidisciplinary'] = True
dfInput['Admitted_Flag'] = True
dfInput['Length_of_Stay'] = dfInput['Phase_Length_of_Stay']
dfInput['Long_term_care'] = False
dfInput.loc[dfInput['Length_of_Stay'] > 92, 'Long_term_care'] = True
dfInput['Same_day_admitted_care'] = False
dfInput['GEM_clinic'] = None
dfInput['Patient_Age_Type'] = None
dfInput['First_Phase'] = False
grouped = dfInput.groupby(['Patient_UR', 'Admission_Date'])
for index in grouped.head(1).index:
    if dfInput.loc[index]['Phase_Type'] == 'Unstable':
     dfInput.loc[index, 'First_Phase'] = True
columns = {'Admitted_Flag':'Admitted Flag', 'Care_Type':'Care Type', 'Length_of_Stay':'Length of Stay', 'Long_term_care':'Long term care',
    'Same_day_admitted_care':'Same-day admitted care', 'GEM_clinic':'GEM clinic', 'Patient_Age':'Patient Age', 'Patient_Age_Type':'Patient Age Type',
    'AROC_code':'AROC code', 'Delirium_of_Dimentia':'Delirium or Dimentia', 'Phase_Type':'Phase Type', 'First_Phase':'First Phase', 'Phase_FIM_Motor_Score':'FIM Motor score',
    'Phase_FIM_Cognition_Score':'FIM Cognition score', 'Phase_RUG_ADL_Score':'RUG-ADL', 'Delirium_or_Dimentia':'Delirium or Dimentia',
    'Problem_Severity_Scrore':'Problem Severity Score'}
(dfStatus, dfResults, dfDecision) = dmnRules.decidePandas(dfInput, headings=columns)
if dfStatus.where(dfStatus != 'no errors').count() > 0:
    print('has errors', dfStatus.loc['status' != 'no errors'])
    sys.exit(0)
for index in dfResults.index:
    print(index, dfResults.loc[index, 'AN_SNAP_V4_code'])

Testing

pyDMNrules reserves the spreadsheet name 'Test' which can be used for storing test data. The function test() reads the test data, passes it through the decide() function and assembles the results which are returned to the caller.

Unit Test data table(s)

The 'Test' spreadsheet must contain Unit Test data tables.

  • Each Unit Test data table must be named with a 'Business Concept'.
  • The heading of the table must be Variables from the Glossary associated with the Business Concept.
  • The data, in each column, must be valid input data for the associated Variable.
  • Each row is considered a unit test data set.
  • Unit Test data tables can have 'Annotations'

DMNrulesTest - the test that will be run

pyDMNrules also searches the 'Test' worksheet for a special table named 'DMNrulesTests' which must exist.

  • The 'DMNrulesTests' table has input columns followed by output columns with a double line vertical border delineating where input columns stop and output columns begin.
  • The headings for the input columns must be Business Concepts from the Glossary.
  • The data in input columns must be one based indexes into the matching unit test data table.
  • The headings for the output columns must be Variables from the Glossary.
  • The data in output columns will be valid data for the matching Variable and are the expected values returned by the decide() function.
  • The 'DMNrulesTests' table can have 'Annotations'

The test() function will process each row in the 'DMNrulesTests' table, getting data from the Unit Test data tables and building a data{} dictionary. The test() function then passes this data{} dictionary to the decide() function. The returned newData{} is then compared to the output data for the matching test in the 'DMNrulesTests' table. The data{}, newData{} and a list of mismatches, if any, is then appended to the list of results, which is eventually passed back to the caller of the test() function.

The returned list is a list of dictionaries, one for each test in the 'DMNrulesTests' table. The keys to this dictionary are

  • 'Test ID' - the one based index into the 'DMNrulesTests' table which identifies which test which was run
  • 'TestAnnotations'(optional) - the list of annotation for this test - not present if no annotations were present in 'DMNrulesTests'
  • 'data' - the dictionary of assembed data passed to the decide() function
  • 'newData' - the decision dictionary returned by the decide() function [see above]
  • 'DataAnnotations'(optional) - the list of annotations from the unit test data tables, for the unit test sets used in this test - not present if no annotations were persent in any of the unit test data tables for the selected unit test sets.
  • 'Mismatches'(optional) - the list of mismatch reports, one for each 'DMNrulesTests' table output value that did not match the value returned from the decide() function - not present if all the data returned from the decide() function matched the values in the 'DMNrulesTest' table.

Note

If an output value should be the value of an input, then you can use the 'Variable' from the Glossary. However, if the output is a manuipulation of an input, then you will need to use the internal name for that variable, being the 'Business Concept' and the 'Attibute' concateneted with the period character. That is, 'Patient Age' is valid, but 'Patient Age + 5' is not. Instead you will need to use the syntax 'Patient.age + 5'

USAGE:

import pyDMNrules
dmnRules = pyDMNrules.DMN()
status = dmnRules.load('Therapy.xlsx')
if 'errors' in status:
    print('Therapy.xlsx has errors', status['errors'])
    sys.exit(0)
else:
    print('Therapy.xlsx loaded')
(testStatus, results) = dmnRules.test()
for test in range(len(results)):
    if 'Mismatches' not in results[test]:
        print('Test ID', results[test]['Test ID'], 'passed')
    else:
        print('Test ID', results[test]['Test ID'], 'failed')


$ python3 Therapy.py
Test ID 1 passed
Test ID 2 passed
Test ID 3 passed
Decisions(results) [{'Test ID': 1, 'data': {'Encounter Diagnosis': 'Acute Sinusitis', 'Patient Age': 58, 'Patient Allergies': ['Penicillin', 'Streptomycin'], 'Patient Creatinine Level': 2, 'Patient Creatinine Clearance': 44.42, 'Patient Weight': 78, 'Patient Active Medication': 'Coumadin'}, 'newData': {'Result': {'Encounter Diagnosis': 'Acute Sinusitis', 'Recommended Medication': 'Levofloxacin', 'Recommended Dose': '250mg every 24 hours for 14 days', 'Warning': 'Coumadin and Levofloxacin can result in reduced effectiveness of Coumadin.', 'Error Message': None, 'Patient Age': 58.0, 'Patient Weight': 78.0, 'Patient Allergies': ['Penicillin', 'Streptomycin'], 'Patient Creatinine Level': 2.0, 'Patient Creatinine Clearance': 44.416667, 'Patient Active Medication': 'Coumadin'}, 'Executed Rule': ('Check Drug Interaction', 'WarnAboutDrugInteraction', 'Interaction-1')}, 'status': {}}, {'Test ID': 2, 'data': {'Encounter Diagnosis': 'Acute Sinusitis', 'Patient Age': 65, 'Patient Creatinine Level': 1.8, 'Patient Creatinine Clearance': 48.03, 'Patient Weight': 83}, 'newData': {'Result': {'Encounter Diagnosis': 'Acute Sinusitis', 'Recommended Medication': 'Amoxicillin', 'Recommended Dose': '250mg every 24 hours for 14 days', 'Warning': None, 'Error Message': None, 'Patient Age': 65.0, 'Patient Weight': 83.0, 'Patient Allergies': None, 'Patient Creatinine Level': 1.8, 'Patient Creatinine Clearance': 48.032407, 'Patient Active Medication': None}, 'Executed Rule': ('Check Drug Interaction', 'WarnAboutDrugInteraction', 'Interaction-2')}, 'status': {}}, {'Test ID': 3, 'data': {'Encounter Diagnosis': 'Diabetes', 'Patient Age': 27, 'Patient Creatinine Level': 1.88, 'Patient Weight': 110}, 'newData': {'Result': {'Encounter Diagnosis': 'Diabetes', 'Recommended Medication': None, 'Recommended Dose': None, 'Warning': None, 'Error Message': 'Sorry, this decision service can handle only Acute Sinusitis', 'Patient Age': 27.0, 'Patient Weight': 110.0, 'Patient Allergies': None, 'Patient Creatinine Level': 1.88, 'Patient Creatinine Clearance': None, 'Patient Active Medication': None}, 'Executed Rule': ('Create Message', 'ErrorMessage', 'Error-1')}, 'status': {}}]

NOTE

You can keep the Test worksheet in a separate workbook and load it after you have loaded the rules

status = dmnRules.loadTest('TherapyTests.xlsx')

Documentation: More details can be found at readthedocs

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyDMNrules-1.3.4.tar.gz (55.6 kB view hashes)

Uploaded Source

Built Distribution

pyDMNrules-1.3.4-py3-none-any.whl (49.2 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page