Python Library to parse AHB expressions.
A Python package that parses condition expressions from EDI@Energy Anwendungshandbücher (AHB). Since it’s based on lark, we named the module AHBicht.
What is this all about?
The German energy market uses EDIFACT as an intercompany data exchange format. The rules on how to structure and validate the EDIFACT messages are written in:
one Message Implementation Guide (MIG) per EDIFACT format (for example UTILMD or MSCONS)
one Anwendungshandbuch (AHB, English: application manual) per use case group (for example GPKE or Wechselprozesse im Messwesen (WiM))
According to the legislation for the German energy market, the organisations in charge of maintaining the documents described above (AHBs and MIGs) are the Bundesverband der Energie- und Wasserwirtschaft (BDEW) and the Bundesnetzagentur (BNetzA). They form a working group named “Arbeitsgruppe EDI@Energy”. This working group publishes the MIGs and AHBs on edi-energy.de. The documents are published as PDFs, which is better than faxing them but far from ideal.
The AHBs contain information on how to structure single EDIFACT messages. To create messages that are valid according to the respective AHB, you have to process information of the kind:
In this example, the library parses the string ``Muss [210] U ([182] X ([90] U [183]))`` and determines whether “Details der Prognosegrundlage” is an obligatory field according to the AHB, provided that the individual status of each condition is given. We call this “expression evaluation”.
Note that determining the individual status of [210], [182], [90] and [183] itself (the so-called “content evaluation”, see below) is not within the scope of this parsing library.
Note also that this library parses the new convention using logical operators that becomes effective 2022-04-01 (“MaKo2022”), for example Muss [210] ∧ ([182] ⊻ ([90] ∧ [183])).
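As a minimal sketch of what expression evaluation means for the example above, the following plain-Python snippet hard-codes the structure of ``[210] U ([182] X ([90] U [183]))`` instead of parsing it. It does not use AHBicht's API; the function name and the boolean arguments are assumptions made for illustration only.

```python
def prognosegrundlage_is_required(c210: bool, c182: bool, c90: bool, c183: bool) -> bool:
    """Hard-coded evaluation of `Muss [210] U ([182] X ([90] U [183]))`.

    U = and, O = or, X = exclusive or. The arguments are the already known
    content evaluation results for the respective condition keys.
    """
    # For booleans, `!=` behaves like an exclusive or.
    return c210 and (c182 != (c90 and c183))


# [210] and [182] fulfilled, [90] and [183] not fulfilled
print(prognosegrundlage_is_required(True, True, False, False))
```

In the library itself the expression is parsed into a tree first, so the same evaluation works for any condition expression rather than a single hard-coded one.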
Usage and Examples
Jupyter Notebook
For a minimal working example of how the library is used, check out this Jupyter notebook.
Free to Use REST API
You can also use our public REST API to parse condition expressions (other features will follow). Simply send a GET request with the condition expression as query parameter to: ahbicht.azurewebsites.net/api/ParseExpression?expression=[2] U ([3] O [4])[901] U [555]
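Because the expression contains brackets and spaces, it should be percent-encoded before it is sent. The sketch below only builds the request URL with the standard library; the ``https`` scheme and the helper name ``build_parse_url`` are assumptions, and the shape of the response is not shown here because it is not described above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as given in the text; the https scheme is an assumption.
BASE_URL = "https://ahbicht.azurewebsites.net/api/ParseExpression"


def build_parse_url(expression: str) -> str:
    """Return the GET URL with the condition expression percent-encoded."""
    return f"{BASE_URL}?{urlencode({'expression': expression})}"


url = build_parse_url("[2] U ([3] O [4])[901] U [555]")
print(url)

# Decoding the query string yields the original expression again.
decoded = parse_qs(urlsplit(url).query)["expression"][0]
print(decoded)
```

The resulting URL can be passed to any HTTP client, for example ``urllib.request.urlopen``.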
Easily Integrate AHBicht with Your Solution
If you want to use AHBicht together with your own software, you can use the JSON Schema files provided to kick start the integration.
There is a fully typed .NET client available: AhbichtClient.NET
Code Quality / Production Readiness
The code has at least a 95% unit test coverage. ✔️
The code is rated 10/10 in pylint and type checked with mypy. ✔️
The code is MIT licensed. ✔️
There are only a few dependencies. ✔️
Expression Evaluation / Parsing the Condition String
Evaluating expressions like Muss [59] U ([123] O [456]) from the AHBs by parsing them with the parsing library lark and combining the parsing result with information about the state of [59], [123] and [456] is called expression evaluation. Determining the state of each single condition (e.g. [59] is fulfilled, [123] is not fulfilled, [456] is unknown) for a given message is part of the content evaluation (see the Content Evaluation chapter).
If you’re new to this topic, please read edi-energy.de → Dokumente → Allgemeine Festlegungen first. This document explains, in German, how the Bedingungen (conditions) are supposed to be read.
Functionality
Expressions can contain single numbers, e.g. [47], or numbers combined with U/O/X or ∧/∨/⊻ respectively, which are translated to the boolean operators and/or/exclusive or, e.g. [45]U[2]. In the case of FormatConstraints, conditions can also be combined without an operator, e.g. [930][5].
Expressions can contain arbitrary whitespace.
Input conditions are passed in form of a ConditionNode, see below.
Bedingungen/RequirementConstraints with a boolean value, Hinweise/Hints and Formatdefinitionen/FormatConstraints are functionally implemented so far: the result states whether the condition expression is fulfilled and which Hints and FormatConstraints are relevant.
The boolean logic follows ‘brackets ( ) before then_also before and before or’.
Hints and UnevaluatedFormatConstraints are implemented as neutral elements: they do not change the boolean outcome of an expression with regard to the requirement constraints. An error is raised when the expression has no sensible logical outcome.
A condition_fulfilled attribute can also take the value unknown.
Brackets e.g. ([43]O[4])U[5]
Requirement indicators (i.e. Muss, Soll, Kann, X, O, U) are separated from the condition expressions and also separated into single requirement indicator expressions if there is more than one (for modal marks).
Format Constraint Expressions that are returned after the requirement condition evaluation can now be parsed and evaluated.
Evaluate several modal marks in one ahb_expression: the first one that evaluates to fulfilled is the valid one.
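The last points of the list above (separating requirement indicators and picking the first fulfilled modal mark) can be sketched with the standard library alone. This is not AHBicht's implementation: ``split_ahb_expression`` and ``first_fulfilled`` are hypothetical helpers, only the modal marks Muss/Soll/Kann are handled, and treating a modal mark without a condition expression as unconditionally valid is an assumption.

```python
import re
from typing import Callable, List, Optional, Tuple

# Only modal marks are split here. The prefix operators X, O and U are left
# out because they also occur as operators inside condition expressions.
_MODAL_MARK = re.compile(r"(Muss|Soll|Kann)")


def split_ahb_expression(ahb_expression: str) -> List[Tuple[str, str]]:
    """Split an ahb expression into (modal mark, condition expression) pairs."""
    # re.split with a capturing group keeps the separators:
    # ['', 'Muss', '[59]U([123]O[456])', 'Soll', '[53]']
    parts = _MODAL_MARK.split(ahb_expression)
    marks, expressions = parts[1::2], parts[2::2]
    return [(mark, expr.strip()) for mark, expr in zip(marks, expressions)]


def first_fulfilled(
    pairs: List[Tuple[str, str]], is_fulfilled: Callable[[str], bool]
) -> Optional[str]:
    """Return the first modal mark whose condition expression is fulfilled."""
    for mark, expression in pairs:
        # Assumption: a modal mark that stands alone applies unconditionally.
        if not expression or is_fulfilled(expression):
            return mark
    return None


pairs = split_ahb_expression("Muss[59]U([123]O[456])Soll[53]")
print(pairs)
# Pretend that only the condition expression "[53]" evaluates to fulfilled.
print(first_fulfilled(pairs, lambda expression: expression == "[53]"))
```

The callback ``is_fulfilled`` stands in for the actual expression evaluation, which needs the parsed tree and the content evaluation results.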
In planning
Evaluate requirement indicators:
Soll, Kann, Muss, X, O, U -> is_required, is_forbidden, etc…
Definition of terms
Term | Description | Example |
---|---|---|
condition | single operand | [53] |
condition_key | int or str, the number of the condition | 53 |
operator | combines two conditions | U, O |
composition | two parts of an expression combined by an operator; used in the context of the parsing and evaluation of the expression | ([4]U[76])O[5] consists of an and_composition of [4] and [76] and an or_composition of [4]U[76] and [5] |
ahb expression | an expression as given in the AHB. Consists of at least one single requirement indicator expression. In case of several modal mark expressions, the first one is evaluated and, if it is not fulfilled, the evaluation continues with the next one. | X[59]U[53]; Muss[59]U([123]O[456])Soll[53] |
single requirement indicator expression | An expression consisting of exactly one requirement indicator and its respective condition expression. If there is only one requirement indicator in the ahb expression, then both expressions are identical. | Soll[53] |
condition expression | one or multiple conditions combined with operators or (in case of FormatConstraints) also without operators; used as input for the condition parser | [1]; [4]O[5]U[45] |
format constraint expression | Is returned after the evaluation of the RequirementConstraints; consists only of FormatConstraints | [901]X[954] |
requirement indicator | The Merkmal/modal_mark or Operator/prefix_operator of the data element/data element group/segment/segment group. | Muss, Soll, Kann, X, O, U |
Merkmal / modal_mark | as defined by the EDI Energy group (see edi-energy.de → Dokumente → Allgemeine Festlegungen). Stands alone or before a condition expression; can be the start of several requirement indicator expressions in one ahb expression | Muss, Soll, Kann |
prefix operator | Operator which does not function to combine conditions, but as requirement indicator. Stands alone or in front of a condition expression. | X, O, U |
tree, branches, token | as used by lark | |
ConditionNode | Defines the nodes of the tree as they are passed, evaluated and returned. There are different kinds of conditions (Bedingung, Hinweis, Format) as defined by the EDI Energy group (see edi-energy.de → Dokumente → Allgemeine Festlegungen) and also an EvaluatedComposition after a composition of two nodes is evaluated. | RequirementConstraint, FormatConstraint, Hint, EvaluatedComposition, RepeatabilityConstraint |
Bedingung / RequirementConstraint (rc) | | “falls SG2+IDE+CCI == EHZ” (if SG2+IDE+CCI == EHZ) |
Wiederholbarkeit / RepeatabilityConstraint | | “Segmentgruppe ist mindestens einmal je SG4 IDE+24 (Vorgang) anzugeben” (the segment group has to be provided at least once per SG4 IDE+24 (Vorgang)) |
Hinweis / Hint | | “Hinweis: ‘ID der Messlokation’” (hint: ID of the metering location); “Hinweis: ‘Es ist der alte MSB zu verwenden’” (hint: the old MSB is to be used) |
Formatdefinition / FormatConstraint (fc) | Format Constraints are “collected” while evaluating the rest of the tree, meaning the evaluated composition of the Mussfeldprüfung (mandatory field check) contains an expression that consists only of format constraints. | “Format: Muss größer 0 sein” (format: must be greater than 0); “Format: max 5 Nachkommastellen” (format: max. 5 decimal places) |
UnevaluatedFormatConstraint | A format constraint that is just “collected” during the requirement constraint evaluation. To clearly separate conditions that affect whether a field is mandatory from those that check the format of fields without changing their state, it becomes a part of the format_constraint_expression, which is part of the EvaluatedComposition. | |
EvaluatableFormatConstraint | An evaluatable FormatConstraint will (unlike the UnevaluatedFormatConstraint) be evaluated by e.g. matching a regex, calculating a checksum etc. This happens after the Mussfeldprüfung. (details to be added upon implementing) | |
EvaluatedComposition | is returned after a composition of two nodes is evaluated | |
Package Resolver | A package resolver is a class that replaces package nodes in a tree with a sub tree that is derived from a package definition. Replacing package nodes with sub trees is referred to as “package expansion”. | “[123P]” is replaced with a tree for “[5]U[6]O[7]” |
neutral | Hints and UnevaluatedFormatConstraints are seen as neutral as they don’t have a condition to be fulfilled or unfulfilled and should not change the requirement outcome. See truth table below. | |
unknown | If the condition can be fulfilled but we don’t know (yet) if it is or not. See truth table below. | “Wenn vorhanden” (if present) |
The decision if a requirement constraint is met / fulfilled / true is made in the content evaluation module.
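To illustrate the idea of “package expansion” from the table above, here is a deliberately simplified sketch that works on the expression string rather than on the parsed tree. The library replaces package nodes in the tree; the string substitution, the helper name ``expand_packages``, the ``packages`` mapping and the assumption that package keys always look like ``[<number>P]`` are all illustrative choices made here.

```python
import re
from typing import Mapping

# Assumption: a package key is a number followed by "P", e.g. [123P].
_PACKAGE = re.compile(r"\[(\d+)P\]")


def expand_packages(expression: str, packages: Mapping[str, str]) -> str:
    """Replace every known package key with its parenthesised expression."""

    def _replace(match: re.Match) -> str:
        key = match.group(1) + "P"
        if key not in packages:
            return match.group(0)  # leave unknown packages untouched
        # Parentheses keep the package a single operand of the outer expression.
        return f"({packages[key]})"

    return _PACKAGE.sub(_replace, expression)


print(expand_packages("[1]U[123P]", {"123P": "[5]U[6]O[7]"}))
```

Wrapping the replacement in parentheses is what makes the string version behave like a sub tree: the expanded package is evaluated as one unit.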
Program structure
The following diagram shows the structure of the condition check for more than one condition. If it is only a single condition or just a requirement indicator, the respective tree consists of just this token and the result equals the input.
The raw and updated data for this diagram can be found in the draw_io_charts repository and edited under app.diagrams.net with your GitHub Account.
There is also a UML diagram available (last updated 2022-01-29).
Truth tables
In addition to the usual boolean logic, we also have neutral elements (e.g. Hints, UnevaluatedFormatConstraints and in some cases EvaluatedCompositions) and unknown requirement constraints. They are handled as follows:
and_composition
A | B | A U B |
---|---|---|
Neutral | True | True |
Neutral | False | False |
Neutral | Neutral | Neutral |
Unknown | True | Unknown |
Unknown | False | False |
Unknown | Unknown | Unknown |
Unknown | Neutral | Unknown |
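The and_composition table can be written down as a small function. This is a sketch, not the library's code: the ``State`` enum is a hypothetical stand-in for the library's own types, and the table is assumed to be symmetric in A and B.

```python
from enum import Enum


class State(Enum):
    """Hypothetical four-valued state; not AHBicht's own type."""

    TRUE = "True"
    FALSE = "False"
    UNKNOWN = "Unknown"
    NEUTRAL = "Neutral"


def and_composition(a: State, b: State) -> State:
    """Combine two states following the and_composition table."""
    # Neutral acts as the identity element: the other operand decides.
    if a is State.NEUTRAL:
        return b
    if b is State.NEUTRAL:
        return a
    # One False operand makes the result False, even next to Unknown.
    if State.FALSE in (a, b):
        return State.FALSE
    if State.UNKNOWN in (a, b):
        return State.UNKNOWN
    return State.TRUE


print(and_composition(State.UNKNOWN, State.FALSE).value)
```

Checking False before Unknown reflects the table row “Unknown, False, False”: once one operand is known to be False, the other one no longer matters.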
or_composition
A | B | A O B | note |
---|---|---|---|
Neutral | True | does not make sense | |
Neutral | False | does not make sense | |
Neutral | Neutral | Neutral | no or_compositions of hint and format constraint |
Unknown | True | True | |
Unknown | False | Unknown | |
Unknown | Unknown | Unknown | |
Unknown | Neutral | does not make sense | |
xor_composition
A | B | A X B | note |
---|---|---|---|
Neutral | True | does not make sense | |
Neutral | False | does not make sense | |
Neutral | Neutral | Neutral | no xor_compositions of hint and format constraint |
Unknown | True | Unknown | |
Unknown | False | Unknown | |
Unknown | Unknown | Unknown | |
Unknown | Neutral | does not make sense | |
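The or_composition and xor_composition tables differ from the and_composition in that a neutral operand next to a non-neutral one is an error. A sketch under stated assumptions: the ``State`` enum is hypothetical, the tables are assumed to be symmetric, and ``ValueError`` is an arbitrary choice for the “does not make sense” rows (the library may raise a different exception).

```python
from enum import Enum


class State(Enum):
    """Hypothetical four-valued state; not AHBicht's own type."""

    TRUE = "True"
    FALSE = "False"
    UNKNOWN = "Unknown"
    NEUTRAL = "Neutral"


def _reject_mixed_neutral(a: State, b: State, name: str) -> None:
    """Neutral combined with anything but Neutral 'does not make sense'."""
    if (a is State.NEUTRAL) != (b is State.NEUTRAL):
        raise ValueError(f"{name} of {a.value} and {b.value} does not make sense")


def or_composition(a: State, b: State) -> State:
    _reject_mixed_neutral(a, b, "or_composition")
    if a is State.NEUTRAL:  # after the check above, b is Neutral as well
        return State.NEUTRAL
    if State.TRUE in (a, b):  # one True is enough, even next to Unknown
        return State.TRUE
    if State.UNKNOWN in (a, b):
        return State.UNKNOWN
    return State.FALSE


def xor_composition(a: State, b: State) -> State:
    _reject_mixed_neutral(a, b, "xor_composition")
    if a is State.NEUTRAL:
        return State.NEUTRAL
    if State.UNKNOWN in (a, b):  # the result always depends on both operands
        return State.UNKNOWN
    return State.TRUE if a is not b else State.FALSE


print(or_composition(State.UNKNOWN, State.TRUE).value)
print(xor_composition(State.UNKNOWN, State.TRUE).value)
```

The two printed results differ because an or_composition is already decided by a single True operand, whereas an exclusive or cannot be decided while one operand is unknown.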
Link to automatically generate HintsProvider Json content: https://regex101.com/r/za8pr3/5
Content Evaluation
Evaluation is the term used for the processing of single unevaluated conditions. The results of the evaluation of all relevant conditions inside a message can then be used to validate a message. The latter is not part of the evaluation.
This library does not provide content evaluation code for all the conditions used in the available AHBs. You can use the Content Evaluation class stubs though. Please contact @JoschaMetze if you’re interested in a ready-to-use solution to validate your EDIFACT messages according to the latest AHBs. We probably have you covered.
EvaluatableData (Edifact Seed and others)
For the evaluation of a condition (that is referenced by its key, e.g. “17”) it is necessary to have a data basis that allows deciding whether the respective condition is met or not. This data basis, which is stable for all conditions that are evaluated in one evaluation run, is called EvaluatableData. It usually contains the edifact seed (a JSON representation of the EDIFACT message) but may also hold other information. The EvaluatableData class acts as a container for this data.
EvaluationContext (Scope and others)
While the data basis is stable, the context in which a condition is evaluated might change during one evaluation run. The same condition can have different evaluation results depending on, for example, the scope in which it is evaluated. A scope is a (JSON) path that references a specific subtree of the edifact seed. For example, one “Vorgang” (SG4 IDE) in UTILMD could be a scope. If a condition is described as
“There has to be exactly one xyz per Vorgang (SG4+IDE)”
then for n Vorgänge there are n scopes, one for each Vorgang (the paths refer to an edifact seed):
$["Dokument"][0]["Nachricht"][0]["Vorgang"][0]
$["Dokument"][0]["Nachricht"][0]["Vorgang"][1]
…
$["Dokument"][0]["Nachricht"][0]["Vorgang"][<n-1>]
Each of the single Vorgang scopes can have a different evaluation result. Those results are relevant for the user when entering data, which probably happens in a Vorgang-centric manner.
The EvaluationContext class is a container for the scope and other information that is relevant for a single condition and a single evaluation only but (unlike EvaluatableData) might change within an otherwise stable message.
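The scope paths listed above can be generated from an edifact seed with a few lines of Python. This is a sketch only: ``vorgang_scopes`` is a hypothetical helper, not part of the library, and it assumes a seed with exactly the ``Dokument`` / ``Nachricht`` / ``Vorgang`` nesting shown in the paths and looks only at the first Dokument and the first Nachricht.

```python
from typing import Any, Dict, Iterator


def vorgang_scopes(edifact_seed: Dict[str, Any]) -> Iterator[str]:
    """Yield one JSON path per Vorgang of the first Nachricht."""
    vorgaenge = edifact_seed["Dokument"][0]["Nachricht"][0]["Vorgang"]
    for index in range(len(vorgaenge)):
        yield f'$["Dokument"][0]["Nachricht"][0]["Vorgang"][{index}]'


# A made-up seed with three (empty) Vorgänge, for illustration only.
seed = {"Dokument": [{"Nachricht": [{"Vorgang": [{}, {}, {}]}]}]}
for scope in vorgang_scopes(seed):
    print(scope)
```

Each yielded path would then be evaluated separately, so that a condition of the kind “exactly one xyz per Vorgang” gets one result per scope.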
Releasing
To create a new release, just create a new release with a new version tag (e.g. v1.2.3) on the releases page of this repository.
Contributing
You are very welcome to contribute to this repository by opening a pull request against the main branch.
How to use this Repository on Your Machine / Local Setup
Please follow the instructions in our Python Template Repository.