
Python Library to parse AHB expressions.


A Python package that parses condition expressions from EDI@Energy Anwendungshandbücher (AHB). Since it’s based on lark, we named the module AHBicht.

What is this all about?

The German energy market uses EDIFACT as an intercompany data exchange format. The rules on how to structure and validate the EDIFACT messages are written in

  • one Message Implementation Guide (MIG) per EDIFACT format (for example UTILMD or MSCONS)

  • one Anwendungshandbuch (AHB, en. manual) per use case group (for example GPKE or Wechselprozesse im Messwesen (WiM))

According to the legislation for the German energy market, the organisations in charge of maintaining the documents described above (AHBs and MIGs) are the Bundesverband der Energie- und Wasserwirtschaft (BDEW) and the Bundesnetzagentur (BNetzA). They form a working group named “Arbeitsgruppe EDI@Energy”. This working group publishes the MIGs and AHBs on edi-energy.de. The documents are published as PDFs, which is better than faxing them but far from ideal.

The AHBs contain information on how to structure single EDIFACT messages. To create messages that are valid according to the respective AHB, you have to process information of the kind: UTILMD_AHB_WiM_3_1b_20201016.pdf page 90

In this example, the library parses the string ``Muss [210] U ([182] X ([90] U [183]))`` and determines whether “Details der Prognosegrundlage” is an obligatory field according to the AHB, provided that the status of each individual condition is given. We call this “expression evaluation”.

Note that determining the individual status of [210], [182], [90] and [183] itself (the so-called “content evaluation”, see below) is not within the scope of this parsing library.

The library also parses the new convention using logical operators that takes effect on 2022-04-01 (“MaKo2022”), for example Muss [210] ∧ ([182] ⊻ ([90] ∧ [183])).
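To make the idea of expression evaluation concrete, here is a minimal, self-contained sketch in plain Python. It is not how AHBicht is implemented (the library uses lark and also handles hints, format constraints and unknown states); the names `tokenize` and `evaluate` are made up for this illustration. The sketch accepts both the U/O/X and the ∧/∨/⊻ notation and assumes that every condition status is already known as a plain boolean. The position of xor between "and" and "or" in the precedence is an assumption; the example expression is fully bracketed, so it does not depend on it.

```python
import re

# Both notations map onto the same internal operator name.
OPERATORS = {"U": "and", "∧": "and", "O": "or", "∨": "or", "X": "xor", "⊻": "xor"}
TOKEN = re.compile(r"\s*(\[\d+\]|[()UOX∧∨⊻])")


def tokenize(expression: str) -> list[str]:
    """Split a condition expression into condition keys, operators and brackets."""
    tokens, pos = [], 0
    text = expression.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expression: str, status: dict[int, bool]) -> bool:
    """Evaluate a condition expression; `status` maps condition keys to booleans."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def operand():
        token = take()
        if token == "(":
            value = or_level()
            if take() != ")":
                raise ValueError("missing closing bracket")
            return value
        if token.startswith("["):
            return status[int(token[1:-1])]
        raise ValueError(f"unexpected token {token!r}")

    def binary_level(name, next_level, combine):
        value = next_level()
        while OPERATORS.get(peek()) == name:
            take()
            right = next_level()  # always parse the right side before combining
            value = combine(value, right)
        return value

    # Brackets bind tightest, then "and", then "xor" (assumed), then "or".
    def and_level():
        return binary_level("and", operand, lambda a, b: a and b)

    def xor_level():
        return binary_level("xor", and_level, lambda a, b: a != b)

    def or_level():
        return binary_level("or", xor_level, lambda a, b: a or b)

    result = or_level()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


status = {210: True, 182: False, 90: True, 183: True}
old_style = evaluate("[210] U ([182] X ([90] U [183]))", status)
new_style = evaluate("[210] ∧ ([182] ⊻ ([90] ∧ [183]))", status)
print(old_style, new_style)
```

The requirement indicator (Muss) is stripped before calling `evaluate`; conditions combined without an operator, such as format constraints, are deliberately not handled here.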

Usage and Examples

Jupyter Notebook

For a minimal working example of how the library is used, check out this Jupyter notebook.

Free to Use REST API

You can also use our public REST API to parse condition expressions (other features will follow). Simply send a GET request with the condition expression as query parameter to: ahbicht.azurewebsites.net/api/ParseExpression?expression=[2] U ([3] O [4])[901] U [555]
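Because condition expressions contain brackets and spaces, the query parameter should be URL-encoded before the request is sent. The following sketch only builds the request URL with the standard library; the `https` scheme and the helper name `build_parse_url` are assumptions, and the shape of the response is not described here, so the actual request is left as a comment.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and path are taken from the text above; the https scheme is an assumption.
BASE_URL = "https://ahbicht.azurewebsites.net/api/ParseExpression"


def build_parse_url(expression: str) -> str:
    """Return the GET URL with the expression as an encoded query parameter."""
    return f"{BASE_URL}?{urlencode({'expression': expression})}"


expression = "[2] U ([3] O [4])[901] U [555]"
url = build_parse_url(expression)

# Decoding the query again yields the original expression.
decoded = parse_qs(urlsplit(url).query)["expression"][0]
print(url)

# To actually call the service (requires network access, availability not guaranteed):
# from urllib.request import urlopen
# with urlopen(url) as response:
#     body = response.read().decode("utf-8")
```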

Easily Integrate AHBicht with Your Solution

If you want to use AHBicht together with your own software, you can use the JSON Schema files provided to kick start the integration.

Code Quality / Production Readiness

  • The code has at least a 95% unit test coverage. ✔️

  • The code is rated 10/10 in pylint and type checked with mypy. ✔️

  • The code is MIT licensed. ✔️

  • There are only a few dependencies. ✔️

Expression Evaluation / Parsing the Condition String

Expression evaluation means evaluating expressions like Muss [59] U ([123] O [456]) from the AHBs: the expression is parsed with the parsing library lark, and the parsing result is combined with information about the state of [59], [123] and [456]. Determining the state of each single condition (e.g. [59] is fulfilled, [123] is not fulfilled, [456] is unknown) for a given message is part of the content evaluation (see the Content Evaluation chapter below).

If you’re new to this topic, please read edi-energy.de → Dokumente → Allgemeine Festlegungen first. This document explains, in German, how the Bedingungen (conditions) are supposed to be read.

Functionality

  • Expressions can contain single numbers, e.g. [47], or numbers combined with U/O/X or ∧/∨/⊻ respectively, which are translated to the boolean operators and/or/exclusive or, e.g. [45]U[2]. Conditions can also be combined without an operator, e.g. [930][5], in the case of FormatConstraints.

  • Expressions can contain arbitrary whitespace.

  • Input conditions are passed in form of a ConditionNode, see below.

  • Bedingungen/RequirementConstraints with a boolean value, Hinweise/Hints and Formatdefinitionen/FormatConstraints are functionally implemented so far: the result states whether the condition expression is fulfilled and which Hints and FormatConstraints are relevant.

  • The boolean logic follows ‘brackets ( ) before then_also before and before or’.

  • Hints and UnevaluatedFormatConstraints are implemented as neutral elements: they do not change the boolean outcome of an expression with regard to the requirement constraints, and an error is raised when the expression has no sensible logical outcome.

  • A condition_fulfilled attribute can also take the value unknown.

  • Brackets are supported, e.g. ([43]O[4])U[5].

  • Requirement indicators (i.e. Muss, Soll, Kann, X, O, U) are separated from the condition expressions and, if there is more than one (for modal marks), also separated into single requirement indicator expressions.

  • Format Constraint Expressions that are returned after the requirement condition evaluation can now be parsed and evaluated.

  • Evaluate several modal marks in one ahb_expression: the first one that evaluates to fulfilled is the valid one.
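The separation of requirement indicators and the "first fulfilled modal mark wins" rule from the list above can be sketched as follows. This is an illustration using regular expressions, not the library's parser; `split_ahb_expression` and `first_fulfilled` are hypothetical names, and treating an indicator without any condition expression as fulfilled is an assumption of this sketch.

```python
import re
from typing import Callable, Optional

MODAL_MARK = re.compile(r"(Muss|Soll|Kann)")


def split_ahb_expression(ahb_expression: str) -> list[tuple[str, str]]:
    """Split an ahb expression into (requirement indicator, condition expression) pairs."""
    compact = "".join(ahb_expression.split())  # whitespace carries no meaning
    # With one capture group, re.split yields [before, mark, text, mark, text, ...].
    parts = MODAL_MARK.split(compact)
    if parts[0]:
        # No modal mark at the start: expect a single prefix operator X, O or U.
        match = re.fullmatch(r"([XOU])(.*)", parts[0])
        if match is None or len(parts) > 1:
            raise ValueError(f"cannot split {ahb_expression!r}")
        return [(match.group(1), match.group(2))]
    return [(parts[i], parts[i + 1]) for i in range(1, len(parts), 2)]


def first_fulfilled(ahb_expression: str, is_fulfilled: Callable[[str], bool]) -> Optional[str]:
    """Return the first requirement indicator whose condition expression is fulfilled."""
    for indicator, condition_expression in split_ahb_expression(ahb_expression):
        # Assumption: an indicator without a condition expression always applies.
        if not condition_expression or is_fulfilled(condition_expression):
            return indicator
    return None


pairs = split_ahb_expression("Muss [59] U ([123] O [456]) Soll [53]")
prefix_pairs = split_ahb_expression("X[59]U[53]")

# Toy evaluator for the demonstration: only the expression "[53]" counts as fulfilled.
winner = first_fulfilled("Muss[59]U([123]O[456])Soll[53]", lambda expr: expr == "[53]")
print(pairs, prefix_pairs, winner)
```

In a real setup, `is_fulfilled` would be backed by the expression evaluation described in this chapter rather than by a string comparison.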

In planning

  • Evaluate requirement indicators:

    • Soll, Kann, Muss, X, O, U -> is_required, is_forbidden, etc…

Definition of terms

Each entry gives the term, its description and, where available, an example.

  • condition: a single operand. Example: [53]

  • condition_key: int or str, the number of the condition. Example: 53

  • operator: combines two conditions. Examples: U, O

  • composition: two parts of an expression combined by an operator; the term is used in the context of parsing and evaluating the expression. Example: ([4]U[76])O[5] consists of an and_composition of [4] and [76] and an or_composition of [4]U[76] and [5].

  • ahb expression: an expression as given in the AHB. It consists of at least one single requirement indicator expression. In case of several modal mark expressions, the first one is evaluated; if it is not fulfilled, evaluation continues with the next one. Examples: X[59]U[53], Muss[59]U([123]O[456])Soll[53]

  • single requirement indicator expression: an expression consisting of exactly one requirement indicator and its respective condition expression. If there is only one requirement indicator in the ahb expression, both expressions are identical. Example: Soll[53]

  • condition expression: one or multiple conditions combined with operators or (in the case of FormatConstraints) also without operators; used as input for the condition parser. Examples: [1], [4]O[5]U[45]

  • format constraint expression: is returned after the evaluation of the RequirementConstraints and consists only of FormatConstraints. Example: [901]X[954]

  • requirement indicator: the Merkmal/modal_mark or Operator/prefix_operator of the data element/data element group/segment/segment group. Examples: Muss, Soll, Kann, X, O, U

  • Merkmal / modal_mark: as defined by the EDI Energy group (see edi-energy.de → Dokumente → Allgemeine Festlegungen). Stands alone or before a condition expression; can be the start of several requirement indicator expressions in one ahb expression. Examples: Muss, Soll, Kann

  • prefix operator: an operator which does not combine conditions but acts as a requirement indicator. Stands alone or in front of a condition expression. Examples: X, O, U

  • tree, branches, token: as used by lark.

  • ConditionNode: defines the nodes of the tree as they are passed, evaluated and returned. There are different kinds of conditions (Bedingung, Hinweis, Format) as defined by the EDI Energy group (see edi-energy.de → Dokumente → Allgemeine Festlegungen) and also an EvaluatedComposition after a composition of two nodes is evaluated. Kinds: RequirementConstraint, FormatConstraint, Hint, EvaluatedComposition, RepeatabilityConstraint

  • Bedingung / RequirementConstraint (rc): is true or false, which has to be determined; keys between [1] and [499]. Example: “falls SG2+IDE+CCI == EHZ” (if SG2+IDE+CCI == EHZ)

  • Wiederholbarkeit / RepeatabilityConstraint: gives the minimum and maximum occurrence; keys between [2000] and [2499]. Example: “Segmentgruppe ist mindestens einmal je SG4 IDE+24 (Vorgang) anzugeben” (the segment group has to be given at least once per SG4 IDE+24 (Vorgang))

  • Hinweis / Hint: just a hint, even if it is worded like a condition; keys from [500] onwards, the text starts with ‘Hinweis:’. Examples: “Hinweis: ‘ID der Messlokation’” (ID of the metering location), “Hinweis: ‘Es ist der alte MSB zu verwenden’” (the old MSB is to be used)

  • Formatdefinition / FormatConstraint (fc): a constraint on how the data should be given; keys between [901] and [999], the text starts with ‘Format:’. Format constraints are “collected” while evaluating the rest of the tree, meaning the evaluated composition of the Mussfeldprüfung (mandatory field check) contains an expression that consists only of format constraints. Examples: “Format: Muss größer 0 sein” (must be greater than 0), “Format: max 5 Nachkommastellen” (max. 5 decimal places)

  • UnevaluatedFormatConstraint: a format constraint that is just “collected” during the requirement constraint evaluation. To keep a clear separation between conditions that affect whether a field is mandatory and those that check the format of fields without changing their state, it becomes part of the format_constraint_expression, which is part of the EvaluatedComposition.

  • EvaluatableFormatConstraint: an evaluatable FormatConstraint is (unlike the UnevaluatedFormatConstraint) evaluated by e.g. matching a regex, calculating a checksum etc. This happens after the Mussfeldprüfung. (details to be added upon implementing)

  • EvaluatedComposition: is returned after a composition of two nodes is evaluated.

  • Package Resolver: a class that replaces package nodes in a tree with a sub tree that is derived from a package definition. Replacing package nodes with sub trees is referred to as “package expansion”. Example: “[123P]” is replaced with a tree for “[5]U[6]O[7]”

  • neutral: Hints and UnevaluatedFormatConstraints are seen as neutral as they don’t have a condition to be fulfilled or unfulfilled and should not change the requirement outcome. See truth table below.

  • unknown: the condition can be fulfilled, but we don’t know (yet) whether it is. See truth table below. Example: “Wenn vorhanden” (if present)

The decision if a requirement constraint is met / fulfilled / true is made in the content evaluation module.
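The key ranges given in the definitions above can be turned into a small lookup. The function below is an illustrative sketch, not part of the library's documented API; its name and the returned strings are chosen for this example. Because hints are described as "from [500] onwards", the more specific ranges for format and repeatability constraints are checked first, and every other key from 500 upwards is read literally as a hint.

```python
def classify_condition_key(key: int) -> str:
    """Map a numeric condition key to the kind of ConditionNode it denotes."""
    if 2000 <= key <= 2499:
        return "RepeatabilityConstraint"
    if 901 <= key <= 999:
        return "FormatConstraint"
    if 1 <= key <= 499:
        return "RequirementConstraint"
    if key >= 500:
        # Literal reading of "keys from [500] onwards" for all remaining keys.
        return "Hint"
    raise ValueError(f"{key} is not a valid condition key")


kinds = {key: classify_condition_key(key) for key in (53, 500, 901, 954, 2000)}
print(kinds)
```

Package keys such as [123P] are not numeric and would need to be expanded by a package resolver before a lookup like this applies.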

Program structure

The following diagram shows the structure of the condition check for more than one condition. If it is only a single condition or just a requirement indicator, the respective tree consists of just this token and the result equals the input.


The raw and updated data for this diagram can be found in the draw_io_charts repository and edited under app.diagrams.net with your GitHub Account.

There is also a UML Diagram available (last updated 2022-01-29).

Truth tables

In addition to the usual boolean logic, we also have neutral elements (e.g. Hints, UnevaluatedFormatConstraints and in some cases EvaluatedCompositions) and unknown requirement constraints. They are handled as follows:

and_composition

| A       | B       | A U B   |
|---------|---------|---------|
| Neutral | True    | True    |
| Neutral | False   | False   |
| Neutral | Neutral | Neutral |
| Unknown | True    | Unknown |
| Unknown | False   | False   |
| Unknown | Unknown | Unknown |
| Unknown | Neutral | Unknown |

or_composition

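| A       | B       | A O B               | note                                             |
|---------|---------|---------------------|--------------------------------------------------|
| Neutral | True    | does not make sense |                                                  |
| Neutral | False   | does not make sense |                                                  |
| Neutral | Neutral | Neutral             | no or_compositions of hint and format constraint |
| Unknown | True    | True                |                                                  |
| Unknown | False   | Unknown             |                                                  |
| Unknown | Unknown | Unknown             |                                                  |
| Unknown | Neutral | does not make sense |                                                  |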

xor_composition

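| A       | B       | A X B               | note                                              |
|---------|---------|---------------------|---------------------------------------------------|
| Neutral | True    | does not make sense |                                                   |
| Neutral | False   | does not make sense |                                                   |
| Neutral | Neutral | Neutral             | no xor_compositions of hint and format constraint |
| Unknown | True    | Unknown             |                                                   |
| Unknown | False   | Unknown             |                                                   |
| Unknown | Unknown | Unknown             |                                                   |
| Unknown | Neutral | does not make sense |                                                   |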
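The three truth tables can be expressed compactly in code. The sketch below is a plain-Python illustration and not the library's implementation; the enum `State` and the function names are made up for this example. It assumes that all compositions are symmetric in A and B (the tables list only one order) and that combinations marked "does not make sense" raise an error.

```python
from enum import Enum


class State(Enum):
    TRUE = "True"
    FALSE = "False"
    UNKNOWN = "Unknown"
    NEUTRAL = "Neutral"


def and_composition(a: State, b: State) -> State:
    """A U B: neutral is the identity element, False dominates, then Unknown."""
    if a is State.NEUTRAL:
        return b
    if b is State.NEUTRAL:
        return a
    if State.FALSE in (a, b):
        return State.FALSE
    if State.UNKNOWN in (a, b):
        return State.UNKNOWN
    return State.TRUE


def _reject_mixed_neutral(a: State, b: State, name: str) -> None:
    # Neutral combined with anything but neutral "does not make sense".
    if (a is State.NEUTRAL) != (b is State.NEUTRAL):
        raise ValueError(f"{name} of {a.value} and {b.value} does not make sense")


def or_composition(a: State, b: State) -> State:
    """A O B: True dominates, then Unknown."""
    _reject_mixed_neutral(a, b, "or_composition")
    if a is State.NEUTRAL:
        return State.NEUTRAL
    if State.TRUE in (a, b):
        return State.TRUE
    if State.UNKNOWN in (a, b):
        return State.UNKNOWN
    return State.FALSE


def xor_composition(a: State, b: State) -> State:
    """A X B: any Unknown operand makes the result Unknown."""
    _reject_mixed_neutral(a, b, "xor_composition")
    if a is State.NEUTRAL:
        return State.NEUTRAL
    if State.UNKNOWN in (a, b):
        return State.UNKNOWN
    return State.TRUE if a is not b else State.FALSE


print(and_composition(State.UNKNOWN, State.FALSE))
print(or_composition(State.UNKNOWN, State.TRUE))
print(xor_composition(State.UNKNOWN, State.TRUE))
```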

Link to automatically generate HintsProvider Json content: https://regex101.com/r/za8pr3/5

Content Evaluation

Evaluation is the term used for the processing of single unevaluated conditions. The results of the evaluation of all relevant conditions inside a message can then be used to validate a message. The latter is not part of the evaluation.

This library does not provide content evaluation code for all the conditions used in the available AHBs. You can use the Content Evaluation class stubs though. Please contact @JoschaMetze if you’re interested in a ready-to-use solution to validate your EDIFACT messages according to the latest AHBs. We probably have you covered.

EvaluatableData (Edifact Seed and others)

To evaluate a condition (that is referenced by its key, e.g. “17”), a data basis is needed that allows deciding whether the respective condition is met or not. This data basis, which is stable for all conditions evaluated in one evaluation run, is called EvaluatableData. These data usually contain the edifact seed (a JSON representation of the EDIFACT message) but may also hold other information. The EvaluatableData class acts as a container for these data.

EvaluationContext (Scope and others)

While the data basis is stable, the context in which a condition is evaluated might change during one evaluation run. The same condition can have different evaluation results depending on, e.g., the scope in which it is evaluated. A scope is a (JSON) path that references a specific subtree of the edifact seed. For example, one “Vorgang” (SG4 IDE) in UTILMD could be a scope. If a condition is described as “There has to be exactly one xyz per Vorgang (SG4+IDE)”, then for n Vorgänge there are n scopes:

  • one scope for each Vorgang (paths refer to an edifact seed):

    • $["Dokument"][0]["Nachricht"][0]["Vorgang"][0]

    • $["Dokument"][0]["Nachricht"][0]["Vorgang"][1]

    • $["Dokument"][0]["Nachricht"][0]["Vorgang"][<n-1>]

Each of the single Vorgang scopes can have a different evaluation result. Those results are relevant for users entering data, who probably work in a Vorgang-centric manner.

The EvaluationContext class is a container for the scope and other information that are relevant for a single condition and a single evaluation only but (other than EvaluatableData) might change within an otherwise stable message.
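The relationship between the stable data basis and the changing scope can be illustrated with a small sketch. The two dataclasses below are simplified stand-ins written for this example, not the library's EvaluatableData and EvaluationContext classes; the seed layout follows the paths shown above, while the key "xyz" and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvaluatableDataSketch:
    """Stand-in for the stable data basis of one evaluation run."""
    edifact_seed: dict


@dataclass(frozen=True)
class EvaluationContextSketch:
    """Stand-in for the per-evaluation context; only the scope is modelled."""
    scope: Optional[str]


def vorgang_scopes(data: EvaluatableDataSketch) -> list[tuple[EvaluationContextSketch, dict]]:
    """Return one context per Vorgang together with the subtree it references."""
    scopes = []
    for d, dokument in enumerate(data.edifact_seed.get("Dokument", [])):
        for n, nachricht in enumerate(dokument.get("Nachricht", [])):
            for v, vorgang in enumerate(nachricht.get("Vorgang", [])):
                path = f'$["Dokument"][{d}]["Nachricht"][{n}]["Vorgang"][{v}]'
                scopes.append((EvaluationContextSketch(scope=path), vorgang))
    return scopes


def exactly_one_xyz(vorgang: dict) -> bool:
    """Hypothetical condition: there has to be exactly one xyz per Vorgang."""
    return len(vorgang.get("xyz", [])) == 1


data = EvaluatableDataSketch(
    edifact_seed={"Dokument": [{"Nachricht": [{"Vorgang": [{"xyz": ["a"]}, {"xyz": []}]}]}]}
)
# The same condition is evaluated once per scope and may yield different results.
results = {context.scope: exactly_one_xyz(subtree) for context, subtree in vorgang_scopes(data)}
print(results)
```

Here the data object stays the same for the whole run, while a new context is created for each Vorgang.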


Releasing

The version number has to be changed in the setup.cfg file.

Contributing

You are very welcome to contribute to this repository by opening a pull request against the main branch.

How to use this Repository on Your Machine / Local Setup

Please follow the instructions in our Python Template Repository.

