Converts (already scraped) Entscheidungsbaumdiagramm tables to real graphs

rebdhuhn

This repository contains the source code of the Python package rebdhuhn (formerly known as ebdtable2graph). The package converts machine-readable tables that were extracted from .docx files and that model a decision tree (Entscheidungsbaumdiagramm, EBD) into real graphs. These decision trees are part of a regulatory framework for the German energy industry and are used in the inbound checks of the market communication (Marktkommunikation).

Rationale

Assume that you scraped the Entscheidungsbaumdiagramm tables by EDI@Energy from their somewhat "digitized" PDF/DOCX files. (To do so, you can use the package ebdamame, formerly ebddocx2table.) Also assume that the result of your scraping is a rebdhuhn.models.EbdTable.

The package rebdhuhn contains logic to convert your scraped data into a graph. This graph can then be exported e.g. as SVG and/or UML.

How to use rebdhuhn?

Install the package from PyPI:

pip install rebdhuhn

Create an Instance of EbdTable

EbdTable contains the raw data by BDEW in a machine-readable format. Creating instances of EbdTable is out of scope for this package. Ask Hochfrequenz for support on this topic. In the following example, we hard-code the information.

from rebdhuhn.graph_conversion import convert_table_to_graph
from rebdhuhn.models import EbdCheckResult, EbdTable, EbdTableMetaData, EbdTableRow, EbdTableSubRow, EbdGraph

ebd_table: EbdTable  # this is the result of scraping the docx file
ebd_table = EbdTable(  # this data shouldn't be handwritten
    metadata=EbdTableMetaData(
        ebd_code="E_0003",
        chapter="7.39 AD: Bestellung der Aggregationsebene der Bilanzkreissummenzeitreihe auf Ebene der Regelzone",
        sub_chapter="7.39.1 E_0003_Bestellung der Aggregationsebene RZ prüfen",
        role="ÜNB",
    ),
    rows=[
        EbdTableRow(
            step_number="1",
            description="Erfolgt der Eingang der Bestellung fristgerecht?",
            sub_rows=[
                EbdTableSubRow(
                    check_result=EbdCheckResult(result=False, subsequent_step_number=None),
                    result_code="A01",
                    note="Fristüberschreitung",
                ),
                EbdTableSubRow(
                    check_result=EbdCheckResult(result=True, subsequent_step_number="2"),
                    result_code=None,
                    note=None,
                ),
            ],
        ),
        EbdTableRow(
            step_number="2",
            description="Erfolgt die Bestellung zum Monatsersten 00:00 Uhr?",
            sub_rows=[
                EbdTableSubRow(
                    check_result=EbdCheckResult(result=False, subsequent_step_number=None),
                    result_code="A02",
                    note="Gewählter Zeitpunkt nicht zulässig",
                ),
                EbdTableSubRow(
                    check_result=EbdCheckResult(result=True, subsequent_step_number="Ende"),
                    result_code=None,
                    note=None,
                ),
            ],
        ),
    ],
)
assert isinstance(ebd_table, EbdTable)

ebd_graph = convert_table_to_graph(ebd_table)
assert isinstance(ebd_graph, EbdGraph)
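
To build some intuition for what a table-to-graph conversion involves, here is a minimal, library-independent sketch. It is not the rebdhuhn implementation: the plain-tuple table layout and the helper table_to_edges are hypothetical and only mirror the E_0003 example above.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical, simplified stand-in for the table: step number -> list of
# (check result, subsequent step number, result code) tuples.
SubRow = Tuple[bool, Optional[str], Optional[str]]
table: Dict[str, List[SubRow]] = {
    "1": [(False, None, "A01"), (True, "2", None)],
    "2": [(False, None, "A02"), (True, "Ende", None)],
}


def table_to_edges(rows: Dict[str, List[SubRow]]) -> List[Tuple[str, str, str]]:
    """Return (source, target, label) edges for every sub row of every step."""
    edges: List[Tuple[str, str, str]] = []
    for step, sub_rows in rows.items():
        for result, next_step, code in sub_rows:
            label = "ja" if result else "nein"
            # A sub row either points to another step or ends in a result code.
            target = next_step if next_step is not None else code
            if target is None:
                raise ValueError(f"Step {step} has a sub row without a target")
            edges.append((step, target, label))
    return edges


for edge in table_to_edges(table):
    print(edge)
```

Each decision step becomes a node, and each sub row becomes a labeled edge that leads either to the next step or to an outcome such as A01.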

Export as PlantUML

from rebdhuhn import convert_graph_to_plantuml

plantuml_code = convert_graph_to_plantuml(ebd_graph)
with open("e_0003.puml", "w+", encoding="utf-8") as uml_file:
    uml_file.write(plantuml_code)

The file e_0003.puml now looks like this:

@startuml
...
if (<b>1: </b> Erfolgt der Eingang der Bestellung fristgerecht?) then (ja)
else (nein)
    :A01;
    note left
        Fristüberschreitung
    endnote
    kill;
endif
if (<b>2: </b> Erfolgt die Bestellung zum Monatsersten 00:00 Uhr?) then (ja)
    end
else (nein)
    :A02;
    note left
        Gewählter Zeitpunkt nicht zulässig
    endnote
    kill;
endif
@enduml
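
The mapping from one decision step to the PlantUML snippet above can be illustrated with simple string templating. This is only a sketch under the assumption that a step has a single negative outcome with a result code and a note; render_step is a hypothetical helper and not part of rebdhuhn.

```python
def render_step(number: str, question: str, code: str, note: str) -> str:
    """Render one decision step with a negative outcome as PlantUML text."""
    lines = [
        f"if (<b>{number}: </b> {question}) then (ja)",
        "else (nein)",
        f"    :{code};",
        "    note left",
        f"        {note}",
        "    endnote",
        "    kill;",
        "endif",
    ]
    return "\n".join(lines)


snippet = render_step(
    "1",
    "Erfolgt der Eingang der Bestellung fristgerecht?",
    "A01",
    "Fristüberschreitung",
)
print(snippet)
```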

Export the graph as SVG

First, make sure to have a local instance of Kroki up and running via Docker (localhost:8125):

Add the required .env file to the repository root by opening a new terminal session, changing the directory to

cd path\to\rebdhuhn\repository\root

and executing the create_env_file.py script via

python create_env_file.py

Run the Docker Desktop app on your local machine and host the local Kroki instance on port 8125 via

docker-compose up -d
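
Before exporting, it can help to verify that something is actually listening on localhost:8125. The following sketch uses only the Python standard library; is_port_open is a hypothetical helper, and it checks only that a TCP connection can be opened, not that the service behind the port is Kroki.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    # localhost:8125 is the address named in the instructions above.
    print("Port 8125 reachable:", is_port_open("localhost", 8125))
```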

To export the graph as SVG, use

from rebdhuhn import convert_plantuml_to_svg_kroki
from rebdhuhn.kroki import Kroki

kroki_client = Kroki()
svg_code = convert_plantuml_to_svg_kroki(plantuml_code, kroki_client)
with open("e_0003.svg", "w+", encoding="utf-8") as svg_file:
    svg_file.write(svg_code)
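
For debugging, it can be handy to open a diagram directly in the browser. Kroki accepts GET requests in which the diagram source is deflate-compressed and then base64url-encoded. The sketch below builds such a URL for the local instance. It is independent of rebdhuhn: how the Kroki client in rebdhuhn talks to the server is not shown here, and encode_for_kroki and kroki_svg_url are hypothetical helper names.

```python
import base64
import zlib


def encode_for_kroki(diagram_source: str) -> str:
    """Deflate-compress and base64url-encode a diagram for a Kroki GET URL."""
    compressed = zlib.compress(diagram_source.encode("utf-8"), 9)
    return base64.urlsafe_b64encode(compressed).decode("ascii")


def kroki_svg_url(diagram_source: str, base_url: str = "http://localhost:8125") -> str:
    """Build the URL under which Kroki renders the PlantUML source as SVG."""
    return f"{base_url}/plantuml/svg/{encode_for_kroki(diagram_source)}"


example = "@startuml\nAlice -> Bob: hello\n@enduml"
print(kroki_svg_url(example))
```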

How to use this Repository on Your Machine (for development)

Please follow the instructions in our Python Template Repository. For further information, see the Tox Repository.

Contribute

You are very welcome to contribute to this repository by opening a pull request against the main branch.

Related Tools and Context

This repository is part of the Hochfrequenz Libraries and Tools for a truly digitized market communication.
