modm-data: Embedded Hardware Description

This project is a collection of data processing pipelines that convert and combine multiple sources of hardware description data into the most accurate common representation without manual supervision.

There are many different supported input sources per hardware vendor:

  • PDF technical documentation, especially datasheets and reference manuals.
  • Source code and CMSIS-SVD files describing peripheral registers.
  • Vendor libraries for helping with naming things canonically.
  • Proprietary databases extracted from vendor tooling.

Each input source is made accessible via a deterministic data pipeline before all sources are finally merged together. This approach has the best chance of compensating for the weaknesses of each individual input source while also arbitrating conflicts between them. The output formats are knowledge graphs with a shared ontology.
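To make the idea of merging with conflict arbitration concrete, here is a minimal sketch. It is not the project's merge implementation, which is not described in this document: the function `merge_sources`, the majority-vote rule, and the example register offsets are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def merge_sources(sources):
    """Merge per-source {key: value} dicts (hypothetical helper).

    Conflicts are arbitrated by majority vote. A tie yields None so that
    the caller can see the value could not be decided automatically.
    Returns a tuple (merged, conflicts).
    """
    merged, conflicts = {}, {}
    keys = sorted({key for values in sources.values() for key in values})
    for key in keys:
        # Count how many sources report each distinct value for this key.
        votes = Counter(values[key] for values in sources.values() if key in values)
        ranked = votes.most_common()
        winner, count = ranked[0]
        if len(ranked) > 1:
            # Record every disagreement, even if a majority resolves it.
            conflicts[key] = dict(votes)
        tied = len(ranked) > 1 and ranked[1][1] == count
        merged[key] = None if tied else winner
    return merged, conflicts


# Made-up register offsets as three hypothetical sources might report them.
example_sources = {
    "pdf": {"CR1": 0x00, "CR2": 0x04, "SR": 0x08},
    "svd": {"CR1": 0x00, "CR2": 0x04, "SR": 0x0C},
    "header": {"CR1": 0x00, "SR": 0x08},
}
merged, conflicts = merge_sources(example_sources)
print(merged)
print(conflicts)
```

In this sketch a value missing from one source is filled in from the others, and a value that only one source disagrees on is outvoted, which mirrors the stated goal of compensating for the weaknesses of individual sources.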

The resulting knowledge graphs represent a normalized and complete semantic description of the hardware and are NOT intended to be used directly. Rather, you should extract the data you require and convert it into a format that is useful for your specific use case and device scope. This repository contains only data pipeline code. If you are interested only in the hardware description data, please work from the resulting knowledge graphs rather than from this repository.

Warning
The project is still in beta and not fully functional or documented. Improving the documentation and flexibility of the modm_data.pdf2html submodule is the main focus of development right now. No output data other than HTML is currently supported.

Installation

You can install this Python ≥3.11 project via PyPI:

pip install modm-data

You also need g++ and patch installed and callable in your path.
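If you want to verify this requirement before running any pipeline, a small check along the following lines may help. It is only a sketch that uses the standard library's `shutil.which`. The helper name `missing_tools` is made up for this example and is not part of modm-data.

```python
import shutil


def missing_tools(tools=("g++", "patch")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing required tools:", ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```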

Input Sources

You can download all input sources via make input-sources. Please note that it may take a while to download ~10GB of data, mostly PDF technical documentation.

This project uses only publicly available data sources which we have aggregated in several GitHub repositories. However, since the copyright of some sources prohibits republication, these sources are downloaded from the vendor websites directly:

  • STMicro CubeMX database.
  • STMicro PDF technical documentation.

Pipelines

The data pipelines are implemented as Python modules inside the modm_data folder and have the following structure:

flowchart LR
    A(PDF) -->|pdf2html| B
    B -->|html2svd| D
    B(HTML) -->|html| C
    %% C --> K
    C(Python) -->|owl| E
    D(CMSIS-SVD) -->|cmsis-svd| C
    E[OWL]
    F(CMSIS\nHeader) -->|header2svd| D
    G(CubeMX) -->|cubemx| C
    H(CubeHAL) -->|cubehal| C
    J -->|dl| A
    J -->|dl| F
    J -->|dl| G
    J -->|dl| H
    J[Vendor] -->|dl| D
    %% K[Evaluation]

Each pipeline has its own command-line interface; please refer to the API documentation for advanced usage.
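The flowchart can also be read as a directed graph whose edges are the pipelines. The sketch below transcribes those edges by hand and enumerates every chain of pipelines that leads from one format to another. The edge labels are copied from the flowchart. Whether each label matches a module name inside modm_data exactly is an assumption, and the helper `pipeline_paths` is hypothetical.

```python
# (source format, pipeline label, target format), transcribed from the flowchart.
EDGES = [
    ("Vendor", "dl", "PDF"),
    ("Vendor", "dl", "CMSIS Header"),
    ("Vendor", "dl", "CubeMX"),
    ("Vendor", "dl", "CubeHAL"),
    ("Vendor", "dl", "CMSIS-SVD"),
    ("PDF", "pdf2html", "HTML"),
    ("HTML", "html2svd", "CMSIS-SVD"),
    ("HTML", "html", "Python"),
    ("CMSIS Header", "header2svd", "CMSIS-SVD"),
    ("CMSIS-SVD", "cmsis-svd", "Python"),
    ("CubeMX", "cubemx", "Python"),
    ("CubeHAL", "cubehal", "Python"),
    ("Python", "owl", "OWL"),
]


def pipeline_paths(source, target, edges=EDGES):
    """Return every chain of pipeline labels leading from source to target."""
    graph = {}
    for src, label, dst in edges:
        graph.setdefault(src, []).append((label, dst))

    paths = []

    def walk(node, labels, seen):
        if node == target:
            paths.append(labels)
            return
        for label, nxt in graph.get(node, []):
            if nxt not in seen:  # guard against cycles
                walk(nxt, labels + [label], seen | {nxt})

    walk(source, [], {source})
    return paths


for path in pipeline_paths("PDF", "OWL"):
    print(" -> ".join(path))
```

Under this transcription there are two routes from PDF to OWL: one through the `html` pipeline and one that takes a detour through CMSIS-SVD.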

Development

For development you can install the package locally:

pip install -e ".[all]"

To browse the API documentation locally:

pdoc modm_data

Citation

This project is a further development of a peer-reviewed paper published in the Journal of Systems Research (JSys). Please cite this paper when referring to this project:

@article{hauser2023automatically,
  title={{Automatically Extracting Hardware Descriptions from PDF Technical Documentation}},
  author={Hauser, Niklas and Pennekamp, Jan},
  journal={Journal of Systems Research (JSys)},
  volume={3},
  issue={2},
  year={2023},
  doi={10.5070/tbd}
}

The paper itself is based on a master's thesis.
