
Embedded Hardware Description Processor


Semantic Hardware Description

This project is a collection of data processing pipelines that convert and combine multiple sources of hardware description data into the most accurate common representation without manual supervision.

There are many different supported input sources per hardware vendor:

  • PDF technical documentation, especially datasheets and reference manuals.
  • Source code and CMSIS-SVD files describing peripheral registers.
  • Vendor libraries that help with canonical naming.
  • Proprietary databases extracted from vendor tooling.

These input sources are made accessible via deterministic data pipelines before they are finally merged together. This approach has the best chance of compensating for weaknesses in each individual input source while also arbitrating conflicts between them. The output formats are knowledge graphs with a shared ontology.
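As a rough illustration of what merging with conflict arbitration can look like, the sketch below combines values reported by several sources by majority vote and breaks ties with a fixed source priority. This is not the project's actual algorithm: the source names, the priority order, the sample values, and the `merge_sources` helper are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical tie-break order (earlier wins); not taken from modm-data.
SOURCE_PRIORITY = ["svd", "header", "pdf"]


def merge_sources(values_by_source: dict[str, dict[str, int]]) -> dict[str, int]:
    """Merge per-source key/value data by majority vote, then by priority."""
    merged = {}
    keys = set().union(*(vals.keys() for vals in values_by_source.values()))
    for key in sorted(keys):
        # Collect what every source that knows this key says about it.
        votes = {src: vals[key] for src, vals in values_by_source.items() if key in vals}
        counts = Counter(votes.values())
        best = max(counts.values())
        candidates = {value for value, n in counts.items() if n == best}
        # Among the most common values, pick the one from the preferred source.
        for src in SOURCE_PRIORITY:
            if src in votes and votes[src] in candidates:
                merged[key] = votes[src]
                break
    return merged


# Illustrative register offsets; the "svd" source disagrees on TIM1.CR2.
sources = {
    "pdf": {"TIM1.CR1": 0x00, "TIM1.CR2": 0x04},
    "svd": {"TIM1.CR1": 0x00, "TIM1.CR2": 0x08},
    "header": {"TIM1.CR2": 0x04, "TIM1.SMCR": 0x08},
}
print(merge_sources(sources))
```

Here two of the three sources agree on `TIM1.CR2`, so the outlier is outvoted, while `TIM1.SMCR` is taken from the only source that describes it.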

The resulting knowledge graphs represent a normalized and complete semantic description of the hardware and are NOT intended to be used directly. Rather, you should extract the data you require and convert it into a format that is useful for your specific use case and device scope. This repository only contains the data pipeline code. If you are only interested in the hardware description data, please start from the resulting knowledge graphs instead of running the pipelines yourself.
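The extract-and-convert step could look roughly like the sketch below, which filters a small set of facts down to one device and writes them out as JSON. The triple layout, the predicate names, and the `extract_device` helper are hypothetical and do not reflect the project's ontology or API.

```python
import json

# Hypothetical (subject, predicate, object) triples; the real ontology differs.
TRIPLES = [
    ("device_a", "hasPeripheral", "uart1"),
    ("device_a", "hasPeripheral", "spi1"),
    ("device_b", "hasPeripheral", "uart1"),
    ("device_a", "hasFlashSize", 65536),
]


def extract_device(triples, device):
    """Collect all facts about one device into a plain dictionary."""
    result = {}
    for subject, predicate, obj in triples:
        if subject == device:
            result.setdefault(predicate, []).append(obj)
    return result


# Narrow the scope to a single device and emit a use-case-specific format.
print(json.dumps(extract_device(TRIPLES, "device_a"), indent=2, sort_keys=True))
```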

Warning
The project is still in beta and not fully functional or documented. Improving the documentation and flexibility of the modm_data.pdf2html submodule is the main focus of development right now. No output data other than HTML is currently supported.

Installation

This project requires Python ≥3.11 and can be installed from PyPI:

pip install modm-data

You also need g++ and patch installed and callable in your path.
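A quick way to verify these prerequisites before installing is sketched below. It uses only the Python standard library; the `missing_prerequisites` helper is a name made up for this example and is not part of modm-data.

```python
import shutil
import sys


def missing_prerequisites(tools=("g++", "patch"), min_python=(3, 11)):
    """Return a list of human-readable problems; empty if everything is found."""
    problems = []
    if sys.version_info < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]} or newer is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' was not found in PATH")
    return problems


for problem in missing_prerequisites():
    print(problem)
```

If the script prints nothing, the interpreter version and both tools were found.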

Pipelines

The data pipelines are implemented as Python submodules inside the modm_data folder and have the following structure:

flowchart LR
    A(PDF) -->|pdf2html| B
    B -->|html2svd| D
    B(HTML) -->|html| C
    C(Python) -->|owl| E
    D(CMSIS-SVD) -->|cmsis-svd| C
    E[OWL]
    F(CMSIS Header) -->|header2svd| D
    I(Open Pin Data) -->|cubemx| C
    G(CubeMX) -->|cubemx| C
    H(CubeHAL) -->|cubehal| C
    J -->|dl| A
    J -->|dl| F
    J -->|dl| G
    J -->|dl| H
    J -->|dl| I
    J[STMicro] -->|dl| D
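To see which pipelines lie between a given input and output, the flowchart above can be transcribed into an edge list and searched. The sketch below is only a reading aid: the edge list mirrors the diagram, while the `find_route` helper and the breadth-first search are assumptions of this example, not something the project provides.

```python
from collections import deque

# Edges transcribed from the flowchart: (source, pipeline, target).
EDGES = [
    ("PDF", "pdf2html", "HTML"),
    ("HTML", "html2svd", "CMSIS-SVD"),
    ("HTML", "html", "Python"),
    ("Python", "owl", "OWL"),
    ("CMSIS-SVD", "cmsis-svd", "Python"),
    ("CMSIS Header", "header2svd", "CMSIS-SVD"),
    ("Open Pin Data", "cubemx", "Python"),
    ("CubeMX", "cubemx", "Python"),
    ("CubeHAL", "cubehal", "Python"),
    ("STMicro", "dl", "PDF"),
    ("STMicro", "dl", "CMSIS Header"),
    ("STMicro", "dl", "CubeMX"),
    ("STMicro", "dl", "CubeHAL"),
    ("STMicro", "dl", "Open Pin Data"),
    ("STMicro", "dl", "CMSIS-SVD"),
]


def find_route(start, goal):
    """Breadth-first search; returns the shortest list of pipeline names or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, route = queue.popleft()
        if node == goal:
            return route
        for source, pipeline, target in EDGES:
            if source == node and target not in seen:
                seen.add(target)
                queue.append((target, route + [pipeline]))
    return None


print(find_route("PDF", "OWL"))           # ['pdf2html', 'html', 'owl']
print(find_route("CMSIS Header", "OWL"))  # ['header2svd', 'cmsis-svd', 'owl']
```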

Each pipeline has its own command-line interface. Please refer to the API documentation for advanced usage.

Development

For development you can install the package locally:

pip install -e ".[all]"

To browse the API documentation locally:

pdoc --mermaid modm_data

This project uses only publicly available data sources, choosing permissive licenses whenever possible.

You can download all input sources via make input-sources. Please note that downloading the ~10 GB of data, mostly PDF technical documentation, may take a while.

Citation

This project is a further development of a peer-reviewed paper published in the Journal of Systems Research (JSys). Please cite this paper when referring to this project:

@article{HP23,
  author = {Hauser, Niklas and Pennekamp, Jan},
  title = {{Automatically Extracting Hardware Descriptions from PDF Technical Documentation}},
  journal = {Journal of Systems Research},
  year = {2023},
  volume = {3},
  number = {1},
  publisher = {eScholarship Publishing},
  month = {10},
  doi = {10.5070/SR33162446},
  code = {https://github.com/salkinium/pdf-data-extraction-jsys-artifact},
  code2 = {https://github.com/modm-io/modm-data},
  meta = {},
}

The paper itself is based on a master's thesis.

