Embedded Hardware Description Processor
Semantic Hardware Description
This project is a collection of data processing pipelines that convert and combine multiple sources of hardware description data into the most accurate common representation without manual supervision.
There are many different supported input sources per hardware vendor:
- PDF technical documentation, especially datasheets and reference manuals.
- Source code and CMSIS-SVD files describing peripheral registers.
- Vendor libraries for helping with naming things canonically.
- Proprietary databases extracted from vendor tooling.
Each input source is first made accessible via a deterministic data pipeline, and the results are then merged. This approach has the best chance of compensating for the weaknesses of each individual input source while also arbitrating conflicts between them. The output formats are knowledge graphs with a shared ontology.
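As a rough illustration of what arbitrating conflicts between sources can look like, the sketch below merges per-source key/value records by majority vote and breaks ties with a fixed source priority. It is not the project's actual merge logic, which is not described here. The function name `merge_sources`, the source names, and the register keys are all hypothetical.

```python
from collections import Counter


def merge_sources(candidates, priority):
    """Merge per-source records into one record (hypothetical helper).

    For each key, the value reported by the most sources wins. Ties are
    broken by the first source in `priority` that reports a tied value.
    Values must be hashable.
    """
    merged = {}
    keys = {key for values in candidates.values() for key in values}
    for key in sorted(keys):
        # Count how many sources agree on each value for this key.
        counts = Counter(
            values[key] for values in candidates.values() if key in values
        )
        best = max(counts.values())
        tied = [value for value, count in counts.items() if count == best]
        for source in priority:
            value = candidates.get(source, {}).get(key)
            if value in tied:
                merged[key] = value
                break
        else:
            # No prioritized source reported a tied value: take the first one.
            merged[key] = tied[0]
    return merged


# Made-up example data: three sources disagree on a register width.
sources = {
    "pdf": {"USART1.CR1.offset": 0x00, "USART1.CR1.width": 32},
    "svd": {"USART1.CR1.offset": 0x00, "USART1.CR1.width": 16},
    "header": {"USART1.CR1.width": 32},
}
merged = merge_sources(sources, priority=["svd", "header", "pdf"])
print(merged)
```

Here the width reported by two of the three sources wins over the single dissenting one, even though that source has the highest priority. Priority only matters when the vote is tied.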
The resulting knowledge graphs represent a normalized and complete semantic description of the hardware and are NOT intended to be consumed as-is. Rather, you should extract the data you require and convert it into a format that is useful for your specific use case and device scope. This repository only contains the data pipeline code. If you are only interested in the hardware description data, start from the resulting knowledge graphs rather than from this code.
Warning
The project is still in beta and not fully functional or documented. Improving the documentation and flexibility of the modm_data.pdf2html submodule is the main focus of development right now. No output data other than HTML is currently supported.
Installation
You can install this Python ≥3.11 project via PyPI:
pip install modm-data
You also need g++ and patch installed and callable in your path.
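To check this requirement before running a pipeline, a small script along these lines can help. It is a minimal sketch using only the standard library. The helper name `missing_tools` is made up for this example and is not part of modm_data.

```python
import shutil


def missing_tools(tools=("g++", "patch")):
    """Return the names of the given tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```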
Pipelines
The data pipelines are implemented as Python submodules inside the modm_data folder and have the following structure:
flowchart LR
A(PDF) -->|pdf2html| B
B -->|html2svd| D
B(HTML) -->|html| C
C(Python) -->|owl| E
D(CMSIS-SVD) -->|cmsis-svd| C
E[OWL]
F(CMSIS Header) -->|header2svd| D
I(Open Pin Data) -->|cubemx| C
G(CubeMX) -->|cubemx| C
H(CubeHAL) -->|cubehal| C
J -->|dl| A
J -->|dl| F
J -->|dl| G
J -->|dl| H
J -->|dl| I
J[STMicro] -->|dl| D
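To make the flowchart easier to reason about, the sketch below encodes its edges as a plain list and searches for the shortest chain of pipelines from a given input format to the OWL output. The edge labels are copied from the flowchart above. The `EDGES` list and the `pipeline_route` function are illustrative only and not part of the modm_data API.

```python
from collections import deque

# (source format, pipeline label, target format), copied from the flowchart.
EDGES = [
    ("PDF", "pdf2html", "HTML"),
    ("HTML", "html2svd", "CMSIS-SVD"),
    ("HTML", "html", "Python"),
    ("Python", "owl", "OWL"),
    ("CMSIS-SVD", "cmsis-svd", "Python"),
    ("CMSIS Header", "header2svd", "CMSIS-SVD"),
    ("Open Pin Data", "cubemx", "Python"),
    ("CubeMX", "cubemx", "Python"),
    ("CubeHAL", "cubehal", "Python"),
    ("STMicro", "dl", "PDF"),
    ("STMicro", "dl", "CMSIS Header"),
    ("STMicro", "dl", "CubeMX"),
    ("STMicro", "dl", "CubeHAL"),
    ("STMicro", "dl", "Open Pin Data"),
    ("STMicro", "dl", "CMSIS-SVD"),
]


def pipeline_route(source, target="OWL"):
    """Return the shortest list of pipeline labels from source to target.

    Uses a breadth-first search over EDGES. Returns None if the target
    cannot be reached from the source.
    """
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for src, pipeline, dst in EDGES:
            if src == node and dst not in parents:
                parents[dst] = (node, pipeline)
                queue.append(dst)
    if target not in parents:
        return None
    # Walk back from the target to the source, collecting pipeline labels.
    route = []
    node = target
    while parents[node] is not None:
        node, pipeline = parents[node]
        route.append(pipeline)
    return route[::-1]


print(pipeline_route("PDF"))
print(pipeline_route("CMSIS Header"))
```

Every route to OWL passes through the intermediate Python representation, which is where the individual sources come together.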
Each pipeline has its own command-line interface. Please refer to the API documentation for advanced usage.
Development
For development you can install the package locally:
pip install -e ".[all]"
To browse the API documentation locally:
pdoc --mermaid modm_data
This project uses only publicly available data sources, choosing permissive licenses whenever possible:
- STM32 CMSIS header files: BSD-3-Clause.
- STM32 Open Pin Data: BSD-3-Clause.
- STM32 CMSIS-SVD files: Apache-2.0.
- STMicro CubeMX database: ST SLA.
- STMicro PDF technical documentation: ST SLA.
You can download all input sources via make input-sources. Please note that it
may take a while to download ~10GB of data, mostly PDF technical documentation.
Citation
This project is a further development of a peer-reviewed paper published in the Journal of Systems Research (JSys). Please cite this paper when referring to this project:
@article{HP23,
author = {Hauser, Niklas and Pennekamp, Jan},
title = {{Automatically Extracting Hardware Descriptions from PDF Technical Documentation}},
journal = {Journal of Systems Research},
year = {2023},
volume = {3},
number = {1},
publisher = {eScholarship Publishing},
month = {10},
doi = {10.5070/SR33162446},
code = {https://github.com/salkinium/pdf-data-extraction-jsys-artifact},
code2 = {https://github.com/modm-io/modm-data},
meta = {},
}
The paper itself is based on a master's thesis.