Python bindings for the nmhit NEML2-flavored HIT parser

neml2-hit

A standalone C++17 parser for a NEML2-flavored dialect of HIT (Hierarchical Input Text), the hierarchical input format used by MOOSE. This library provides a self-contained, opinionated implementation tailored for use in NEML2 and related projects. It differs from the upstream MOOSE HIT parser in syntax restrictions and API design choices; it is not a general-purpose drop-in replacement. The lexer and parser are generated with Flex and Bison, but these tools are needed only for Debug builds: pre-generated sources are committed for all other build types, and the Python wheels require neither.

The public C++ namespace is nmhit ("NEML2 HIT").


HIT Format

HIT is a simple, human-readable format for hierarchical configuration. A file is a flat sequence of items — sections, key-value fields, comments, blank lines, and file includes — which together form a tree.


Syntax Reference

Comments

A # character begins a comment that extends to the end of the line. Comments are preserved in the AST and are reproduced by render().

# This is a comment
key = value  # inline comments are not supported; this text is part of the value

Note: # is a reserved character in all value positions. It cannot appear inside an unquoted string value or an array element.

Blank lines

One or more consecutive blank lines are preserved as a single Blank node and are reproduced by render().
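
The collapsing rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not nmhit's implementation; collapse_blank_runs and the "<Blank>" marker are names invented for the illustration.

```python
from itertools import groupby

def collapse_blank_runs(lines):
    """Replace each run of blank lines with a single "<Blank>" marker."""
    out = []
    for is_blank, run in groupby(lines, key=lambda s: s.strip() == ""):
        if is_blank:
            out.append("<Blank>")   # one marker per run, however long
        else:
            out.extend(run)         # non-blank lines pass through unchanged
    return out

items = collapse_blank_runs(["a = 1", "", "", "b = 2", "", "c = 3"])
print(items)   # ['a = 1', '<Blank>', 'b = 2', '<Blank>', 'c = 3']
```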

Sections

A section groups related fields and nested sub-sections.

[section_name]
  key = value
  nested_key = 42
[]

Every section must be closed with []. There is no [../] or [./name] syntax.

Path splitting. A slash in the section header creates the corresponding nesting in the AST:

[mesh/generator]
  type = CartesianMesh
[]

is equivalent to:

[mesh]
  [generator]
    type = CartesianMesh
  []
[]

Sections may appear at the top level or nested inside other sections. Fields and nested sections can appear in any order within a section body.

Fields

A field assigns a value to a name.

key = value

Identifier characters. A field name may contain letters, digits, and any of . / : < > + - * ! _. Slashes in a field name trigger path splitting (see below).

Path splitting. A slash in the field name creates intermediate Section nodes in the AST:

[solver]
  linear/max_iter = 100
[]

is equivalent to:

[solver]
  [linear]
    max_iter = 100
  []
[]
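
The equivalence above can be pictured with nested dictionaries standing in for Section nodes. The helper insert_path is hypothetical and exists only for this sketch; the real AST holds typed nodes rather than dicts.

```python
def insert_path(section, path, value):
    """Store value under a slash-separated path, creating nested dicts on the way."""
    *parents, leaf = path.split("/")
    node = section
    for name in parents:
        node = node.setdefault(name, {})   # intermediate "section"
    node[leaf] = value

tree = {"solver": {}}
insert_path(tree["solver"], "linear/max_iter", "100")
print(tree)   # {'solver': {'linear': {'max_iter': '100'}}}
```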

Override assignment. The operators := and :override= are both accepted. The library implements last-override-wins semantics directly: the earlier occurrence of the field is removed from the tree, leaving only the overriding value.

max_iter := 200
max_iter :override= 200   # identical meaning
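
A rough model of last-override-wins, assuming the document is reduced to an ordered list of (path, operator, value) statements. apply_assignments is a hypothetical helper, and where the surviving field ends up in the tree is an assumption of this sketch, not something the text above specifies.

```python
def apply_assignments(statements):
    """Return the surviving (path, value) pairs after applying overrides."""
    fields = []
    for path, op, value in statements:
        if op in (":=", ":override="):
            # Remove any earlier occurrence so only the overriding value remains.
            fields = [(p, v) for p, v in fields if p != path]
        fields.append((path, value))
    return fields

result = apply_assignments([
    ("max_iter", "=", "100"),
    ("tol", "=", "1e-8"),
    ("max_iter", ":=", "200"),
])
print(result)   # [('tol', '1e-8'), ('max_iter', '200')]
```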

Values

Every value is one of the following kinds.

Integer

An optional sign followed by one or more decimal digits.

n = 42
n = -7
n = +0

Floating-point number

Standard decimal notation with an optional sign and optional exponent.

x = 3.14
x = -1.0e-3
x = .5
x = 2.
x = 1e10

At least one digit must appear on one side of the decimal point, or an exponent must be present.

The value is stored verbatim as a string. At interpretation time:

  • param<double>() parses it as a 64-bit IEEE 754 double-precision value.
  • param<float>() parses it as double first, then narrows to 32-bit single precision. Values outside the float range become ±inf; values that are representable in double but not exactly in float are rounded to the nearest float.
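
The effect of narrowing can be reproduced in Python with the standard struct module. This is a sketch of the arithmetic only: narrow_to_float32 is a name made up for the example, and the overflow branch simply models the ±inf result described above, because struct itself refuses out-of-range values.

```python
import math
import struct

def narrow_to_float32(x):
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        # Outside the float range: model the documented +/-inf result.
        return math.copysign(math.inf, x)

as_double = float("0.1")
as_float = narrow_to_float32(as_double)
print(as_double == as_float)        # False: 0.1 is not exactly representable in 32 bits
print(narrow_to_float32(0.5))       # 0.5 (exactly representable, unchanged)
print(narrow_to_float32(1e40))      # inf
```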

Boolean

Exactly the two lowercase literals true and false. No other strings (including yes, no, on, off, or any capitalised variant) are accepted.

flag = true
flag = false

Unquoted string

Any sequence of non-whitespace characters that does not begin a number, boolean, quoted string, array, or brace expression, and contains none of [ # $ ' " \.

type = GeneratedMesh
label = some_label
path = /usr/local/share

Unquoted strings are single-line only — they cannot contain whitespace or newlines.

Array (1-D)

A whitespace-delimited sequence of elements enclosed in single quotes or double quotes — both delimiters are completely equivalent. Elements may be integers, floating-point numbers, or unquoted tokens (none of which may contain ;, #, $, ', ", or \).

vals   = '1 2 3'
floats = '1.0 2.5 3.14'
tags   = 'alpha beta gamma'

The two quote styles are interchangeable:

vals = '1 2 3'
vals = "1 2 3"   # identical meaning

An empty array is written as '' or "".

Array contents may span multiple lines — newlines inside the quotes are treated as whitespace:

vals = '
  1 2 3
  4 5 6
'

Array (2-D)

Rows are separated by ;. Each row is a whitespace-delimited sequence of elements, following the same rules as 1-D array elements.

matrix = '1 2 3; 4 5 6; 7 8 9'

The semicolons and surrounding whitespace (including newlines) are flexible:

matrix = '
  1 2 3;
  4 5 6;
  7 8 9
'

Every row must contain at least one element. Trailing semicolons (an empty last row) are a parse error.

Accessing a 2-D array value as a 1-D type (e.g. param<std::vector<int>>) will fail because the semicolons are stored as part of the raw value. Accessing a 1-D array as a 2-D type returns a single-row result.
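
The row and element rules can be sketched as follows. The functions split_rows, as_1d and as_2d are hypothetical stand-ins written for this illustration; they return string tokens and leave numeric conversion to the caller.

```python
def split_rows(raw):
    """Split a quoted array literal into rows of string tokens."""
    if len(raw) < 2 or raw[0] not in "'\"" or raw[-1] != raw[0]:
        raise ValueError("array must be enclosed in matching quotes")
    body = raw[1:-1]
    if body.strip() == "":
        return []                              # '' or "" is the empty array
    rows = [row.split() for row in body.split(";")]   # newlines count as whitespace
    if any(len(row) == 0 for row in rows):
        raise ValueError("every row must contain at least one element")
    return rows

def as_1d(raw):
    rows = split_rows(raw)
    if len(rows) > 1:
        raise ValueError("a 2-D array cannot be read as 1-D")
    return rows[0] if rows else []

def as_2d(raw):
    return split_rows(raw)   # a 1-D array comes back as a single row

print(as_2d("'1 2 3; 4 5 6'"))   # [['1', '2', '3'], ['4', '5', '6']]
print(as_2d("'1 2 3'"))          # [['1', '2', '3']]
```

Reading "'1 2 3; 4 5 6;'" raises ValueError in this sketch, mirroring the trailing-semicolon rule.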

Brace expressions

A ${...} expression is expanded at value-extraction time (i.e. when param<T>() is called). The raw token is stored in the AST as-is.

The following built-in commands are supported:

Expression           Effect
${varname}           Look up the field at path varname from the document root and return its string value.
${replace varname}   Identical to ${varname}.
${env VARNAME}       Substitute the environment variable VARNAME. Returns an empty string when unset.
${raw a b c}         Concatenate all arguments literally: abc.

Brace expressions may be nested:

prefix = /opt
lib    = ${raw ${prefix} /lib}   # → /opt/lib

A brace expression may appear as the sole value of a field:

dim = ${mesh/dim}
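
A minimal sketch of how such expansion could work, with innermost expressions evaluated first and brace depth tracked while scanning. The functions expand and evaluate, and the flat dictionary standing in for the document tree, are assumptions of this example rather than nmhit's actual implementation.

```python
import os

def evaluate(inner, lookup):
    """Evaluate the text between ${ and } once nested expressions are expanded."""
    args = inner.split()
    if not args:
        raise ValueError("empty brace expression")
    cmd, rest = args[0], args[1:]
    if cmd == "raw":
        return "".join(rest)                 # concatenate arguments literally
    if cmd == "env" and len(rest) == 1:
        return os.environ.get(rest[0], "")   # empty string when unset
    if cmd == "replace" and len(rest) == 1:
        return lookup[rest[0]]
    if not rest:
        return lookup[cmd]                   # ${varname}
    raise ValueError(f"cannot evaluate: {inner!r}")

def expand(text, lookup):
    """Expand every ${...} in text; lookup maps a field path to its string value."""
    out, i = [], 0
    while i < len(text):
        if not text.startswith("${", i):
            out.append(text[i])
            i += 1
            continue
        depth, j = 1, i + 2
        while j < len(text) and depth > 0:   # find the matching closing brace
            if text.startswith("${", j):
                depth, j = depth + 1, j + 2
            else:
                depth -= text[j] == "}"
                j += 1
        if depth != 0:
            raise ValueError("unterminated brace expression")
        out.append(evaluate(expand(text[i + 2 : j - 1], lookup), lookup))
        i = j
    return "".join(out)

fields = {"prefix": "/opt", "mesh/dim": "3"}
print(expand("${raw ${prefix} /lib}", fields))   # /opt/lib
print(expand("${mesh/dim}", fields))             # 3
```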

File inclusion

!include relative/or/absolute/path.i

The referenced file is parsed recursively and its top-level items are spliced into the AST at the point of the !include directive. Relative paths are resolved against the directory of the including file.
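
The splicing and path-resolution rules might be modelled like this, working on raw lines instead of AST items. load_items is a hypothetical helper; it does no cycle detection and is not how the library itself is written.

```python
import os
import tempfile

def load_items(path):
    """Return a file's lines with every !include replaced by the included lines."""
    base = os.path.dirname(os.path.abspath(path))
    items = []
    with open(path) as fh:
        for line in fh.read().splitlines():
            if line.strip().startswith("!include"):
                target = line.split(None, 1)[1].strip()
                # os.path.join leaves target untouched when it is already absolute.
                items.extend(load_items(os.path.join(base, target)))
            else:
                items.append(line)
    return items

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "sub", "common.i"), "w") as fh:
        fh.write("tol = 1e-8\n")
    with open(os.path.join(tmp, "main.i"), "w") as fh:
        fh.write("dim = 3\n!include sub/common.i\nlabel = demo\n")
    spliced = load_items(os.path.join(tmp, "main.i"))

print(spliced)   # ['dim = 3', 'tol = 1e-8', 'label = demo']
```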


Complete Grammar (EBNF)

file        = item* ;
item        = section | field | comment | blank | include ;
section     = '[' path ']' item* '[]' ;
field       = ident ('=' | ':=' | ':override=') value ;
quote       = "'" | '"' ;
value       = integer | float | bool | unquoted_str
            | brace_expr
            | quote array_row (';' array_row)* quote
            | quote quote ;
array_row   = array_elem+ ;
array_elem  = integer | float | unquoted_elem ;
include     = '!include' path ;
comment     = '#' <to end of line> ;
blank       = <two or more consecutive newlines> ;

path        = segment ('/' segment)* ;
segment     = <one or more non-whitespace, non-bracket characters> ;
ident       = [A-Za-z0-9_./<>+\-*!:]+ ;
integer     = [+\-]? [0-9]+ ;
float       = [+\-]? ( [0-9]* '.' [0-9]+ | [0-9]+ '.' [0-9]* ) ([eE] [+\-]? [0-9]+)?
            | [+\-]? [0-9]+ [eE] [+\-]? [0-9]+ ;
            (* stored verbatim; interpreted as double-precision (64-bit IEEE 754) by default,
               narrowed to single-precision (32-bit) when read as float *)
bool        = 'true' | 'false' ;
unquoted_str= [^ \t\n\r\[#$'"\\]+ ;
unquoted_elem=[^ \t\n\r;#$'"\\]+ ;
brace_expr  = '${' <content, brace-depth-tracked> '}' ;
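
The scalar token rules translate almost directly into regular expressions. The sketch below transcribes the integer, float, bool and unquoted_str productions into Python's re syntax; the function kind and its return labels are invented for the illustration and say nothing about how the Flex lexer is organised.

```python
import re

INTEGER = re.compile(r"[+-]?[0-9]+")
FLOAT = re.compile(
    r"[+-]?(?:[0-9]*\.[0-9]+|[0-9]+\.[0-9]*)(?:[eE][+-]?[0-9]+)?"
    r"|[+-]?[0-9]+[eE][+-]?[0-9]+"
)
UNQUOTED = re.compile(r"""[^ \t\n\r\[#$'"\\]+""")

def kind(token):
    """Classify a scalar token according to the grammar above."""
    if token in ("true", "false"):
        return "bool"
    if INTEGER.fullmatch(token):
        return "integer"
    if FLOAT.fullmatch(token):
        return "float"
    if UNQUOTED.fullmatch(token):
        return "unquoted_str"
    return "invalid"

for token in ["42", "-1.0e-3", ".5", "2.", "1e10", "true", "True", "a#b"]:
    print(token, "->", kind(token))
```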

C++ API

Parsing

Two entry points are provided to avoid ambiguity when passing string literals:

#include "nmhit/nmhit.h"

// Read and parse a file from disk.
// Throws nmhit::Error if the file cannot be opened or on syntax errors.
std::unique_ptr<nmhit::Node> root = nmhit::parse_file("my_file.i");

// Parse an in-memory string.
// !include paths are resolved relative to the current working directory.
std::unique_ptr<nmhit::Node> root_from_text = nmhit::parse_text("dim = 3\n");

Both functions accept optional pre/post string vectors for injecting HIT snippets (e.g. command-line overrides). All content is concatenated and parsed as a single document, so := override semantics apply globally across all sources:

std::vector<std::string> cli_overrides = { "solver/max_iter := 200" };
auto from_file = nmhit::parse_file("input.i", /*pre=*/{}, /*post=*/cli_overrides);
auto from_text = nmhit::parse_text(input_text, /*pre=*/{}, /*post=*/cli_overrides);

Reading values

// Resolve a slash-separated path and return a typed value.
// Throws nmhit::Error if the path does not exist or the value cannot be converted.
int    n  = root->param<int>("mesh/dim");
double x  = root->param<double>("solver/tol");
bool   on = root->param<bool>("output/enabled");

// Return a default when the path is absent (does not throw).
int dim = root->param_optional<int>("mesh/dim", 3);

Built-in scalar types: bool, int, unsigned int, int64_t, float, double, std::string.

1-D arrays: std::vector<T> for any built-in or registered scalar T.

2-D arrays: std::vector<std::vector<T>> for any built-in or registered scalar T.

Tree navigation

// Walk direct children, optionally filtered by node type.
for (nmhit::Node * child : root->children())          { ... }
for (nmhit::Node * child : root->children(nmhit::NodeType::Field)) { ... }

// Find a node by relative path (returns nullptr when absent).
nmhit::Node * n = root->find("mesh/dim");

// Walk upward.
nmhit::Node * parent = n->parent();
nmhit::Node * docroot = n->root();

// Full slash-joined path from the root.
std::string fp = n->fullpath();   // e.g. "mesh/dim"

// Source location.
int line = n->line();
int col  = n->column();
std::string file = n->filename();

Inspecting fields

auto * f = dynamic_cast<nmhit::Field *>(root->find("mesh/dim"));
if (f) {
    std::string raw = f->raw_val();    // stored string, e.g. "3" or "'1 2 3'"
    f->set_val("4");                   // replace the stored value
}

Rendering

// Render the tree back to HIT text (preserves comments and blank lines).
std::string text = root->render();

// Custom indentation.
std::string wide = root->render(0, "    ");  // 4-space indent

Scalar converters

The same conversions used internally by param<T>() are available as free functions for use on raw strings (e.g. from Field::raw_val()). Surrounding single or double quotes are stripped before conversion. All functions throw nmhit::Error on failure; the optional ctx node is used only to attach file/line/column information to the error.

bool    nmhit::parse_bool  (const std::string & s, const nmhit::Node * ctx = nullptr);
int64_t nmhit::parse_int   (const std::string & s, const nmhit::Node * ctx = nullptr);
double  nmhit::parse_double(const std::string & s, const nmhit::Node * ctx = nullptr);
float   nmhit::parse_float (const std::string & s, const nmhit::Node * ctx = nullptr);

Custom types

Register a scalar parser once before any param<T>() call:

// Registration (e.g. in main() or a static initializer)
nmhit::TypeRegistry::register_parser<MyEnum>(
  [](const std::string & s) -> MyEnum {
    if (s == "linear")    return MyEnum::Linear;
    if (s == "quadratic") return MyEnum::Quadratic;
    throw std::invalid_argument("unknown MyEnum value: " + s);
  }
);

// Usage — all three arities work automatically once T is registered.
MyEnum                          e  = root->param<MyEnum>("order");
std::vector<MyEnum>             v  = root->param<std::vector<MyEnum>>("orders");
std::vector<std::vector<MyEnum>> m = root->param<std::vector<std::vector<MyEnum>>>("order_matrix");

The parser receives the unquoted, brace-expanded token string. Calling param<T>() for an unregistered type throws nmhit::Error.

Thread safety: register_parser is not thread-safe relative to concurrent param calls. Register all custom types before spawning threads that call param.
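
The "register a scalar once, get all three arities" idea can be illustrated with a small Python model. Everything here (PARSERS, register_parser, read_scalar, read_1d, read_2d, and the Order enum) is hypothetical and mirrors only the behaviour described above, not the C++ TypeRegistry itself.

```python
from enum import Enum

class Order(Enum):
    LINEAR = "linear"
    QUADRATIC = "quadratic"

PARSERS = {}   # hypothetical registry: target type -> scalar parser

def register_parser(target, parser):
    PARSERS[target] = parser

def read_scalar(target, token):
    if target not in PARSERS:
        raise LookupError(f"no parser registered for {target.__name__}")
    return PARSERS[target](token)

def read_1d(target, raw):
    # The vector form reuses the scalar parser for each element.
    return [read_scalar(target, t) for t in raw.strip("'\"").split()]

def read_2d(target, raw):
    return [[read_scalar(target, t) for t in row.split()]
            for row in raw.strip("'\"").split(";")]

# Register once, before any reads; Order("linear") looks a member up by value.
register_parser(Order, Order)

print(read_scalar(Order, "linear"))
print(read_1d(Order, "'linear quadratic'"))
print(read_2d(Order, "'linear; quadratic linear'"))
```

Doing all registrations up front, as in the last lines, is also what keeps the real registry safe to use from multiple threads.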

Errors

All failures throw nmhit::Error, a std::exception subclass that carries a vector of nmhit::ErrorMessage (filename, line, column, message).

// Requires <iostream> for std::cerr.
try {
    auto root = nmhit::parse_file("input.i");
} catch (const nmhit::Error & e) {
    for (const auto & msg : e.messages)
        std::cerr << msg.str() << '\n';   // e.g. "input.i:10:5: unexpected '}'"
}

Python API

Installation

pip install nmhit

Wheels are published to PyPI for Linux (x86_64, aarch64) and macOS (x86_64, arm64), covering Python 3.9 and later. No Flex or Bison is required.

Quick start

import nmhit

# Parse a file or an in-memory string
root = nmhit.parse_file("input.i")
root = nmhit.parse_text("[mesh]\n  dim = 3\n[]")

# Read typed values via slash-separated paths
dim = root.param_int("mesh/dim")         # int
tol = root.param_float("solver/tol")     # float
on  = root.param_bool("output/enabled")  # bool
tag = root.param_str("type")             # str

# Optional — returns a default when the path is absent
n = root.param_optional_int("mesh/dim", 3)

# 1-D and 2-D arrays
vals   = root.param_list_int("vals")           # list[int]
matrix = root.param_list_list_float("matrix")  # list[list[float]]

parse_text and parse_file accept optional pre and post keyword arguments (lists of HIT strings) for injecting snippets or command-line overrides:

root = nmhit.parse_file("input.i", post=["solver/max_iter := 200"])

Auto-detection with param()

nmhit.param() infers the type from the raw value (bool → int → float → str) and returns a native Python object. Pass an explicit type as the third argument to override inference:

nmhit.param(root, "mesh/dim")           # → 3  (int)
nmhit.param(root, "mesh/dim", float)    # → 3.0
nmhit.param(root, "mesh/dim", str)      # → "3"
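
The inference order can be approximated in plain Python. The function infer below is a hypothetical stand-in that works on a raw string, not on a parsed tree. Python's own int() and float() accept a few spellings the HIT grammar does not (for example "1_000" or "inf"), so this is an approximation of the order of attempts only.

```python
def infer(raw, as_type=None):
    """Convert a raw value string, trying bool, then int, then float, then str."""
    if as_type is not None:
        return as_type(raw)          # explicit type overrides inference
    if raw in ("true", "false"):
        return raw == "true"
    for convert in (int, float):
        try:
            return convert(raw)
        except ValueError:
            pass                     # fall through to the next candidate
    return raw

print(infer("3"))              # 3
print(infer("3", float))       # 3.0
print(infer("3", str))         # 3
print(infer("GeneratedMesh"))  # GeneratedMesh
```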

Node types and tree navigation

root = nmhit.parse_text("[mesh]\n  dim = 3\n[]")

node = root.find("mesh/dim")            # returns Field, or None if absent
sec  = root.find("mesh")               # returns Section

node.type()      # nmhit.NodeType.Field / .Section / .Root / ...
node.path()      # "dim"
node.fullpath()  # "mesh/dim"
node.line()      # source line number

# Direct children, optionally filtered by type
root.children()                          # list[Node]
root.children(nmhit.NodeType.Section)    # list[Section]

# Walk upward
node.parent()    # parent Node, or None at root
node.root_node() # the Root node

Mutation

# Change a field value in-place
root.find("mesh/dim").set_val("2")

# Add / insert / remove children (cloned into the tree)
root.add_child(nmhit.Field("k", "42"))
root.insert_child(0, nmhit.Field("first", "1"))
removed = root.remove_child("mesh")    # returns the detached node

# Deep copy
root2 = root.clone()

Render

text = root.render()          # default 2-space indent
text = root.render(indent_text="    ")

Errors

All errors raise nmhit.Error (a subclass of RuntimeError). The exception carries a .messages attribute — a list of ErrorMessage objects with line, column, message, and filename fields:

try:
    nmhit.parse_text("[mesh]\n  dim = 3")   # missing []
except nmhit.Error as e:
    for m in e.messages:
        print(m)   # e.g. "<string>:2:9: expected '[]'"

Building

Requirements

Tool           Minimum version   When required
CMake          3.20              Always
C++ compiler   C++17             Always
Flex           2.6               Debug builds only
Bison          3.7               Debug builds only

Pre-generated parser/lexer sources are committed to generated/ and used automatically for non-Debug build types, so end users and CI release builds do not need flex or bison installed.

Configure and build

Release build (no flex or bison required — uses committed generated sources):

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)

Debug build (requires flex ≥ 2.6 and bison ≥ 3.7 — regenerates parser/lexer from source):

cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build -j$(nproc)

After modifying src/Lexer.l or src/Parser.y, run the helper target to refresh the committed sources in generated/ and then commit them:

cmake --build build --target update_generated
git add generated/ && git commit

Pass -DNMHIT_BUILD_TESTS=OFF to skip building the test executable.

Run tests

ctest --test-dir build --output-on-failure

Install

cmake -S . -B build -DCMAKE_INSTALL_PREFIX=/your/prefix
cmake --build build -j$(nproc)
cmake --install build

This installs:

Path                              Contents
<prefix>/lib/libnmhit.a           Static library
<prefix>/include/nmhit/           Public headers
<prefix>/lib/cmake/nmhit/         CMake config files
<prefix>/lib/pkgconfig/nmhit.pc   pkg-config file

Use from an installed location

CMake find_package:

find_package(nmhit REQUIRED)
target_link_libraries(myapp PRIVATE nmhit::nmhit)

If the library was installed to a non-standard prefix, point CMake at it:

cmake -S . -B build -Dnmhit_DIR=/your/prefix/lib/cmake/nmhit

pkg-config:

pkg-config --cflags --libs nmhit

If the library was installed to a non-standard prefix:

PKG_CONFIG_PATH=/your/prefix/lib/pkgconfig pkg-config --cflags --libs nmhit

Use as a CMake subdirectory

Add the repository as a subdirectory of your project:

add_subdirectory(neml2-hit)
target_link_libraries(myapp PRIVATE nmhit)

The nmhit target exports include/ as a public include directory, so #include "nmhit/nmhit.h" works without any additional configuration.


License

This project is a sub-component of NEML2 and is distributed under the same license.
