Skip to main content

Python bindings for the nmhit NEML2-flavored HIT parser

Project description

neml2-hit

A standalone C++17 parser for a NEML2-flavored dialect of HIT (Hierarchical Input Text) — the hierarchical input format used by MOOSE. This library provides a self-contained, opinionated implementation tailored for use in NEML2 and related projects. It differs from the upstream MOOSE HIT parser in syntax restrictions and API design choices; it is not a general-purpose drop-in replacement. The library depends on Flex & Bison.

The public C++ namespace is nmhit ("NEML2 HIT").


HIT Format

HIT is a simple, human-readable format for hierarchical configuration. A file is a flat sequence of items — sections, key-value fields, comments, blank lines, and file includes — which together form a tree.


Syntax Reference

Comments

A # character begins a comment that extends to the end of the line. Comments are preserved in the AST and are reproduced by render().

# This is a comment
key = value  # inline comments are not supported; this text is part of the value

Note: # is a reserved character in all value positions. It cannot appear inside an unquoted string value or an array element.

Blank lines

One or more consecutive blank lines are preserved as a single Blank node and are reproduced by render().

Sections

A section groups related fields and nested sub-sections.

[section_name]
  key = value
  nested_key = 42
[]

Every section must be closed with []. There is no [../] or [./name] syntax.

Path splitting. A slash in the section header creates the corresponding nesting in the AST:

[mesh/generator]
  type = CartesianMesh
[]

is equivalent to:

[mesh]
  [generator]
    type = CartesianMesh
  []
[]

Sections may appear at the top level or nested inside other sections. Fields and nested sections can appear in any order within a section body.

Fields

A field assigns a value to a name.

key = value

Identifier characters. A field name may contain letters, digits, and any of . / : < > + - * ! _. Slashes in a field name trigger path splitting (see below).

Path splitting. A slash in the field name creates intermediate Section nodes in the AST:

[solver]
  linear/max_iter = 100
[]

is equivalent to:

[solver]
  [linear]
    max_iter = 100
  []
[]

Override assignment. The operators := and :override= are both accepted. The library implements last-override-wins semantics directly: the earlier occurrence of the field is removed from the tree, leaving only the overriding value.

max_iter := 200
max_iter :override= 200   # identical meaning

Values

Every value is one of the following kinds.

Integer

An optional sign followed by one or more decimal digits.

n = 42
n = -7
n = +0

Floating-point number

Standard decimal notation with an optional sign and optional exponent.

x = 3.14
x = -1.0e-3
x = .5
x = 2.
x = 1e10

At least one digit must appear on one side of the decimal point, or an exponent must be present.

The value is stored verbatim as a string. At interpretation time:

  • param<double>() parses it as a 64-bit IEEE 754 double-precision value.
  • param<float>() parses it as double first, then narrows to 32-bit single precision. Values outside the float range become ±inf; values that are representable in double but not exactly in float are rounded to the nearest float.

Boolean

Exactly the two lowercase literals true and false. No other strings (including yes, no, on, off, or any capitalised variant) are accepted.

flag = true
flag = false

Unquoted string

Any sequence of non-whitespace characters that does not begin a number, boolean, quoted string, array, or brace expression, and contains none of [ # $ ' " \.

type = GeneratedMesh
label = some_label
path = /usr/local/share

Unquoted strings are single-line only — they cannot contain whitespace or newlines.

Array (1-D)

A whitespace-delimited sequence of elements enclosed in single quotes or double quotes — both delimiters are completely equivalent. Elements may be integers, floating-point numbers, or unquoted tokens (none of which may contain ;, #, $, ', ", or \).

vals   = '1 2 3'
floats = '1.0 2.5 3.14'
tags   = 'alpha beta gamma'

The two quote styles are interchangeable:

vals = '1 2 3'
vals = "1 2 3"   # identical meaning

An empty array is written as '' or "".

Array contents may span multiple lines — newlines inside the quotes are treated as whitespace:

vals = '
  1 2 3
  4 5 6
'

Array (2-D)

Rows are separated by ;. Each row is a whitespace-delimited sequence of elements, following the same rules as 1-D array elements.

matrix = '1 2 3; 4 5 6; 7 8 9'

The semicolons and surrounding whitespace (including newlines) are flexible:

matrix = '
  1 2 3;
  4 5 6;
  7 8 9
'

Every row must contain at least one element. Trailing semicolons (an empty last row) are a parse error.

Accessing a 2-D array value as a 1-D type (e.g. param<std::vector<int>>) will fail because the semicolons are stored as part of the raw value. Accessing a 1-D array as a 2-D type returns a single-row result.

Brace expressions

A ${...} expression is expanded at value-extraction time (i.e. when param<T>() is called). The raw token is stored in the AST as-is.

The following built-in commands are supported:

Expression Effect
${varname} Look up the field at path varname from the document root and return its string value.
${replace varname} Identical to ${varname}.
${env VARNAME} Substitute the environment variable VARNAME. Returns an empty string when unset.
${raw a b c} Concatenate all arguments literally: abc.

Brace expressions may be nested:

prefix = /opt
lib    = ${raw ${prefix} /lib}   # → /opt/lib

A brace expression may appear as the sole value of a field:

dim = ${mesh/dim}

File inclusion

!include relative/or/absolute/path.i

The referenced file is parsed recursively and its top-level items are spliced into the AST at the point of the !include directive. Relative paths are resolved against the directory of the including file.


Complete Grammar (EBNF)

file        = item* ;
item        = section | field | comment | blank | include ;
section     = '[' path ']' item* '[]' ;
field       = ident ('=' | ':=' | ':override=') value ;
quote       = "'" | '"' ;
value       = integer | float | bool | unquoted_str
            | brace_expr
            | quote array_row (';' array_row)* quote
            | quote quote ;
array_row   = array_elem+ ;
array_elem  = integer | float | unquoted_elem ;
include     = '!include' path ;
comment     = '#' <to end of line> ;
blank       = <two or more consecutive newlines> ;

path        = segment ('/' segment)* ;
segment     = <one or more non-whitespace, non-bracket characters> ;
ident       = [A-Za-z0-9_./<>+\-*!:]+ ;
integer     = [+\-]? [0-9]+ ;
float       = [+\-]? ( [0-9]* '.' [0-9]+ | [0-9]+ '.' [0-9]* ) ([eE] [+\-]? [0-9]+)?
            | [+\-]? [0-9]+ [eE] [+\-]? [0-9]+ ;
            (* stored verbatim; interpreted as double-precision (64-bit IEEE 754) by default,
               narrowed to single-precision (32-bit) when read as float *)
bool        = 'true' | 'false' ;
unquoted_str= [^ \t\n\r\[#$'"\\]+ ;
unquoted_elem=[^ \t\n\r;#$'"\\]+ ;
brace_expr  = '${' <content, brace-depth-tracked> '}' ;

C++ API

Parsing

Two entry points are provided to avoid ambiguity when passing string literals:

#include "nmhit/nmhit.h"

// Read and parse a file from disk.
// Throws nmhit::Error if the file cannot be opened or on syntax errors.
std::unique_ptr<nmhit::Node> root = nmhit::parse_file("my_file.i");

// Parse an in-memory string.
// !include paths are resolved relative to the current working directory.
std::unique_ptr<nmhit::Node> root = nmhit::parse_text("dim = 3\n");

Both functions accept optional pre/post string vectors for injecting HIT snippets (e.g. command-line overrides). All content is concatenated and parsed as a single document, so := override semantics apply globally across all sources:

std::vector<std::string> cli_overrides = { "solver/max_iter := 200" };
auto root = nmhit::parse_file("input.i", /*pre=*/{}, cli_overrides);
auto root = nmhit::parse_text(input_text, /*pre=*/{}, cli_overrides);

Reading values

// Resolve a slash-separated path and return a typed value.
// Throws nmhit::Error if the path does not exist or the value cannot be converted.
int    n  = root->param<int>("mesh/dim");
double x  = root->param<double>("solver/tol");
bool   on = root->param<bool>("output/enabled");

// Return a default when the path is absent (does not throw).
int n = root->param_optional<int>("mesh/dim", 3);

Built-in scalar types: bool, int, unsigned int, int64_t, float, double, std::string.

1-D arrays: std::vector<T> for any built-in or registered scalar T.

2-D arrays: std::vector<std::vector<T>> for any built-in or registered scalar T.

Tree navigation

// Walk direct children, optionally filtered by node type.
for (nmhit::Node * child : root->children())          { ... }
for (nmhit::Node * child : root->children(nmhit::NodeType::Field)) { ... }

// Find a node by relative path (returns nullptr when absent).
nmhit::Node * n = root->find("mesh/dim");

// Walk upward.
nmhit::Node * parent = n->parent();
nmhit::Node * docroot = n->root();

// Full slash-joined path from the root.
std::string fp = n->fullpath();   // e.g. "mesh/dim"

// Source location.
int line = n->line();
int col  = n->column();
std::string file = n->filename();

Inspecting fields

auto * f = dynamic_cast<nmhit::Field *>(root->find("mesh/dim"));
if (f) {
    std::string raw = f->raw_val();    // stored string, e.g. "3" or "'1 2 3'"
    f->set_val("4");                   // replace the stored value
}

Rendering

// Render the tree back to HIT text (preserves comments and blank lines).
std::string text = root->render();

// Custom indentation.
std::string text = root->render(0, "    ");  // 4-space indent

Scalar converters

The same conversions used internally by param<T>() are available as free functions for use on raw strings (e.g. from Field::raw_val()). Surrounding single or double quotes are stripped before conversion. All functions throw nmhit::Error on failure; the optional ctx node is used only to attach file/line/column information to the error.

bool    nmhit::parse_bool  (const std::string & s, const nmhit::Node * ctx = nullptr);
int64_t nmhit::parse_int   (const std::string & s, const nmhit::Node * ctx = nullptr);
double  nmhit::parse_double(const std::string & s, const nmhit::Node * ctx = nullptr);
float   nmhit::parse_float (const std::string & s, const nmhit::Node * ctx = nullptr);

Custom types

Register a scalar parser once before any param<T>() call:

// Registration (e.g. in main() or a static initializer)
nmhit::TypeRegistry::register_parser<MyEnum>(
  [](const std::string & s) -> MyEnum {
    if (s == "linear")    return MyEnum::Linear;
    if (s == "quadratic") return MyEnum::Quadratic;
    throw std::invalid_argument("unknown MyEnum value: " + s);
  }
);

// Usage — all three arities work automatically once T is registered.
MyEnum                          e  = root->param<MyEnum>("order");
std::vector<MyEnum>             v  = root->param<std::vector<MyEnum>>("orders");
std::vector<std::vector<MyEnum>> m = root->param<std::vector<std::vector<MyEnum>>>("order_matrix");

The parser receives the unquoted, brace-expanded token string. Calling param<T>() for an unregistered type throws nmhit::Error.

Thread safety: register_parser is not thread-safe relative to concurrent param calls. Register all custom types before spawning threads that call param.

Errors

All errors throw nmhit::Error, which is a std::exception carrying a vector of nmhit::ErrorMessage (filename, line, column, message).

try {
    auto root = nmhit::parse("input.i", text);
} catch (const nmhit::Error & e) {
    for (auto & msg : e.messages)
        std::cerr << msg.str() << '\n';   // "file.i:10:5: unexpected '}'"
}

Python API

Installation

pip install nmhit

Wheels are published to PyPI for Linux (x86_64, aarch64) and macOS (x86_64, arm64), covering Python 3.9 and later. No Flex or Bison is required.

Quick start

import nmhit

# Parse a file or an in-memory string
root = nmhit.parse_file("input.i")
root = nmhit.parse_text("[mesh]\n  dim = 3\n[]")

# Read typed values via slash-separated paths
dim = root.param_int("mesh/dim")         # int
tol = root.param_float("solver/tol")     # float
on  = root.param_bool("output/enabled")  # bool
tag = root.param_str("type")             # str

# Optional — returns a default when the path is absent
n = root.param_optional_int("mesh/dim", 3)

# 1-D and 2-D arrays
vals   = root.param_list_int("vals")           # list[int]
matrix = root.param_list_list_float("matrix")  # list[list[float]]

parse_text and parse_file accept optional pre and post keyword arguments (lists of HIT strings) for injecting snippets or command-line overrides:

root = nmhit.parse_file("input.i", post=["solver/max_iter := 200"])

Auto-detection with param()

nmhit.param() infers the type from the raw value (bool → int → float → str) and returns a native Python object. Pass an explicit type as the third argument to override inference:

nmhit.param(root, "mesh/dim")           # → 3  (int)
nmhit.param(root, "mesh/dim", float)    # → 3.0
nmhit.param(root, "mesh/dim", str)      # → "3"

Node types and tree navigation

root = nmhit.parse_text("[mesh]\n  dim = 3\n[]")

node = root.find("mesh/dim")            # returns Field, or None if absent
sec  = root.find("mesh")               # returns Section

node.type()      # nmhit.NodeType.Field / .Section / .Root / ...
node.path()      # "dim"
node.fullpath()  # "mesh/dim"
node.line()      # source line number

# Direct children, optionally filtered by type
root.children()                          # list[Node]
root.children(nmhit.NodeType.Section)    # list[Section]

# Walk upward
node.parent()    # parent Node, or None at root
node.root_node() # the Root node

Mutation

# Change a field value in-place
root.find("mesh/dim").set_val("2")

# Add / insert / remove children (cloned into the tree)
root.add_child(nmhit.Field("k", "42"))
root.insert_child(0, nmhit.Field("first", "1"))
removed = root.remove_child("mesh")    # returns the detached node

# Deep copy
root2 = root.clone()

Render

text = root.render()          # default 2-space indent
text = root.render(indent_text="    ")

Errors

All errors raise nmhit.Error (a subclass of RuntimeError). The exception carries a .messages attribute — a list of ErrorMessage objects with line, column, message, and filename fields:

try:
    nmhit.parse_text("[mesh]\n  dim = 3")   # missing []
except nmhit.Error as e:
    for m in e.messages:
        print(m)   # e.g. "<string>:2:9: expected '[]'"

Building

Requirements

Tool Minimum version When required
CMake 3.20 Always
C++ compiler C++17 Always
Flex 2.6 Debug builds only
Bison 3.7 Debug builds only

Pre-generated parser/lexer sources are committed to generated/ and used automatically for non-Debug build types, so end users and CI release builds do not need flex or bison installed.

Configure and build

Release build (no flex or bison required — uses committed generated sources):

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)

Debug build (requires flex ≥ 2.6 and bison ≥ 3.7 — regenerates parser/lexer from source):

cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build -j$(nproc)

After modifying src/Lexer.l or src/Parser.y, run the helper target to refresh the committed sources in generated/ and then commit them:

cmake --build build --target update_generated
git add generated/ && git commit

Pass -DNMHIT_BUILD_TESTS=OFF to skip building the test executable.

Run tests

ctest --test-dir build --output-on-failure

Install

cmake -S . -B build -DCMAKE_INSTALL_PREFIX=/your/prefix
cmake --build build -j$(nproc)
cmake --install build

This installs:

Path Contents
<prefix>/lib/libnmhit.a Static library
<prefix>/include/nmhit/ Public headers
<prefix>/lib/cmake/nmhit/ CMake config files
<prefix>/lib/pkgconfig/nmhit.pc pkg-config file

Use from an installed location

CMake find_package:

find_package(nmhit REQUIRED)
target_link_libraries(myapp PRIVATE nmhit::nmhit)

If the library was installed to a non-standard prefix, point CMake at it:

cmake -S . -B build -Dnmhit_DIR=/your/prefix/lib/cmake/nmhit

pkg-config:

pkg-config --cflags --libs nmhit

If the library was installed to a non-standard prefix:

PKG_CONFIG_PATH=/your/prefix/lib/pkgconfig pkg-config --cflags --libs nmhit

Use as a CMake subdirectory

Add the repository as a subdirectory of your project:

add_subdirectory(neml2-hit)
target_link_libraries(myapp PRIVATE nmhit)

The nmhit target exports include/ as a public include directory, so #include "nmhit/nmhit.h" works without any additional configuration.


License

This project is a sub-component of NEML2 and is distributed under the same license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nmhit-0.1.1.tar.gz (92.9 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

nmhit-0.1.1-cp39-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (232.3 kB view details)

Uploaded CPython 3.9+manylinux: glibc 2.27+ x86-64manylinux: glibc 2.28+ x86-64

nmhit-0.1.1-cp39-abi3-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl (214.9 kB view details)

Uploaded CPython 3.9+manylinux: glibc 2.27+ ARM64manylinux: glibc 2.28+ ARM64

nmhit-0.1.1-cp39-abi3-macosx_11_0_arm64.whl (137.7 kB view details)

Uploaded CPython 3.9+macOS 11.0+ ARM64

nmhit-0.1.1-cp39-abi3-macosx_10_15_x86_64.whl (148.4 kB view details)

Uploaded CPython 3.9+macOS 10.15+ x86-64

File details

Details for the file nmhit-0.1.1.tar.gz.

File metadata

  • Download URL: nmhit-0.1.1.tar.gz
  • Upload date:
  • Size: 92.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for nmhit-0.1.1.tar.gz
Algorithm Hash digest
SHA256 2bd9e5ff8aeb61c020459986fa7ff35619ca22539f0c0335ebebed1a98398f40
MD5 2c21d38285bf4e571f285314c8920fc1
BLAKE2b-256 f5ae40ab07b6237a5661e926a6702e7a78472df70ef1f0d2e1dd9ee60fc15c3d

See more details on using hashes here.

Provenance

The following attestation bundles were made for nmhit-0.1.1.tar.gz:

Publisher: release.yml on applied-material-modeling/neml2-hit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file nmhit-0.1.1-cp39-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for nmhit-0.1.1-cp39-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 068b7cdc1ebbc766e98e62c44a672f322424d230c5d05d9ef05e98f19b71baf7
MD5 93d4cdacbebcc3c00b3e1dfc41922236
BLAKE2b-256 9ced6d197e4b10f4d65109e1f4e3b0d6dbe93b63d312c587b67bc0e1d65b8938

See more details on using hashes here.

Provenance

The following attestation bundles were made for nmhit-0.1.1-cp39-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl:

Publisher: release.yml on applied-material-modeling/neml2-hit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file nmhit-0.1.1-cp39-abi3-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for nmhit-0.1.1-cp39-abi3-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 0acb1f43063667634e4eed19a24e8b4aa2f755099e200e61dd2f54edb0ad9c5e
MD5 aee04a81b7c18f77c1f8a80ca3c128d6
BLAKE2b-256 17e505db3fe8943977063ea669767652a789ce7007de92da930d27a478aaaec6

See more details on using hashes here.

Provenance

The following attestation bundles were made for nmhit-0.1.1-cp39-abi3-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl:

Publisher: release.yml on applied-material-modeling/neml2-hit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file nmhit-0.1.1-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

  • Download URL: nmhit-0.1.1-cp39-abi3-macosx_11_0_arm64.whl
  • Upload date:
  • Size: 137.7 kB
  • Tags: CPython 3.9+, macOS 11.0+ ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for nmhit-0.1.1-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6e45a66162326203cee09425d5afbbfe75591d19b16f1ea64e1d955f714b795b
MD5 1058392a68857ccc8b2f12587c3b0f47
BLAKE2b-256 8658a9424b267b7482e7041dcd2e2b826f7291829ad9594aeb4392cf1086a304

See more details on using hashes here.

Provenance

The following attestation bundles were made for nmhit-0.1.1-cp39-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on applied-material-modeling/neml2-hit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file nmhit-0.1.1-cp39-abi3-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for nmhit-0.1.1-cp39-abi3-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 cb532b2336b417e961be9a6e1052ead2cf9c1d5c5b1b58040910dc20b08c50d7
MD5 eed9445a9416b7d41bda1119fa3cb1f7
BLAKE2b-256 09b801bff9cae48af8b2d573df5325272c4acc088191035d9a2cd3435dbe1cd8

See more details on using hashes here.

Provenance

The following attestation bundles were made for nmhit-0.1.1-cp39-abi3-macosx_10_15_x86_64.whl:

Publisher: release.yml on applied-material-modeling/neml2-hit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page