Convert data + NeXus definitions to NeXus HDF5, with plugins
NeXusCreator
NeXusCreator converts experimental data and NeXus definition files into valid NeXus (.nxs) HDF5 files. It also supports generating NeXus-definition templates (.nxd) from input data. A lightweight plugin system powers both generation and parsing for formats like SPEC and DTA/DAT, including a batteries folder workflow.
Features
- High-performance conversion from .nxd + input data to .nxs (HDF5) files
- Optimized generation of .nxd from input (e.g., SPEC file or DTA/DAT folder)
- Efficient per-scan output for SPEC (-f) plus a master file with external links
- Batteries workflow (folder of DTA/DAT) with fast combined parsing
- Performance-optimized temperature .dat variables written under /entry/experiments/temperature
- Fast template expansion for scan groups in .nxd via @scan_template/placeholders
- Optional high-speed punx validation after .nxs generation
- Rapid CSV export of dataset values from .nxs/HDF5 inputs
- Extensible plugin system for generators and parsers with minimal overhead
- Memory-efficient processing of large datasets and multi-scan files
Installation
- Python 3.9+
- Recommended: a virtual environment
Install from this repo:
python3 -m venv .venv && source .venv/bin/activate
python3 -m pip install .
This installs console scripts:
- nexuscreator (preferred)
- nxc (alias)
To develop/run from source without installing, you can still invoke the CLI directly:
python3 NeXusCreator.py -h
Quick Start
- Generate a NeXus-definition (.nxd) from a SPEC file:
nexuscreator -g out.nxd -i data.spec
- Generate a single-scan template from SPEC for later multi-scan conversion:
nexuscreator -g template.nxd -i data.spec -t
- Generate a .nxd from a batteries folder (DTA/DAT):
nexuscreator -g out.nxd -i /path/to/folder -b batteries
# Directory generation modes
# Single combined .nxd for entire folder
nexuscreator -g out.nxd -i /path/to/folder --single-file
# One .nxd per input file (.dta and .dat)
nexuscreator -g out_prefix_ -i /path/to/folder --multi-file
- Convert an operando EIS folder directly to NeXus (no .nxd required):
# Auto-generate and convert in one step
nexuscreator -i /path/to/eis_folder -b operando_eis -o eis.nxs
# Save the generated definition for reuse
nexuscreator -g eis.nxd -i /path/to/eis_folder -b operando_eis
# Generate a single reusable template (one entry per file_name)
nexuscreator -g eis_template.nxd -i /path/to/eis_folder -b operando_eis -t
The folder must contain Gamry .DTA / _Raw.DTA file pairs following the
naming convention EIS_{CH|DIS}_#cycle_#file for operando measurements or
EIS_{CH|DIS}-static_#file for static (open-circuit) measurements.
- Generate YAML instead of .nxd (with -g):
# Single file → YAML
nexuscreator -g out_dir/ -i data.spec --yaml
# Folder (multi-file) → YAML per input
nexuscreator -g out_dir/ -i /path/to/folder --multi-file --yaml
# Folder (single-file) → one YAML file
nexuscreator -g out_dir/ -i /path/to/folder --single-file --yaml
- Generate a .nxd from an existing HDF5/NeXus file:
nexuscreator -g out.nxd -i data.nxs --hdf5-option links # external links (default)
nexuscreator -g out.nxd -i data.nxs --hdf5-option extract # placeholders + dictionary
- Convert using a NeXus definition and data input:
# Single output
nexuscreator -n def.nxd -i data.spec -o out.nxs
# One file per scan + master file with external links
nexuscreator -n def.nxd -i data.spec -o out.nxs -f
# DTA/DAT single file
nexuscreator -n def.nxd -i data.dta -o out.nxs
# Batteries folder
nexuscreator -n def.nxd -i /path/to/folder -b batteries -o out.nxs
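The operando EIS naming convention described earlier in this Quick Start can be sketched as a regular expression. This is only an illustration under stated assumptions: it assumes that #cycle and #file are decimal integers and that the raw partner of X.DTA is named X_Raw.DTA. The function classify_eis_name is hypothetical and is not part of NeXusCreator.

```python
import re
from typing import Optional

# Assumed reading of the convention (not taken from NeXusCreator's source):
#   operando: EIS_{CH|DIS}_<cycle>_<file>[_Raw].DTA
#   static:   EIS_{CH|DIS}-static_<file>[_Raw].DTA
EIS_NAME = re.compile(
    r"^EIS_(?P<mode>CH|DIS)"
    r"(?:_(?P<cycle>\d+)|-static)"
    r"_(?P<file>\d+)"
    r"(?P<raw>_Raw)?\.DTA$",
    re.IGNORECASE,
)


def classify_eis_name(name: str) -> Optional[dict]:
    """Return the parts of an EIS file name, or None if it does not match."""
    match = EIS_NAME.match(name)
    if match is None:
        return None
    cycle = match.group("cycle")
    return {
        "mode": match.group("mode").upper(),
        "kind": "operando" if cycle is not None else "static",
        "cycle": int(cycle) if cycle is not None else None,
        "file": int(match.group("file")),
        "is_raw": match.group("raw") is not None,
    }
```

A check like this can be run over a folder before conversion to spot files that would not be picked up, if the real parser follows the same pattern.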
Useful options by when they are relevant:
- Always relevant: -i/--input, -o/--output_path, -b/--beamline_name
- When converting (-n): -n/--nexus_definition, -f/--file_per_scan, -I/--icat_proposal_number, --auto-generate-nxd, --pair-dta-raw
- When generating (-g): -g/--generate_nexus_definition, -t/--template, --single-file, --multi-file, --yaml, --hdf5-option
- For directory/batch processing: -r/--recursive, --glob, --glob-spec, --glob-dta, --dry-run, --summary-only, --no-group-dta-folders
- For metadata/schema placement: --metadata-csv, --jsonld-structure, --nxdl-root, --app-def, --export-vars-csv, -d/--dictionary, -D/--debug
- For validation/CSV value export: --validate, --export-values-csv, --export-values-prefix, --csv-delimiter
--hdf5-option modes:
- links (or 1): build definitions with external links to source HDF5/NeXus datasets (default for -g).
- extract (or 2): extract datasets into variables/placeholders (default for -n).
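The mode aliases and per-flow defaults listed above can be summarized in a few lines of Python. This is a sketch of the documented behavior only; resolve_hdf5_option is a hypothetical helper, not a NeXusCreator function, and the error handling is an assumption.

```python
from typing import Optional

# Accepted spellings as documented: names or their numeric aliases.
_ALIASES = {"links": "links", "1": "links", "extract": "extract", "2": "extract"}


def resolve_hdf5_option(value: Optional[str], generating: bool) -> str:
    """Map a --hdf5-option value to 'links' or 'extract'.

    When no value is given, the documented default applies:
    'links' for the -g flow, 'extract' for the -n flow.
    """
    if value is None:
        return "links" if generating else "extract"
    key = str(value).strip().lower()
    if key not in _ALIASES:
        # Assumed behavior: reject anything outside the documented set.
        raise ValueError(f"unknown --hdf5-option value: {value!r}")
    return _ALIASES[key]
```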
See more detailed examples in docs/source/usage.md (or build the Sphinx docs for HTML output).
Citation
If you use NeXusCreator, cite the archived release:
Perez Ponce, H. (2026). NeXusCreator (2.0.0). Zenodo. https://doi.org/10.5281/zenodo.19600002
Machine-readable citation metadata is available in CITATION.cff.
Documentation
- Read the Docs (latest): https://nexuscreator.readthedocs.io/en/latest/
- NeXusCreator Assistant (Custom GPT): https://chatgpt.com/g/g-6979cafeeca48191af6a9027bba4e2d8-nexuscreator-assistant
- Source files live under docs/source and are built with Sphinx (MyST Markdown).
- NeXus description syntax reference: docs/source/nexus-description-syntax.md (Sphinx source) and docs/nexus-description-syntax.md (plain Markdown copy).
- Recent changes: docs/source/whats-new.md
- Install doc dependencies: python -m pip install -r dev-requirements.txt
- Build HTML: make docs (renders into docs/_build/html)
When sphinx-build is not available locally, install it via the developer requirements above.
Open docs/_build/html/index.html in a browser to browse the rendered documentation.
Command Reference (CLI)
Run python3 NeXusCreator.py help or python3 NeXusCreator.py -h for the latest usage.
Always relevant
- help, -h, --help: show help
- -v, --version: print version
- --license: print full Apache-2.0 license text
- --notice: print third-party notices
- --list-beamlines: list accepted values for -b/--beamline_name
- -i, --input PATH: input file or directory
- -o, --output_path PATH: output file or directory
- -b, --beamline_name NAME: beamline context (for example ikft, batteries, peaxis)
Relevant when converting to .nxs (-n)
- -n, --nexus_definition FILE: convert using a NeXus definition
- -f, --file_per_scan: for SPEC, write one .nxs per scan and a master
- -I, --icat_proposal_number NUM: ICAT proposal subfolder in output
- --auto-generate-nxd: with directory input, auto-generate .nxd per file then convert
- --pair-dta-raw: for single .dta, combine with sibling *_raw.dta (or inverse)
Relevant when generating definitions (-g)
- -g, --generate_nexus_definition FILE: build a .nxd from input
- -t, --template: emit a single-scan template
- --single-file: with directory input, generate one combined definition
- --multi-file: with directory input, generate one definition per input file
- --yaml: write YAML (.yaml) instead of .nxd
- --hdf5-option MODE: links (default for -g) or extract (default for -n), accepts 1/2
Relevant for directory/batch processing
- -r, --recursive: scan directories recursively
- --glob PATTERN: filter scanned files (for example *.spec)
- --glob-spec PATTERN: SPEC-only filter (overrides --glob for SPEC)
- --glob-dta PATTERN: DTA/DAT-only filter (overrides --glob for DTA/DAT)
- --dry-run: list matches and planned outputs without writing
- --summary-only: with --dry-run, print only the summary line
- --no-group-dta-folders: process DTA/DAT individually instead of per-folder grouping
Relevant for metadata/schema-guided placement
- --metadata-csv FILE: enrich variables (variable_name, variable_description, units)
- --jsonld-structure FILE: JSON-LD descriptor for parser/generator behavior
- --nxdl-root PATH: path to NXDL definitions for placement guidance
- --app-def NAME: preferred NXDL application (for example NXxas)
- --export-vars-csv FILE: export parsed variables with descriptions/units to CSV
- -d, --dictionary: print the parsed variable dictionary
- -D, --debug: print the current .nxd line being processed
Relevant for validation and value export
- --validate: run punx validation after writing .nxs
- --export-values-csv FILE: export dataset values from .nxs/HDF5 to CSV
- --export-values-prefix PATH: restrict exported datasets to a prefix (must start with /)
- --csv-delimiter CHAR: override CSV delimiter (default ,)
You can also run the script directly if you haven’t installed the package:
python3 NeXusCreator.py -h
Internal Links in .nxd and YAML
You can declare internal links in NeXus Description files to point to an existing
dataset elsewhere in the same file. Use the arrow syntax in .nxd:
scopeP: --> /entry/instrument/11_detector_chamber/scope/scopeP
When converting to HDF5, this becomes an internal soft link at scopeP pointing
to the target dataset.
YAML representation uses a link field for the same construct:
scopeP:
link: /entry/instrument/11_detector_chamber/scope/scopeP
Both forms are supported by the readers and the HDF5 writer.
External Links in .nxd and YAML
You can link to datasets in another NeXus/HDF5 file using the external link
syntax in .nxd:
calibration_2: --> ../calibration/rixsCucalcold_R0001.nxs | /entry/
This creates an HDF5 external link named calibration_2 pointing to the
group /entry/ inside the target file.
YAML representation uses an external mapping with file and path keys:
calibration_2:
external:
file: ../calibration/rixsCucalcold_R0001.nxs
path: /entry/
Both forms are supported by the YAML and .nxd readers and are written as HDF5 external links during conversion.
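To make the correspondence between the arrow syntax and the YAML mappings concrete, the following sketch converts a single .nxd link line into the equivalent mapping. It covers only the two line shapes shown above; parse_link_line is a hypothetical helper written for illustration and is not the NeXusCreator reader.

```python
def parse_link_line(line: str) -> dict:
    """Turn 'name: --> target' into the mapping shape used in the YAML examples.

    A target containing '|' is treated as 'file | path' (external link);
    anything else is treated as an internal link path.
    """
    name, arrow, target = line.partition(": -->")
    if not arrow:
        raise ValueError(f"not a link line: {line!r}")
    name, target = name.strip(), target.strip()
    if "|" in target:
        file_part, _, path_part = target.partition("|")
        return {
            name: {
                "external": {"file": file_part.strip(), "path": path_part.strip()}
            }
        }
    return {name: {"link": target}}
```

Applied to the two examples in this section, the function yields a link entry for scopeP and an external entry with file and path keys for calibration_2.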
Plugins
NeXusCreator auto-discovers plugins under Plugins/:
- Definition generators build .nxd objects for the -g flow
- Data parsers create the variable library used for injection during conversion
Built-in plugins:
- SPEC (plugins/spec_plugin.py): generator + parser
- DTA/DAT (plugins/dta_plugin.py): generator + parser; supports RAW/temp/non-RAW and batteries folder
- HDF5/NeXus (plugins/hdf5_plugin.py): generator + parser; generate .nxd by linking datasets in existing .nxs/.h5
- TIF (plugins/tiff_plugin.py): generator + parser; reads .tif/.tiff and creates an image-centric definition
Write your own plugin by adding a module to plugins/ that subclasses the base interfaces; no registration is required. See docs/source/plugins.md for details and examples.
You can also load external plugin files/directories without editing this repository:
export NEXUSCREATOR_PLUGIN_PATHS="/path/to/my_plugins:/path/to/custom_plugin.py"
Set NEXUSCREATOR_PLUGIN_DEBUG=1 to print plugin import diagnostics.
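The NEXUSCREATOR_PLUGIN_PATHS value in the example mixes a directory and a single .py file, separated by a colon. The sketch below shows one way such a value could be split and sorted into the two kinds. It is an assumption about the format based on that example alone; split_plugin_paths is hypothetical and how NeXusCreator itself interprets the variable may differ.

```python
import os


def split_plugin_paths(value: str, sep: str = ":") -> dict:
    """Split a plugin path list into .py files and directories.

    Assumes ':' as the separator, as in the export example above, and
    classifies entries by their '.py' suffix without touching the filesystem.
    """
    entries = [part.strip() for part in value.split(sep) if part.strip()]
    return {
        "files": [p for p in entries if p.endswith(".py")],
        "directories": [p for p in entries if not p.endswith(".py")],
    }


if __name__ == "__main__":
    # Inspect what is currently configured in the environment, if anything.
    print(split_plugin_paths(os.environ.get("NEXUSCREATOR_PLUGIN_PATHS", "")))
```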
Notes on directory generation with -g
- Default for directory input is multi-file (one .nxd per supported input).
- If -b peaxis is set for a directory, the default switches to single-file (combined) generation.
- Use --single-file to produce one combined .nxd from a folder (batteries-style structure for DTA/DAT folders).
- --single-file and --multi-file are mutually exclusive. The CLI will error if both are passed.
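The rules above amount to a small decision function. The sketch below restates them in Python for clarity; resolve_directory_mode is a hypothetical name and the exception type is an assumption, since the notes only say that the CLI reports an error.

```python
from typing import Optional


def resolve_directory_mode(
    beamline: Optional[str] = None,
    single_file: bool = False,
    multi_file: bool = False,
) -> str:
    """Return 'single-file' or 'multi-file' for directory input with -g."""
    if single_file and multi_file:
        # Documented: the two flags are mutually exclusive.
        raise ValueError("--single-file and --multi-file are mutually exclusive")
    if single_file:
        return "single-file"
    if multi_file:
        return "multi-file"
    # No explicit flag: peaxis defaults to combined output, everything else
    # to one definition per supported input.
    return "single-file" if beamline == "peaxis" else "multi-file"
```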
Temperature data location:
- For both single .dat inputs and batteries folders, temperature datasets are placed under /entry/experiments/temperature.
Further reading: docs/architecture.md, docs/usage.md.
Python API
Programmatic conversion using the package API:
from nexuscreator import NeXusCreator, create_nexus
# One-shot helper
out_path = create_nexus(nexus_definition_file="def.nxd",
                        input_path="data.spec",
                        output_path="out.nxs",
                        beamline_name="ikft")
# Lower-level: execute using flags
NeXusCreator().execute_conversion({
    'nexus_definition_file': 'def.nxd',
    'input_path': 'data.spec',
    'output_path': 'out.nxs',
    'beamline_name': 'ikft',
})
Advanced Usage
from nexuscreator import NeXusCreator
creator = NeXusCreator()
# Batch processing multiple files
files_to_process = [
    {'input': 'data1.spec', 'output': 'out1.nxs'},
    {'input': 'data2.spec', 'output': 'out2.nxs'},
    {'input': 'data3.spec', 'output': 'out3.nxs'},
]
for file_info in files_to_process:
    creator.execute_conversion({
        'nexus_definition_file': 'template.nxd',
        'input_path': file_info['input'],
        'output_path': file_info['output'],
        'beamline_name': 'ikft',
    })
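For larger batches, the list of input/output pairs can be built from a directory instead of being typed by hand. The following sketch uses only the standard library; build_jobs is a hypothetical helper, and the choice to name each output after the input's stem is an assumption rather than a NeXusCreator convention.

```python
from pathlib import Path
from typing import Dict, List, Union


def build_jobs(
    input_dir: Union[str, Path],
    output_dir: Union[str, Path],
    pattern: str = "*.spec",
) -> List[Dict[str, str]]:
    """Pair every file matching pattern with an output path of the same stem."""
    jobs = []
    # Sorting keeps the processing order stable between runs.
    for source in sorted(Path(input_dir).glob(pattern)):
        target = Path(output_dir) / (source.stem + ".nxs")
        jobs.append({"input": str(source), "output": str(target)})
    return jobs
```

The returned dictionaries use the same 'input' and 'output' keys as the files_to_process list above, so they can be fed to the same loop.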
Performance Tips
- Reuse a prepared .nxd when converting many similar inputs.
- Batch related files together to reduce setup overhead.
- Use --dry-run first for large directory inputs to confirm file matching.
- Enable --validate only when you need NXDL validation, since it adds work after writing.
NeXus Definition Syntax
A nexus_object is a nested dictionary that defines the structure of a NeXus file. It uses specific keys and conventions to represent NeXus groups, datasets, attributes, and links.
Basic Structure
A nexus_object typically starts with an entry group, which is the root of the NeXus file:
nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        # Additional groups and datasets go here
    }
}
Groups
Groups are represented as nested dictionaries. The @NX_class key specifies the NeXus class of the group:
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'instrument': {
            '@NX_class': 'NXinstrument',
            # Instrument-specific groups and datasets
        },
        'sample': {
            '@NX_class': 'NXsample',
            # Sample-specific groups and datasets
        }
    }
}
Datasets
Datasets are represented as dictionaries with specific keys:
- @dtype: The data type (e.g., NX_FLOAT64, NX_INT32).
- @value: The value of the dataset. This can be a direct value or a placeholder.
- @units: The units of the dataset (optional).
- @description: A description of the dataset (optional).
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'temperature': {
            '@dtype': 'NX_FLOAT64',
            '@value': 298.15,
            '@units': 'K',
            '@description': 'Sample temperature'
        }
    }
}
Placeholders
Placeholders are used to inject values from a library during conversion. The @value key specifies the placeholder name:
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'temperature': {
            '@dtype': 'NX_FLOAT64',
            '@value': 'temp_value',
            '@units': 'K'
        }
    }
}
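As a rough illustration of what injection means for a structure like this, the sketch below walks a nested dictionary and replaces every string @value that matches a key in a variable library. It is a simplified stand-in written for this explanation; inject_values is hypothetical, and the real NexusValueInjector may resolve placeholders differently.

```python
from typing import Any, Mapping


def inject_values(node: Any, library: Mapping[str, Any]) -> Any:
    """Return a new nested dict with matching '@value' placeholders resolved.

    Only string '@value' entries that are keys of library are replaced;
    everything else, including unknown placeholders, is passed through.
    """
    if not isinstance(node, dict):
        return node
    resolved = {}
    for key, value in node.items():
        if key == "@value" and isinstance(value, str) and value in library:
            resolved[key] = library[value]
        else:
            resolved[key] = inject_values(value, library)
    return resolved


if __name__ == "__main__":
    definition = {
        "entry": {
            "@NX_class": "NXentry",
            "temperature": {"@dtype": "NX_FLOAT64", "@value": "temp_value", "@units": "K"},
        }
    }
    print(inject_values(definition, {"temp_value": 298.15}))
```

Returning a new dictionary rather than editing in place keeps the definition reusable across several inputs.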
Links
Links allow you to reference datasets in the same file (internal links) or in another file (external links).
Internal Links
Internal links use the link key to point to another dataset in the same file:
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'data': {
            '@NX_class': 'NXdata',
            'signal': {
                '@dtype': 'NX_FLOAT64',
                '@value': [1.0, 2.0, 3.0]
            },
            'signal_link': {
                'link': '/entry/data/signal'
            }
        }
    }
}
External Links
External links use the external key to point to a dataset in another file:
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'calibration': {
            'external': {
                'file': 'calibration.nxs',
                'path': '/entry/calibration/data'
            }
        }
    }
}
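The four node shapes introduced so far (group, dataset, internal link, external link) can be told apart by their keys. The sketch below walks a nexus_object and reports each node's path and kind. The classification rules are assumptions drawn from the examples in this section, and node_kind and walk are hypothetical helpers, not library functions.

```python
from typing import Iterator, Tuple


def node_kind(node: dict) -> str:
    """Classify a node by the keys used in the examples of this section."""
    if "external" in node:
        return "external link"
    if "link" in node:
        return "internal link"
    if "@NX_class" in node:
        return "group"
    return "dataset"


def walk(node: dict, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (path, kind) for every child node, descending into groups only."""
    for key, child in node.items():
        # Keys starting with '@' are attributes, not child nodes.
        if key.startswith("@") or not isinstance(child, dict):
            continue
        child_path = f"{path}/{key}"
        kind = node_kind(child)
        yield child_path, kind
        if kind == "group":
            yield from walk(child, child_path)
```

Listing the tree this way is a quick sanity check of a hand-built definition before handing it to the writer.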
Attributes
Attributes are represented as keys starting with @ in the dataset or group dictionary:
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        '@default': 'data',
        'data': {
            '@NX_class': 'NXdata',
            '@signal': 'signal',
            '@axes': 'axes',
            'signal': {
                '@dtype': 'NX_FLOAT64',
                '@value': [1.0, 2.0, 3.0],
                '@units': 'counts',
                '@description': 'Measurement signal'
            }
        }
    }
}
Creating a nexus_object
To create a nexus_object, you can either:
- Write a .nxd file and use the NeXusDefinition library to read it.
- Directly construct the dictionary in Python.
Example: Direct Construction
nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        'instrument': {
            '@NX_class': 'NXinstrument',
            'detector': {
                '@NX_class': 'NXdetector',
                'data': {
                    '@dtype': 'NX_FLOAT64',
                    '@value': [1.0, 2.0, 3.0],
                    '@units': 'counts',
                    '@description': 'Detector counts'
                }
            }
        },
        'sample': {
            '@NX_class': 'NXsample',
            'temperature': {
                '@dtype': 'NX_FLOAT64',
                '@value': 'temp_placeholder',
                '@units': 'K',
                '@description': 'Sample temperature'
            }
        }
    }
}
Example: Using a .nxd File
Create a file named example.nxd:
@default: entry
entry: {
    @NX_class: NXentry
    instrument: {
        @NX_class: NXinstrument
        detector: {
            @NX_class: NXdetector
            data: {
                @dtype: NX_FLOAT64
                @value: [1.0, 2.0, 3.0]
                @units: counts
                @description: Detector counts
            }
        }
    }
    sample: {
        @NX_class: NXsample
        temperature: {
            @dtype: NX_FLOAT64
            @value: temp_placeholder
            @units: K
            @description: Sample temperature
        }
    }
}
Then read it using the NeXusDefinition library:
from libraries.NeXusDefinition import NexusDefinitionReader
nexus_object = NexusDefinitionReader('example.nxd').read()
Modifying a nexus_object
You can modify a nexus_object by directly manipulating the dictionary:
# Add a new dataset
nexus_object['entry']['sample']['pressure'] = {
    '@dtype': 'NX_FLOAT64',
    '@value': 'pressure_placeholder',
    '@units': 'Pa',
    '@description': 'Sample pressure'
}
# Modify an existing dataset
nexus_object['entry']['sample']['temperature']['@value'] = 'new_temp_placeholder'
# Remove a dataset
del nexus_object['entry']['sample']['temperature']
Writing a nexus_object to a NeXus File
Use the NexusHDF5Writer to write a nexus_object to a NeXus file:
from libraries.NeXusHDF5 import NexusHDF5Writer
NexusHDF5Writer(nexus_object).write('output.nxs')
How It Works
- NeXusCreator.py provides the CLI and routes to either definition generation or conversion.
- NeXusCreatorClass.py orchestrates parsing, template expansion, injection, and writing. It:
  - Selects a parser via the plugin manager (preferred) or built-ins
  - Supports per-scan outputs for SPEC with a generated master file (external links)
  - Expands scan templates in .nxd using placeholders like scan{num}_ and @scan_template
- Libraries/NeXusHDF5.py contains NexusValueInjector and NexusHDF5Writer with memory-efficient chunked writing.
- Generators/ and Parsers/ include format-specific logic reused by plugins, with streaming parsers for large files.
See docs/architecture.md for a deeper dive into the data flow and components.
Chunked writing and streaming parsers make NeXusCreator suitable for processing large experimental datasets while maintaining data integrity and NeXus compliance.
Testing
make install # or: pip install -r requirements.txt
make test # or: pytest -q
- Run a specific file: pytest -q Tests/test_spec_parser_data_files.py
- If a plugin causes a Qt/shiboken import error, disable autoload: PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q
See Docs/tests.md for details on the test suite and troubleshooting.
Development
- Linting and formatting checks:
make install-dev # installs ruff + flake8
make lint # runs ruff and flake8 (non-fatal)
make compile # byte-compiles modules
Troubleshooting
- "ModuleNotFoundError: No module named 'h5py'": install dependencies with pip install h5py numpy.
- "no suitable generator/parser plugin found": verify file extension and beamline flag; see Plugins docs.
- Empty output or missing datasets: use -d to print the variable dictionary; check that .nxd placeholders match library keys.
- SPEC per-scan not splitting: ensure -f is provided and scans exist in the parsed dictionary.
- Too many lines in --dry-run output: add --summary-only to print only the final batch summary.
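For the "empty output or missing datasets" case, comparing placeholders against library keys can be automated. The sketch below lists every string @value in a nexus_object that has no matching key in the variable library. It is an illustration only: unresolved_placeholders is a hypothetical helper, and it cannot tell a placeholder from a literal string value, so its output is a list of candidates to review.

```python
from typing import Any, List, Mapping, Tuple


def unresolved_placeholders(
    node: Any, library: Mapping[str, Any], path: str = ""
) -> List[Tuple[str, str]]:
    """Return (dataset path, placeholder) pairs with no key in library."""
    missing: List[Tuple[str, str]] = []
    if not isinstance(node, dict):
        return missing
    for key, value in node.items():
        if key == "@value":
            # A string value is treated as a placeholder name (assumption).
            if isinstance(value, str) and value not in library:
                missing.append((path or "/", value))
        elif isinstance(value, dict):
            missing.extend(unresolved_placeholders(value, library, f"{path}/{key}"))
    return missing
```

The library argument would be the dictionary printed by -d, loaded or pasted into Python.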
File details
Details for the file nexuscreator-2.0.0.tar.gz.
File metadata
- Download URL: nexuscreator-2.0.0.tar.gz
- Upload date:
- Size: 145.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 031ef2f425bfb1fba400c2b0688ecf415fc4d32ba7b21a5945ae76e88e197b78 |
| MD5 | 1b6f7719ca1df31a7902cd8839c2d6c7 |
| BLAKE2b-256 | aa4f42f943106c6565de264bbca0dc3466ecaf4460de702c4458d0589ecf50e1 |
File details
Details for the file nexuscreator-2.0.0-py3-none-any.whl.
File metadata
- Download URL: nexuscreator-2.0.0-py3-none-any.whl
- Upload date:
- Size: 160.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 17e6c1ee1ff8122d663712e5a4e4da2e7e3a323582e0ab3f3b4ddc037ce34049 |
| MD5 | 4fdb0f14f73ee70707c4abf1b114d6dd |
| BLAKE2b-256 | 3a75ec7b69901c02d69e31afe9518a73aa65ef41d2c31013201b2732e3300401 |