Convert data + NeXus definitions to NeXus HDF5, with plugins

NeXusCreator

NeXusCreator converts experimental data and NeXus definition files into valid NeXus (.nxs) HDF5 files. It also supports generating NeXus-definition templates (.nxd) from input data. A lightweight plugin system powers both generation and parsing for formats like SPEC and DTA/DAT, including a batteries folder workflow.

Features

  • Conversion from .nxd + input data to .nxs (HDF5) files
  • Generation of .nxd from input (e.g., a SPEC file or a DTA/DAT folder)
  • Per-scan output for SPEC (-f) plus a master file with external links
  • Batteries workflow (folder of DTA/DAT files) with combined parsing
  • Temperature .dat variables written under /entry/experiments/temperature
  • Template expansion for scan groups in .nxd via @scan_template and placeholders
  • Optional punx validation after .nxs generation (--validate)
  • CSV export of dataset values from .nxs/HDF5 inputs
  • Extensible plugin system for generators and parsers
  • Chunked writing and streaming parsers for large datasets and multi-scan files

Installation

  • Python 3.9+
  • Recommended: a virtual environment

Install from PyPI:

python3 -m venv .venv && source .venv/bin/activate
pip install nexuscreator

Or install from source:

python3 -m venv .venv && source .venv/bin/activate
python3 -m pip install .

This installs console scripts:

  • nexuscreator (preferred)
  • nxc (alias)

To develop/run from source without installing, you can still invoke the CLI directly:

python3 -m nexuscreator -h

Quick Start

  1. Generate a NeXus-definition (.nxd) from a SPEC file:
nexuscreator -g out.nxd -i data.spec
  2. Generate a single-scan template from SPEC for later multi-scan conversion:
nexuscreator -g template.nxd -i data.spec -t
  3. Generate a .nxd from a batteries folder (DTA/DAT):
nexuscreator -g out.nxd -i /path/to/folder -b batteries

# Directory generation modes
# Single combined .nxd for entire folder
nexuscreator -g out.nxd -i /path/to/folder --single-file

# One .nxd per input file (.dta and .dat)
nexuscreator -g out_prefix_ -i /path/to/folder --multi-file
  4. Convert an operando EIS folder directly to NeXus (no .nxd required):
# Auto-generate and convert in one step
nexuscreator -i /path/to/eis_folder -b operando_eis -o eis.nxs

# Save the generated definition for reuse
nexuscreator -g eis.nxd -i /path/to/eis_folder -b operando_eis

# Generate a single reusable template (one entry per file_name)
nexuscreator -g eis_template.nxd -i /path/to/eis_folder -b operando_eis -t

The folder must contain Gamry .DTA / _Raw.DTA file pairs following the naming convention EIS_{CH|DIS}_#cycle_#file for operando measurements or EIS_{CH|DIS}-static_#file for static (open-circuit) measurements.

  5. Generate YAML instead of .nxd (with -g):
# Single file → YAML
nexuscreator -g out_dir/ -i data.spec --yaml

# Folder (multi-file) → YAML per input
nexuscreator -g out_dir/ -i /path/to/folder --multi-file --yaml

# Folder (single-file) → one YAML file
nexuscreator -g out_dir/ -i /path/to/folder --single-file --yaml
  6. Generate a .nxd from an existing HDF5/NeXus file:
nexuscreator -g out.nxd -i data.nxs --hdf5-option links   # external links (default)
nexuscreator -g out.nxd -i data.nxs --hdf5-option extract # placeholders + dictionary
  7. Convert using a NeXus definition and data input:
# Single output
nexuscreator -n def.nxd -i data.spec -o out.nxs

# One file per scan + master file with external links
nexuscreator -n def.nxd -i data.spec -o out.nxs -f

# DTA/DAT single file
nexuscreator -n def.nxd -i data.dta -o out.nxs

# Batteries folder
nexuscreator -n def.nxd -i /path/to/folder -b batteries -o out.nxs

Useful options, grouped by when they are relevant:

  • Always relevant: -i/--input, -o/--output_path, -b/--beamline_name
  • When converting (-n): -n/--nexus_definition, -f/--file_per_scan, -I/--icat_proposal_number, --auto-generate-nxd, --pair-dta-raw
  • When generating (-g): -g/--generate_nexus_definition, -t/--template, --single-file, --multi-file, --yaml, --hdf5-option
  • For directory/batch processing: -r/--recursive, --glob, --glob-spec, --glob-dta, --dry-run, --summary-only, --no-group-dta-folders
  • For metadata/schema placement: --metadata-csv, --jsonld-structure, --nxdl-root, --app-def, --export-vars-csv, -d/--dictionary, -D/--debug
  • For validation/CSV value export: --validate, --export-values-csv, --export-values-prefix, --csv-delimiter

--hdf5-option modes:

  • links (or 1): build definitions with external links to source HDF5/NeXus datasets (default for -g).
  • extract (or 2): extract datasets into variables/placeholders (default for -n).

See more detailed examples in docs/source/usage.md (or build the Sphinx docs for HTML output).

Citation

If you use NeXusCreator, cite the version you used. The Zenodo DOI currently tracked in this repository archives release 2.0.0:

Perez Ponce, H. (2026). NeXusCreator (2.0.0). Zenodo. https://doi.org/10.5281/zenodo.19600002

Machine-readable citation metadata is available in CITATION.cff.

Documentation

If sphinx-build is not available locally, install it via the developer requirements. After building the Sphinx docs, open _build/html/index.html in a browser to view the rendered documentation.

Command Reference (CLI)

Run python3 -m nexuscreator help or python3 -m nexuscreator -h for the latest usage.

Always relevant

  • help, -h, --help: show help
  • -v, --version: print version
  • --license: print full Apache-2.0 license text
  • --notice: print third-party notices
  • --list-beamlines: list accepted values for -b/--beamline_name
  • -i, --input PATH: input file or directory
  • -o, --output_path PATH: output file or directory
  • -b, --beamline_name NAME: beamline context (for example ikft, batteries, peaxis)

Relevant when converting to .nxs (-n)

  • -n, --nexus_definition FILE: convert using a NeXus definition
  • -f, --file_per_scan: for SPEC, write one .nxs per scan and a master
  • -I, --icat_proposal_number NUM: ICAT proposal subfolder in output
  • --auto-generate-nxd: with directory input, auto-generate .nxd per file then convert
  • --pair-dta-raw: for single .dta, combine with sibling *_raw.dta (or inverse)

Relevant when generating definitions (-g)

  • -g, --generate_nexus_definition FILE: build a .nxd from input
  • -t, --template: emit a single-scan template
  • --single-file: with directory input, generate one combined definition
  • --multi-file: with directory input, generate one definition per input file
  • --yaml: write YAML (.yaml) instead of .nxd
  • --hdf5-option MODE: links (default for -g) or extract (default for -n), accepts 1/2

Relevant for directory/batch processing

  • -r, --recursive: scan directories recursively
  • --glob PATTERN: filter scanned files (for example *.spec)
  • --glob-spec PATTERN: SPEC-only filter (overrides --glob for SPEC)
  • --glob-dta PATTERN: DTA/DAT-only filter (overrides --glob for DTA/DAT)
  • --dry-run: list matches and planned outputs without writing
  • --summary-only: with --dry-run, print only the summary line
  • --no-group-dta-folders: process DTA/DAT individually instead of per-folder grouping

Relevant for metadata/schema-guided placement

  • --metadata-csv FILE: enrich variables (variable_name, variable_description, units)
  • --jsonld-structure FILE: JSON-LD descriptor for parser/generator behavior
  • --nxdl-root PATH: path to NXDL definitions for placement guidance
  • --app-def NAME: preferred NXDL application (for example NXxas)
  • --export-vars-csv FILE: export parsed variables with descriptions/units to CSV
  • -d, --dictionary: print the parsed variable dictionary
  • -D, --debug: print the current .nxd line being processed

Relevant for validation and value export

  • --validate: run punx validation after writing .nxs
  • --export-values-csv FILE: export dataset values from .nxs/HDF5 to CSV
  • --export-values-prefix PATH: restrict exported datasets to a prefix (must start with /)
  • --csv-delimiter CHAR: override CSV delimiter (default ,)

You can also run the module directly if you haven’t installed the package:

python3 -m nexuscreator -h
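
For scripted batch runs, the flags in this reference can be assembled into an argument list before anything is executed. The following is a minimal sketch, not part of NeXusCreator: build_dry_run_command is a hypothetical helper, and it assumes the CLI accepts -g, -i, --multi-file, --glob, --recursive, --dry-run, and --summary-only together, which the reference lists but does not show combined.

```python
import shlex
import sys


def build_dry_run_command(input_dir, output_prefix, pattern=None,
                          recursive=False, summary_only=False):
    """Return an argv list for a -g dry run over a directory (nothing is executed).

    Hypothetical helper: only flags listed in the command reference are used.
    """
    argv = [sys.executable, "-m", "nexuscreator",
            "-g", output_prefix, "-i", input_dir, "--multi-file", "--dry-run"]
    if recursive:
        argv.append("--recursive")
    if pattern is not None:
        argv += ["--glob", pattern]
    if summary_only:
        argv.append("--summary-only")
    return argv


command = build_dry_run_command("/path/to/folder", "out_prefix_",
                                pattern="*.spec", recursive=True,
                                summary_only=True)
# Show the arguments as a shell-quoted string; the glob pattern is quoted
# so the shell does not expand it before the CLI sees it.
print(shlex.join(command[1:]))
```

Passing command to subprocess.run would execute it; keeping the arguments as a list avoids shell quoting problems with patterns such as *.spec.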

Internal Links in .nxd and YAML

You can declare internal links in NeXus definition (.nxd) files to point to an existing dataset elsewhere in the same file. Use the arrow syntax in .nxd:

scopeP: --> /entry/instrument/11_detector_chamber/scope/scopeP

When converting to HDF5, this becomes an internal soft link at scopeP pointing to the target dataset.

YAML representation uses a link field for the same construct:

scopeP:
  link: /entry/instrument/11_detector_chamber/scope/scopeP

Both forms are supported by the readers and the HDF5 writer.

External Links in .nxd and YAML

You can link to datasets in another NeXus/HDF5 file using the external link syntax in .nxd:

calibration_2:  --> ../calibration/rixsCucalcold_R0001.nxs | /entry/

This creates an HDF5 external link named calibration_2 pointing to the group /entry/ inside the target file.

YAML representation uses an external mapping with file and path keys:

calibration_2:
  external:
    file: ../calibration/rixsCucalcold_R0001.nxs
    path: /entry/

Both forms are supported by the YAML and .nxd readers and are written as HDF5 external links during conversion.
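
To make the correspondence between the arrow syntax and the YAML keys concrete, here is a small illustrative sketch. It is not the package's reader: parse_link_line is a hypothetical function that handles only single link lines, and it assumes that a "|" in the target always marks an external link.

```python
def parse_link_line(line):
    """Split one .nxd arrow line into (name, mapping) using the YAML-style keys.

    Hypothetical helper for illustration; the real .nxd reader may differ.
    """
    name_part, arrow, target = line.partition("-->")
    if not arrow:
        raise ValueError(f"not a link line: {line!r}")
    name = name_part.strip().rstrip(":").strip()
    target = target.strip()
    if "|" in target:
        # 'file | path' is the external-link form
        file_name, _, path = target.partition("|")
        return name, {"external": {"file": file_name.strip(), "path": path.strip()}}
    # A bare path is the internal-link form
    return name, {"link": target}


internal = parse_link_line(
    "scopeP: --> /entry/instrument/11_detector_chamber/scope/scopeP")
external = parse_link_line(
    "calibration_2:  --> ../calibration/rixsCucalcold_R0001.nxs | /entry/")
print(internal)
print(external)
```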

Plugins

NeXusCreator auto-discovers plugins under nexuscreator/plugins/:

  • Definition generators build .nxd objects for the -g flow
  • Data parsers create the variable library used for injection during conversion

Built-in plugins:

  • SPEC (plugins/spec_plugin.py): generator + parser
  • DTA/DAT (plugins/dta_plugin.py): generator + parser; supports RAW/temp/non-RAW and batteries folder
  • HDF5/NeXus (plugins/hdf5_plugin.py): generator + parser; generate .nxd by linking datasets in existing .nxs/.h5
  • TIF (plugins/tiff_plugin.py): generator + parser; reads .tif/.tiff and creates an image-centric definition

Write your own plugin by adding a module to plugins/ that subclasses the base interfaces; no registration is required. See docs/source/plugins.md for details and examples.

You can also load external plugin files/directories without editing this repository:

export NEXUSCREATOR_PLUGIN_PATHS="/path/to/my_plugins:/path/to/custom_plugin.py"

Set NEXUSCREATOR_PLUGIN_DEBUG=1 to print plugin import diagnostics.
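
When the CLI is launched from a Python script, the same two variables can be set on a copy of the environment. This is a sketch under assumptions: it uses ":" as the separator because that is what the shell example shows, and whether other platforms expect a different separator is not stated here.

```python
import os

# Example locations taken from the shell snippet above.
plugin_paths = ["/path/to/my_plugins", "/path/to/custom_plugin.py"]

# Copy the current environment so the parent process is left untouched.
env = dict(os.environ)
# Assumption: ':' is the separator, as in the documented export line.
env["NEXUSCREATOR_PLUGIN_PATHS"] = ":".join(plugin_paths)
env["NEXUSCREATOR_PLUGIN_DEBUG"] = "1"  # print plugin import diagnostics

# Pass `env` to subprocess.run([...], env=env) when starting the CLI.
print(env["NEXUSCREATOR_PLUGIN_PATHS"])
```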

Notes on directory generation with -g

  • Default for directory input is multi-file (one .nxd per supported input).
  • If -b peaxis is set for a directory, the default switches to single-file (combined) generation.
  • Use --single-file to produce one combined .nxd from a folder (batteries-style structure for DTA/DAT folders).
  • --single-file and --multi-file are mutually exclusive. The CLI will error if both are passed.
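
The rules above can be read as a small decision function. The sketch below is a model of the documented behaviour, not the CLI's code: resolve_directory_mode is a hypothetical name, and it assumes that an explicit --multi-file overrides the peaxis default, which the notes do not state.

```python
def resolve_directory_mode(beamline_name=None, single_file=False, multi_file=False):
    """Return 'single-file' or 'multi-file' for directory input with -g.

    Hypothetical model of the notes above, not the actual implementation.
    """
    if single_file and multi_file:
        # The CLI is documented to error when both flags are passed.
        raise ValueError("--single-file and --multi-file are mutually exclusive")
    if single_file:
        return "single-file"
    if multi_file:
        # Assumption: an explicit flag wins over the beamline default.
        return "multi-file"
    # No explicit flag: peaxis defaults to combined output, others to per-file.
    return "single-file" if beamline_name == "peaxis" else "multi-file"


print(resolve_directory_mode())
print(resolve_directory_mode(beamline_name="peaxis"))
print(resolve_directory_mode(beamline_name="batteries", single_file=True))
```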

Temperature data location:

  • For both single .dat inputs and batteries folders, temperature datasets are placed under /entry/experiments/temperature.

Further reading: docs/architecture.md, docs/usage.md.

Python API

Programmatic conversion using the package API:

from nexuscreator import NeXusCreator, create_nexus

# One-shot helper
out_path = create_nexus(nexus_definition_file="def.nxd",
                        input_path="data.spec",
                        output_path="out.nxs",
                        beamline_name="ikft")

# Lower-level: execute using flags
NeXusCreator().execute_conversion({
    'nexus_definition_file': 'def.nxd',
    'input_path': 'data.spec',
    'output_path': 'out.nxs',
    'beamline_name': 'ikft',
})

Advanced Usage

from nexuscreator import NeXusCreator

creator = NeXusCreator()

# Batch processing multiple files
files_to_process = [
    {'input': 'data1.spec', 'output': 'out1.nxs'},
    {'input': 'data2.spec', 'output': 'out2.nxs'},
    {'input': 'data3.spec', 'output': 'out3.nxs'}
]

for file_info in files_to_process:
    creator.execute_conversion({
        'nexus_definition_file': 'template.nxd',
        'input_path': file_info['input'],
        'output_path': file_info['output'],
        'beamline_name': 'ikft'
    })
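
Rather than typing the files_to_process list by hand, it can be derived from a directory listing. This is a minimal sketch using only the standard library: build_jobs is a hypothetical helper, and the choice to name each output after the input's stem with a .nxs suffix is an assumption, not a NeXusCreator convention.

```python
import tempfile
from pathlib import Path


def build_jobs(input_dir, output_dir, pattern="*.spec"):
    """List one {'input', 'output'} dict per matching file, sorted by name."""
    jobs = []
    for source in sorted(Path(input_dir).glob(pattern)):
        # Assumed naming scheme: data1.spec -> <output_dir>/data1.nxs
        target = Path(output_dir) / (source.stem + ".nxs")
        jobs.append({"input": str(source), "output": str(target)})
    return jobs


# Demonstrate on a throwaway directory with two SPEC files and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.spec", "a.spec", "notes.txt"):
        (Path(tmp) / name).touch()
    jobs = build_jobs(tmp, Path(tmp) / "out")

print([Path(job["output"]).name for job in jobs])
```

The resulting dictionaries use the same 'input' and 'output' keys as the loop above, so they can be fed to it unchanged.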

Performance Tips

  1. Reuse a prepared .nxd when converting many similar inputs.
  2. Batch related files together to reduce setup overhead.
  3. Use --dry-run first for large directory inputs to confirm file matching.
  4. Enable --validate only when you need NXDL validation, since it adds work after writing.

NeXus Definition Syntax

A nexus_object is a nested dictionary that defines the structure of a NeXus file. It uses specific keys and conventions to represent NeXus groups, datasets, attributes, and links.

Basic Structure

A nexus_object typically starts with an entry group, which is the top-level group of the NeXus file; the root-level @default attribute names it:

nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        # Additional groups and datasets go here
    }
}

Groups

Groups are represented as nested dictionaries. The @NX_class key specifies the NeXus class of the group:

nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'instrument': {
            '@NX_class': 'NXinstrument',
            # Instrument-specific groups and datasets
        },
        'sample': {
            '@NX_class': 'NXsample',
            # Sample-specific groups and datasets
        }
    }
}

Datasets

Datasets are represented as dictionaries with specific keys:

  • @dtype: The data type (e.g., NX_FLOAT64, NX_INT32).
  • @value: The value of the dataset. This can be a direct value or a placeholder.
  • @units: The units of the dataset (optional).
  • @description: A description of the dataset (optional).

nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'temperature': {
            '@dtype': 'NX_FLOAT64',
            '@value': 298.15,
            '@units': 'K',
            '@description': 'Sample temperature'
        }
    }
}

Placeholders

Placeholders are used to inject values from the variable library (built by the data parser) during conversion. The @value key specifies the placeholder name:

nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'temperature': {
            '@dtype': 'NX_FLOAT64',
            '@value': 'temp_value',
            '@units': 'K'
        }
    }
}
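
The injection step can be pictured as a recursive walk over the dictionary. The sketch below is an illustration only and not the package's NexusValueInjector: inject_values is a hypothetical function, and it assumes that every string @value found in the library is a placeholder while anything else is left as a literal.

```python
import copy


def inject_values(node, library):
    """Return a copy of `node` with string '@value' entries looked up in `library`.

    Hypothetical sketch; the real injector may apply further rules.
    """
    result = copy.deepcopy(node)  # leave the caller's definition untouched

    def visit(current):
        for key, value in current.items():
            if isinstance(value, dict):
                visit(value)
            elif key == "@value" and isinstance(value, str) and value in library:
                # Overwriting an existing key is safe while iterating.
                current[key] = library[value]

    visit(result)
    return result


nexus_object = {
    "entry": {
        "@NX_class": "NXentry",
        "temperature": {"@dtype": "NX_FLOAT64", "@value": "temp_value", "@units": "K"},
    }
}
filled = inject_values(nexus_object, {"temp_value": 298.15})
print(filled["entry"]["temperature"]["@value"])
```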

Links

Links allow you to reference datasets in the same file (internal links) or in another file (external links).

Internal Links

Internal links use the link key to point to another dataset in the same file:

nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'data': {
            '@NX_class': 'NXdata',
            'signal': {
                '@dtype': 'NX_FLOAT64',
                '@value': [1.0, 2.0, 3.0]
            },
            'signal_link': {
                'link': '/entry/data/signal'
            }
        }
    }
}

External Links

External links use the external key to point to a dataset or group in another file:

nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'calibration': {
            'external': {
                'file': 'calibration.nxs',
                'path': '/entry/calibration/data'
            }
        }
    }
}

Attributes

Attributes are represented as keys starting with @ in the dataset or group dictionary:

nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        '@default': 'data',
        'data': {
            '@NX_class': 'NXdata',
            '@signal': 'signal',
            '@axes': 'axes',
            'signal': {
                '@dtype': 'NX_FLOAT64',
                '@value': [1.0, 2.0, 3.0],
                '@units': 'counts',
                '@description': 'Measurement signal'
            }
        }
    }
}
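
Taken together, the conventions above let each child of a nexus_object be classified by the keys it carries. The following sketch is one possible reading of those conventions rather than the writer's actual logic: describe_nodes is a hypothetical function, and it assumes that a dictionary without @NX_class, link, or external is a dataset.

```python
def describe_nodes(node, path=""):
    """Yield (path, kind) pairs for every non-attribute child, depth first.

    Hypothetical helper based on the key conventions described above.
    """
    for key, value in node.items():
        if key.startswith("@") or not isinstance(value, dict):
            continue  # '@...' keys are attributes, not nodes
        child_path = f"{path}/{key}"
        if "@NX_class" in value:
            kind = "group"
        elif "link" in value:
            kind = "internal link"
        elif "external" in value:
            kind = "external link"
        else:
            kind = "dataset"  # assumption: everything else is a dataset
        yield child_path, kind
        if kind == "group":
            yield from describe_nodes(value, child_path)


nexus_object = {
    "entry": {
        "@NX_class": "NXentry",
        "data": {
            "@NX_class": "NXdata",
            "signal": {"@dtype": "NX_FLOAT64", "@value": [1.0, 2.0, 3.0]},
            "signal_link": {"link": "/entry/data/signal"},
        },
        "calibration": {
            "external": {"file": "calibration.nxs", "path": "/entry/calibration/data"}
        },
    }
}
nodes = list(describe_nodes(nexus_object))
for node_path, kind in nodes:
    print(f"{node_path}: {kind}")
```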

Creating a nexus_object

To create a nexus_object, you can either:

  1. Write a .nxd file and use the NeXusDefinition library to read it.
  2. Directly construct the dictionary in Python.

Example: Direct Construction

nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        'instrument': {
            '@NX_class': 'NXinstrument',
            'detector': {
                '@NX_class': 'NXdetector',
                'data': {
                    '@dtype': 'NX_FLOAT64',
                    '@value': [1.0, 2.0, 3.0],
                    '@units': 'counts',
                    '@description': 'Detector counts'
                }
            }
        },
        'sample': {
            '@NX_class': 'NXsample',
            'temperature': {
                '@dtype': 'NX_FLOAT64',
                '@value': 'temp_placeholder',
                '@units': 'K',
                '@description': 'Sample temperature'
            }
        }
    }
}

Example: Using a .nxd File

Create a file named example.nxd:

@default: entry
entry: {
    @NX_class: NXentry
    instrument: {
        @NX_class: NXinstrument
        detector: {
            @NX_class: NXdetector
            data: {
                @dtype: NX_FLOAT64
                @value: [1.0, 2.0, 3.0]
                @units: counts
                @description: Detector counts
            }
        }
    }
    sample: {
        @NX_class: NXsample
        temperature: {
            @dtype: NX_FLOAT64
            @value: temp_placeholder
            @units: K
            @description: Sample temperature
        }
    }
}

Then read it using the NeXusDefinition library:

from nexuscreator.libraries.NeXusDefinition import NexusDefinitionReader

nexus_object = NexusDefinitionReader('example.nxd').read()

Modifying a nexus_object

You can modify a nexus_object by directly manipulating the dictionary:

# Add a new dataset
nexus_object['entry']['sample']['pressure'] = {
    '@dtype': 'NX_FLOAT64',
    '@value': 'pressure_placeholder',
    '@units': 'Pa',
    '@description': 'Sample pressure'
}

# Modify an existing dataset
nexus_object['entry']['sample']['temperature']['@value'] = 'new_temp_placeholder'

# Remove a dataset
del nexus_object['entry']['sample']['temperature']

Writing a nexus_object to a NeXus File

Use the NexusHDF5Writer to write a nexus_object to a NeXus file:

from nexuscreator.libraries.NeXusHDF5 import NexusHDF5Writer

NexusHDF5Writer(nexus_object).write('output.nxs')

How It Works

  • nexuscreator/cli.py provides the CLI and routes to either definition generation or conversion.
  • nexuscreator/creator.py orchestrates parsing, template expansion, injection, and writing. It:
    • Selects a parser via the plugin manager (preferred) or built-ins
    • Supports per-scan outputs for SPEC with a generated master file (external links)
    • Expands scan templates in .nxd using placeholders like scan{num}_ and @scan_template
  • nexuscreator/libraries/NeXusHDF5.py contains NexusValueInjector and NexusHDF5Writer with memory-efficient chunked writing.
  • nexuscreator/generators/ and nexuscreator/parsers/ include format-specific logic reused by plugins, with streaming parsers for large files.

See docs/architecture.md for a deeper dive into the data flow and components.

Chunked writing and streaming parsers are intended to make NeXusCreator suitable for processing large experimental datasets while maintaining data integrity and NeXus compliance.

Testing

make install  # or: pip install -r requirements.txt
make test     # or: pytest -q

  • Run a specific file: pytest -q tests/test_spec_parser_data_files.py
  • If a plugin causes a Qt/shiboken import error, disable autoload: PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q

See docs/source/tests.md for details on the test suite and troubleshooting.

Development

  • Linting and formatting checks:
make install-dev  # installs ruff + flake8
make lint         # runs ruff and flake8 (non-fatal)
make compile      # byte-compiles modules

Troubleshooting

  • ModuleNotFoundError: No module named 'h5py' — Install dependencies: pip install h5py numpy.
  • “no suitable generator/parser plugin found” — Verify file extension and beamline flag; see Plugins docs.
  • Empty output or missing datasets — Use -d to print the variable dictionary; check that .nxd placeholders match library keys.
  • SPEC per-scan not splitting — Ensure -f is provided and scans exist in the parsed dictionary.
  • Too many lines in --dry-run output — Add --summary-only to print only the final batch summary.
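
For the "missing datasets" case, comparing placeholders against library keys can be automated. This is a rough sketch and not a NeXusCreator feature: find_missing_placeholders is a hypothetical function, and it assumes every string @value is a placeholder name, so literal string values would be reported as well.

```python
def find_missing_placeholders(node, library, path=""):
    """Return (path, placeholder) pairs whose string '@value' is not in `library`.

    Hypothetical diagnostic; string literals are indistinguishable from
    placeholders here and will also be listed.
    """
    missing = []
    for key, value in node.items():
        if isinstance(value, dict):
            missing += find_missing_placeholders(value, library, f"{path}/{key}")
        elif key == "@value" and isinstance(value, str) and value not in library:
            missing.append((path, value))
    return missing


definition = {
    "entry": {
        "@NX_class": "NXentry",
        "sample": {
            "@NX_class": "NXsample",
            "temperature": {"@dtype": "NX_FLOAT64", "@value": "temp_placeholder"},
            "pressure": {"@dtype": "NX_FLOAT64", "@value": "pressure_placeholder"},
        },
    }
}
# Stand-in for the dictionary printed by -d.
library = {"temp_placeholder": 298.15}
missing = find_missing_placeholders(definition, library)
for dataset_path, placeholder in missing:
    print(f"{dataset_path}: no library key named {placeholder!r}")
```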
