Reproducible ML-ready thermodynamic fluid-property datasets from configurable property backends.

Carnopy

Alpha software: public interfaces and generated schemas may still change before the stable 0.1.0 release.

Carnopy is a CLI-first Python package for generating reproducible, backend-derived thermophysical datasets for machine-learning, surrogate-model, and engineering workflows.

Carnopy is not a thermodynamic property model. It orchestrates configured property backends, validates deterministic sampling, preserves failed states as diagnostics, and emits stable tabular data with provenance. Generated values are synthetic backend output, not experimental data or backend-independent ground truth.

Milestone 1 supports pure fluids through CoolProp and three modes:

  • property_table: temperature-pressure state tables;
  • saturation_table: saturated-liquid and saturated-vapor endpoint rows;
  • vapor_mass_fraction_table: two-phase states over vapor mass fraction.

Installation

After 0.1.0a1 is published to PyPI:

python -m pip install "carnopy==0.1.0a1"

Install optional plotting support:

python -m pip install "carnopy[all]==0.1.0a1"

For an isolated CLI:

uv tool install "carnopy==0.1.0a1"
uv tool install "carnopy[all]==0.1.0a1"

The base package supports generation and validation. The viz and all extras install Matplotlib for manual or configured figure generation. PyArrow remains a core dependency because Parquet is a supported first-class output format.

For repository development:

uv sync --locked --extra all --group dev
uv run --locked carnopy --help

Quick start

The normal workflow is:

init → edit → optional validate → generate → inspect → optional plot

Create a starter configuration:

carnopy init property_table my-dataset.yaml

init reads the selected template packaged inside the installed carnopy module and writes a new file at the path you provide. For example, when the current directory is /home/cfd/carnopy/:

carnopy init property_table property.yaml

creates:

/home/cfd/carnopy/property.yaml

from the packaged property_table.yaml template. It does not modify or move the packaged template, and it refuses to overwrite an existing property.yaml. A relative output path is resolved from the current working directory; an absolute path is written exactly where specified.

Available modes:

property_table
saturation_table
vapor_mass_fraction_table

Discover backend fluids and semantic properties:

carnopy fluids
carnopy properties

Edit the YAML, optionally validate it, then generate an immutable run:

carnopy validate my-dataset.yaml
carnopy generate my-dataset.yaml

generate validates automatically. The separate validate command is useful for scripts and early feedback, but does not evaluate thermodynamic rows.

After generation, inspect the run before choosing a plot:

carnopy inspect outputs/<run>

The inspect output lists fluids, sampling levels, emitted properties, compatible plot kinds, and copyable commands.

To choose a different output root:

carnopy generate \
  configs/cyclopentane_vapor_fraction_pressure.yaml \
  --out outputs/manual-test

The run is created directly under that root. Copy the exact path printed after the "Output directory:" label; do not prepend the output root again:

# Example only; replace this with the exact path printed by your run.
RUN_DIR="outputs/manual-test/20260621T172006Z_vapor_fraction_c8e28e9f"

Run names use UTC creation time, a short mode label, and the first eight hexadecimal characters of the unique run_id. Full identities and hashes remain in metadata.json.
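
As an illustration of the naming scheme, a run directory name can be split back into its three parts. This is a minimal sketch and not part of Carnopy: the `parse_run_name` helper and the regular expression are hypothetical, and the timestamp layout is inferred only from the example name above.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern inferred from the example run name; not a Carnopy API.
RUN_NAME = re.compile(
    r"^(?P<stamp>\d{8}T\d{6}Z)_(?P<label>.+)_(?P<short_id>[0-9a-f]{8})$"
)


def parse_run_name(name: str) -> dict:
    """Split a run directory name into creation time, mode label, and ID prefix."""
    match = RUN_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised run name: {name!r}")
    created = datetime.strptime(match["stamp"], "%Y%m%dT%H%M%SZ")
    return {
        "created_utc": created.replace(tzinfo=timezone.utc),
        "mode_label": match["label"],
        "run_id_prefix": match["short_id"],
    }


parts = parse_run_name("20260621T172006Z_vapor_fraction_c8e28e9f")
print(parts["mode_label"], parts["run_id_prefix"])
```

The eight-character prefix is only a convenience for humans; the full run_id in metadata.json remains the identity to rely on.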

Use command-specific help for the complete current interface:

carnopy --help
carnopy generate --help
carnopy plot --help

Configuration

Schema version 1 requires:

schema_version: 1
backend: coolprop
mode: property_table
fluids: [Propane]

grid:
  temperature:
    kind: linspace
    start: 20
    stop: 100
    num: 5
    unit: degC
  pressure:
    kind: linspace
    start: 1
    stop: 20
    num: 5
    unit: bar

properties:
  - specific_enthalpy
  - mass_density

outputs:
  # Omit this section to keep the same default.
  dataset_formats: [csv, parquet]

Modes

property_table requires temperature and pressure and generates their Cartesian product for every selected fluid.
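
The Cartesian product can be pictured with a few lines of standard-library Python. This is a sketch under stated assumptions, not Carnopy's implementation: the local `linspace` helper is hypothetical, and the row ordering (fluid, then temperature, then pressure) is an assumption chosen for illustration. The grid values mirror the configuration example above.

```python
from itertools import product


def linspace(start: float, stop: float, num: int) -> list[float]:
    """Evenly spaced values including both endpoints (illustrative helper)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


fluids = ["Propane"]
# Convert the declared units to SI before building states.
temperatures_K = [t + 273.15 for t in linspace(20.0, 100.0, 5)]  # degC -> K
pressures_Pa = [p * 1.0e5 for p in linspace(1.0, 20.0, 5)]       # bar -> Pa

# One state per (fluid, temperature, pressure) combination.
states = list(product(fluids, temperatures_K, pressures_Pa))
print(len(states))  # 1 fluid x 5 temperatures x 5 pressures = 25
```

Adding a second fluid doubles the row count, which is why the projected-row limit described under Units is checked before any backend call.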

saturation_table requires exactly one of temperature or pressure. It computes the missing saturation coordinate and emits separate saturated-liquid and saturated-vapor rows.

vapor_mass_fraction_table requires vapor mass fraction plus exactly one of temperature or pressure. Vapor mass fraction is vapor mass divided by total vapor-plus-liquid mass. Carnopy denotes it by $x_{\mathrm{vap}}$ in figures and scientific equations while keeping the explicit public field name vapor_mass_fraction. CoolProp's Q name remains internal to the adapter.

For a pure fluid at fixed saturation temperature or pressure:

  • $x_{\mathrm{vap}}=0$ is the saturated-liquid boundary;
  • $x_{\mathrm{vap}}=1$ is the saturated-vapor boundary;
  • $0<x_{\mathrm{vap}}<1$ is an equilibrium two-phase mixture state.

The endpoint states have definite backend properties. Near-endpoint values such as 0.01 and 0.99 are interior mixture states; they supplement rather than replace the boundaries. For specific enthalpy and specific volume:

$$h(x_{\mathrm{vap}}) = (1-x_{\mathrm{vap}})\,h_f + x_{\mathrm{vap}}\,h_g$$

$$\frac{1}{\rho(x_{\mathrm{vap}})} = \frac{1-x_{\mathrm{vap}}}{\rho_f} + \frac{x_{\mathrm{vap}}}{\rho_g}$$

See the CoolProp high-level saturation documentation for the backend definition of the endpoint states.
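
The two mixing relations can be checked numerically with a short sketch. The function names are hypothetical and the endpoint values are round illustrative numbers, not CoolProp output; Carnopy itself obtains two-phase values from the backend rather than from this code.

```python
def mixture_specific_enthalpy(x_vap: float, h_f: float, h_g: float) -> float:
    """Mass-weighted enthalpy of an equilibrium two-phase mixture, in J/kg."""
    return (1.0 - x_vap) * h_f + x_vap * h_g


def mixture_mass_density(x_vap: float, rho_f: float, rho_g: float) -> float:
    """Density from the mass-weighted specific volume, in kg/m^3."""
    specific_volume = (1.0 - x_vap) / rho_f + x_vap / rho_g
    return 1.0 / specific_volume


# Illustrative endpoint values only (not backend data).
h_f, h_g = 200_000.0, 500_000.0   # J/kg
rho_f, rho_g = 500.0, 20.0        # kg/m^3

for x_vap in (0.0, 0.01, 0.5, 0.99, 1.0):
    h = mixture_specific_enthalpy(x_vap, h_f, h_g)
    rho = mixture_mass_density(x_vap, rho_f, rho_g)
    print(f"x_vap={x_vap:4.2f}  h={h:10.1f} J/kg  rho={rho:8.3f} kg/m^3")
```

Enthalpy varies linearly with vapor mass fraction, while density does not: specific volume is the quantity that mixes linearly, so density falls steeply as soon as a small mass fraction of vapor appears.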

Samplers

| Sampler | Parameters | Behavior |
| --- | --- | --- |
| explicit | values | Preserves declared order; values must be finite and unique after SI conversion. |
| linspace | start, stop, num | Includes both endpoints; supports ascending and descending ranges. |
| stepspace | start, stop, step | Includes both endpoints; the endpoint must be reachable. |
| geomspace | start, stop, num | Positive physical endpoints; supports either direction. |
| logspace | start_exp, stop_exp, num, optional base | Samples exponent space; base must exceed one. |

Equal sampler bounds are rejected; use explicit for one value. Geometric and logarithmic sampling are not supported for offset Celsius values or vapor mass fraction. Use Kelvin for geometric temperature grids.

linspace uses uniform increments. For example, start: 1, stop: 5, and num: 5 produce 1, 2, 3, 4, 5. geomspace uses uniform ratios and produces approximately 1, 1.495, 2.236, 3.344, 5 for the same bounds.
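
The sampler rules can be reproduced with a small standard-library sketch. These helpers are hypothetical re-implementations for illustration, not Carnopy's code; in particular, the tolerance used to decide whether a stepspace endpoint is "reachable" is an assumption.

```python
import math


def linspace(start: float, stop: float, num: int) -> list[float]:
    """Uniform increments, both endpoints included."""
    if num < 2 or start == stop:
        raise ValueError("need at least two points and distinct bounds")
    step = (stop - start) / (num - 1)
    values = [start + i * step for i in range(num)]
    values[-1] = stop  # pin the endpoint against rounding drift
    return values


def geomspace(start: float, stop: float, num: int) -> list[float]:
    """Uniform ratios between positive endpoints, in either direction."""
    if start <= 0 or stop <= 0 or num < 2 or start == stop:
        raise ValueError("geomspace needs distinct positive bounds")
    ratio = (stop / start) ** (1.0 / (num - 1))
    values = [start * ratio**i for i in range(num)]
    values[-1] = stop
    return values


def logspace(start_exp: float, stop_exp: float, num: int, base: float = 10.0) -> list[float]:
    """Sample exponents uniformly, then raise the base to each exponent."""
    if base <= 1:
        raise ValueError("base must exceed one")
    return [base**e for e in linspace(start_exp, stop_exp, num)]


def stepspace(start: float, stop: float, step: float, rel_tol: float = 1e-9) -> list[float]:
    """Fixed step; rejects a stop that is not a whole number of steps away."""
    if step == 0:
        raise ValueError("step must be non-zero")
    count = (stop - start) / step
    nearest = round(count)
    if nearest < 1 or not math.isclose(count, nearest, rel_tol=rel_tol):
        raise ValueError("stop is not reachable from start with this step")
    return [start + i * step for i in range(nearest)] + [stop]


print(linspace(1, 5, 5))
print([round(v, 3) for v in geomspace(1, 5, 5)])
```

With the same bounds, the geometric ratio is 5^(1/4) ≈ 1.495, which is where the rounded values 1.495, 2.236, and 3.344 quoted above come from.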

Dataset formats

Select generated table formats independently of the scientific specification:

outputs:
  dataset_formats: [csv]

Supported values are csv and parquet. At least one is required. Omitting outputs preserves the default [csv, parquet]. Format selection changes the artifact-generation context and output_request_id, but not spec_id or config.normalized.json.

Units

Supported input units:

temperature: K, degC
pressure: Pa, kPa, MPa, bar
vapor_mass_fraction: "1"

All backend calls and generated numeric columns use SI. Original units and sampler definitions remain recorded in metadata.

Validation rejects non-finite values, non-positive pressure, temperatures at or below absolute zero, vapor mass fractions outside [0, 1], incompatible units, duplicate canonical fluids, and projected runs above 1,000,000 rows.

Validation proves that a configuration is structurally executable. It does not promise that every fluid, state, phase, and requested property will be valid.

Properties

Use carnopy properties for the authoritative installed registry.

| Semantic name | Dataset column | Classification |
| --- | --- | --- |
| specific_enthalpy | specific_enthalpy_J_kg | backend-provided, reference-dependent |
| specific_entropy | specific_entropy_J_kgK | backend-provided, reference-dependent |
| specific_internal_energy | specific_internal_energy_J_kg | backend-provided, reference-dependent |
| mass_density | mass_density_kg_m3 | backend-provided |
| isobaric_specific_heat_capacity | isobaric_specific_heat_capacity_J_kgK | backend-provided |
| isochoric_specific_heat_capacity | isochoric_specific_heat_capacity_J_kgK | backend-provided |
| dynamic_viscosity | dynamic_viscosity_Pa_s | backend-provided |
| kinematic_viscosity | kinematic_viscosity_m2_s | derived from viscosity and density |
| thermal_conductivity | thermal_conductivity_W_mK | backend-provided |
| prandtl_number | prandtl_number | backend-provided |
| speed_of_sound | speed_of_sound_m_s | backend-provided |
| molar_mass | molar_mass_kg_mol | fluid constant |
| critical_temperature | critical_temperature_K | fluid constant |
| critical_pressure | critical_pressure_Pa | fluid constant |
| triple_point_temperature | triple_point_temperature_K | fluid constant |
| surface_tension | surface_tension_N_m | mode/region limited |

Derived dependencies may be evaluated internally without being emitted unless explicitly requested. Fluid constants may be repeated in rows and are also summarized in metadata.

Milestone 1 uses strict row validity: failure of any required coordinate, phase, or requested property makes the row invalid. Successfully evaluated values may remain populated while failed values remain null. Requesting a mode-limited property such as surface_tension over a broad state grid can therefore invalidate otherwise usable rows.
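
The effect of strict row validity can be illustrated with plain dictionaries. This is a sketch, not Carnopy's implementation: the `row_is_valid` helper is hypothetical, the numeric values are made up for illustration, and missing values are represented as `None`.

```python
def row_is_valid(row: dict, requested_columns: list[str]) -> bool:
    """Strict validity: every requested column must hold a value."""
    return all(row.get(column) is not None for column in requested_columns)


# Illustrative rows only; the second state has no surface tension value.
rows = [
    {"mass_density_kg_m3": 480.0, "surface_tension_N_m": 0.007},
    {"mass_density_kg_m3": 1.9, "surface_tension_N_m": None},
]

density_only = ["mass_density_kg_m3"]
with_surface_tension = ["mass_density_kg_m3", "surface_tension_N_m"]

print([row_is_valid(row, density_only) for row in rows])
print([row_is_valid(row, with_surface_tension) for row in rows])
```

The density value of the second row is still populated in both cases; only its validity changes once surface_tension is added to the request. Generating a separate run for mode-limited properties keeps broad grids usable.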

Visualization

Visualization is a reproducible view of emitted dataset columns:

  • it never calls CoolProp or another thermodynamic backend;
  • it never smooths, interpolates, extrapolates, or invents states;
  • it preserves invalid and missing gaps;
  • it retains markers at emitted samples;
  • its identity is separate from scientific dataset identity.

Install carnopy[all] or carnopy[viz] before plotting.

Manual plotting

Supported plot kinds:

property-curves
property-heatmap
xy
pv
ts

Property curves use discrete, colorblind-safe series colors and markers. For property_table, choose the x-axis explicitly:

carnopy plot outputs/<property-run> \
  --kind property-curves \
  --property mass_density \
  --x temperature

Carnopy connects adjacent valid emitted samples with straight line segments as visual guides. It does not smooth or evaluate intermediate states. A sparse series advisory is emitted for connected series with five or fewer samples. Generate a denser source grid for finer thermodynamic resolution. Use SVG or PDF for zoom-independent rendering:

carnopy plot outputs/<run> ... --output figures/plot.svg
carnopy plot outputs/<run> ... --output figures/plot.pdf

For vapor_mass_fraction_table, vapor mass fraction is the x-axis and the sampled saturation pressure or temperature defines the series:

carnopy plot "$RUN_DIR" \
  --kind property-curves \
  --property mass_density \
  --value-scale linear \
  --show

Sampled heatmaps use flat, non-interpolated cells and require at least two unique values on each axis:

carnopy plot "$RUN_DIR" \
  --kind property-heatmap \
  --property specific_enthalpy \
  --color-scale linear

saturation_table does not support property heatmaps because it contains only the two endpoint branches.

Generic x-y plots use numeric semantic fields from emitted columns:

carnopy plot outputs/<property-run> \
  --kind xy \
  --x specific_enthalpy \
  --y specific_entropy \
  --group-by pressure

If more than one independent sampling coordinate remains, --group-by must resolve the ambiguity. Carnopy does not apply hidden grouping precedence.

Conventional thermodynamic diagrams are derived only from emitted columns:

carnopy plot outputs/<run-with-density> --kind pv
carnopy plot outputs/<run-with-entropy> --kind ts

The p-v diagram uses:

specific_volume = 1 / mass_density

The T-s diagram uses emitted entropy and temperature and requires recorded reference-state metadata. Neither command fabricates a saturation dome, critical point, or missing branch.

Exact filters use canonical SI values and never select a nearest neighbor:

carnopy plot "$RUN_DIR" \
  --kind property-curves \
  --property mass_density \
  --filter pressure=200000

Repeat --filter to combine filters with logical AND. Current filter fields are temperature, pressure, vapor mass fraction, phase, and saturation endpoint. Repeat --fluid to select multiple fluids; each fluid receives its own facet.
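
Exact, AND-combined filtering can be sketched in a few lines. The `apply_exact_filters` helper and the dictionary keys are hypothetical stand-ins for emitted columns; the point is only that a filter value that matches no emitted sample selects nothing, because no nearest neighbor is chosen.

```python
def apply_exact_filters(rows: list[dict], filters: dict) -> list[dict]:
    """Keep rows whose fields equal every filter value exactly (logical AND)."""
    return [
        row for row in rows
        if all(row.get(field) == value for field, value in filters.items())
    ]


# Illustrative rows using canonical SI values.
rows = [
    {"pressure": 100000.0, "phase": "gas", "temperature": 300.0},
    {"pressure": 200000.0, "phase": "gas", "temperature": 300.0},
    {"pressure": 200000.0, "phase": "liquid", "temperature": 250.0},
]

print(len(apply_exact_filters(rows, {"pressure": 200000.0})))                   # 2
print(len(apply_exact_filters(rows, {"pressure": 200000.0, "phase": "gas"})))   # 1
print(len(apply_exact_filters(rows, {"pressure": 199999.0})))                   # 0
```

An empty selection is therefore a signal to check the sampled values in the inspect output, not a reason to loosen the filter.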

SOURCE may be a run directory, CSV, or Parquet file. Run directories prefer Parquet and verify it against metadata.json. Standalone saturation and vapor-mass-fraction files may require --saturation-coordinate pressure or --saturation-coordinate temperature.

Every export writes an image plus a .plot.json provenance sidecar under figures/ by default. Existing image or sidecar paths are refused. Finalization uses exclusive same-filesystem hard links: it is no-overwrite-safe, but the two-file pair is not fully crash-atomic.
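
The no-overwrite property of hard-link finalization can be demonstrated for a single file. This is a sketch of the general technique using `os.link`, assuming a POSIX-style filesystem that supports hard links; the `publish_exclusive` helper is hypothetical and is not Carnopy's code.

```python
import os
import tempfile
from pathlib import Path


def publish_exclusive(data: bytes, final_path: Path) -> None:
    """Write data to a staged file, then hard-link it to final_path.

    os.link raises FileExistsError when final_path already exists, so an
    existing file is never replaced.
    """
    final_path.parent.mkdir(parents=True, exist_ok=True)
    # Stage in the same directory so the link stays on one filesystem.
    fd, staged = tempfile.mkstemp(dir=final_path.parent, suffix=".staged")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.link(staged, final_path)
    finally:
        os.unlink(staged)  # the staged name is removed on success and failure


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "figures" / "plot.svg"
    publish_exclusive(b"first", target)
    try:
        publish_exclusive(b"second", target)
    except FileExistsError:
        print("refused to overwrite", target.name)
    print(target.read_bytes())
```

Each link is individually exclusive, but an image and its sidecar are two separate links. A crash between them can leave one file without the other, which is the limitation noted above.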

Configured visualization

An optional top-level visualization section generates figures after the immutable dataset run is finalized:

visualization:
  format: png
  fluids: [Propane]

  plots:
    - name: density-vs-temperature
      kind: property_curves
      property: mass_density
      x: temperature
      value_scale: linear

    - name: density-map
      kind: property_heatmap
      property: mass_density
      color_scale: log

    - name: enthalpy-entropy
      kind: xy
      x: specific_enthalpy
      y: specific_entropy
      group_by: pressure

    - name: pressure-specific-volume
      kind: pv

    - name: temperature-entropy
      kind: ts

Supported formats are png, pdf, and svg. Per-plot format and fluids replace their shared values; scales are selected per plot. Per-plot filters are AND-merged with shared filters, and conflicting values for the same field are rejected. Plot names must be unique safe filename slugs. Output paths and interactive display are intentionally not stored in YAML.

Shared or per-plot exact filters use YAML mappings:

visualization:
  filters:
    phase: gas
  plots:
    - name: gas-density
      kind: property_curves
      property: mass_density
      x: temperature
      filters:
        pressure: 100000
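
The merge rule for shared and per-plot filters can be sketched as a dictionary merge that refuses conflicts. The `merge_filters` helper is hypothetical and only mirrors the behavior described above; it is not Carnopy's implementation.

```python
def merge_filters(shared: dict, per_plot: dict) -> dict:
    """AND-merge two exact-filter mappings, rejecting conflicting values."""
    merged = dict(shared)
    for field, value in per_plot.items():
        if field in merged and merged[field] != value:
            raise ValueError(
                f"conflicting filter for {field!r}: {merged[field]!r} vs {value!r}"
            )
        merged[field] = value
    return merged


# Mirrors the YAML above: shared phase filter plus a per-plot pressure filter.
print(merge_filters({"phase": "gas"}, {"pressure": 100000}))

try:
    merge_filters({"phase": "gas"}, {"phase": "liquid"})
except ValueError as error:
    print(error)
```

Rejecting the conflict, instead of letting the per-plot value win, keeps the rule consistent with the absence of hidden precedence elsewhere in the plotting interface.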

Generate with the default figure root:

carnopy generate my-dataset.yaml

Or select another figure root:

carnopy generate my-dataset.yaml \
  --out outputs/manual-test \
  --figures-out figures/manual-test

Configured figures are written to:

<figures-root>/<run-directory-name>/
├── <plot-name>.<format>
├── <plot-name>.plot.json
└── visualization-report.json

The same YAML requests can be applied later to an existing immutable run. The file may be a full Carnopy configuration or a small file containing only a top-level visualization: section:

carnopy plot outputs/<run> \
  --config plots.yaml \
  --figures-out figures

Batch plotting accepts run directories, not standalone CSV/Parquet files. Scientific generation fields in a full config are ignored; requests are validated against the actual emitted run columns. Manual plot options cannot be combined with --config.

Plots execute independently after dataset finalization. A failed plot preserves the immutable run and any successful figures, records outcomes in the report, and makes the CLI exit with code 1. A zero-valid-row dataset retains exit code 3 and records configured plots as skipped.

Visualization settings do not change config.normalized.json, spec_id, or generation_context_id. They receive their own visualization_request_id = viz-<sha256>. Exact YAML bytes still affect the raw configuration hash.

Generated artifacts and provenance

Each immutable run contains the selected dataset files plus mandatory provenance artifacts:

outputs/<run>/
├── dataset.csv          # when requested
├── dataset.parquet      # when requested
├── config.original.yaml
├── config.normalized.json
├── metadata.json
└── report.json

Runs are staged and then finalized atomically as one directory. Existing final or staging paths are never overwritten.

Identity layers:

  • spec_id: canonical executable scientific specification;
  • generation_context_id: specification plus software and artifact context;
  • output_request_id: canonical dataset serialization request;
  • run_id: one UUID4 execution attempt;
  • artifact hashes: exact emitted bytes;
  • visualization_request_id: normalized visualization request, independent from dataset identity.

Configuration provenance includes SHA-256 hashes of exact source YAML and canonical materialized SI configuration bytes. Metadata records software versions, reference-state policy, canonical fluids and properties, sampling, failure counts, units, fluid constants, and artifact hashes. Carnopy does not store the host source-config path.
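
The difference between hashing exact source bytes and hashing a canonical form can be shown with `hashlib` and `json`. This is a sketch under stated assumptions: the canonicalization below (sorted keys, compact separators, UTF-8) is an assumption for illustration, and Carnopy's actual canonical byte layout may differ.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex digest of the exact bytes given."""
    return hashlib.sha256(data).hexdigest()


def canonical_json_bytes(obj) -> bytes:
    """Assumed canonical form: sorted keys, no insignificant whitespace."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


# Two source texts that differ only in a comment and spacing.
raw_a = b"backend: coolprop\nmode: property_table\n"
raw_b = b"backend: coolprop   # same meaning\nmode: property_table\n"

# The same normalized content, written with different key order.
normalized_a = {"backend": "coolprop", "mode": "property_table"}
normalized_b = {"mode": "property_table", "backend": "coolprop"}

print(sha256_hex(raw_a) == sha256_hex(raw_b))  # raw hashes differ
print(sha256_hex(canonical_json_bytes(normalized_a))
      == sha256_hex(canonical_json_bytes(normalized_b)))  # canonical hashes match
```

The raw hash answers "is this the same file?", while the canonical hash answers "is this the same specification?". Keeping both lets a reader tell a cosmetic edit from a scientific change.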

Parquet schema metadata includes the dataset schema version and unit mapping. Figures are derived artifacts outside the run and are not added to immutable dataset artifact hashes.

Python API

from carnopy import generate_dataset, load_config, validate_config

loaded = load_config("my-dataset.yaml")
validation = validate_config("my-dataset.yaml")
result = generate_dataset(
    "my-dataset.yaml",
    output_root="outputs",
    figures_root="figures",
)

When configured visualization exists, result.visualization contains its request ID, status, figure directory, report path, and outcome counts. result.dataset_formats and result.output_request_id describe the selected table serialization independently of the scientific spec_id.

Manual plotting:

from carnopy.visualization import (
    plot_property_heatmap,
    plot_thermodynamic_diagram,
    plot_xy,
)

heatmap = plot_property_heatmap(
    "outputs/<run>",
    property_name="mass_density",
)

xy = plot_xy(
    "outputs/<run>",
    x="specific_enthalpy",
    y="specific_entropy",
    group_by="pressure",
)

pv = plot_thermodynamic_diagram("outputs/<run>", kind="pv")

The returned Matplotlib figure represents an image that has already been exported. Modifying it does not update the image or provenance sidecar.

Scientific limitations

  • CoolProp is the only backend in Milestone 1.
  • Pure fluids only; mixtures are deferred.
  • Generated data is backend output, not experimental evidence.
  • All backend calls and generated numeric columns use SI.
  • Specific enthalpy, entropy, and internal energy depend on reference state.
  • Carnopy resets every requested fluid to CoolProp DEF before generation and records that policy.
  • CoolProp reference-state mutation is process-global; concurrent embedded use with unrelated CoolProp calculations is unsupported in Milestone 1.
  • Release regression tests compare finalized Parquet values with direct CoolProp calls for representative states in all three modes.
  • Separate sanity checks require the generated normal boiling points of Propane and Cyclopentane at 101325 Pa to remain within the uncertainty intervals published by the NIST Chemistry WebBook. These checks do not establish universal experimental accuracy.
  • Absolute reference-dependent values are not directly comparable across different reference conventions.
  • Visualization reads emitted columns only and is not a second property evaluation layer.
  • ORC generation, additional backends, ML training, GUI, web services, databases, and mixture models are deferred.

Post-alpha work may add an optional cycle-feasibility subsystem that produces traceable screening datasets without turning the property generator into a hidden process simulator. An ORC/TFC contract must explicitly include source and sink profiles, pinch/approach temperatures, pressure losses, component efficiencies, subcooling and superheat margins, cavitation/NPSH constraints, minimum turbine-exhaust quality, and critical/maximum operating limits. Saturated liquid alone is not a pump cavitation margin, and turbine discharge need not universally have vapor mass fraction one.

Development and contribution

Carnopy uses a src/ layout, Hatchling, standalone uv, Ruff, strict mypy, and pytest. pyproject.toml and uv.lock are authoritative.

Normal development:

uv sync --locked --extra all --group dev

Release-readiness tooling:

uv sync --locked --extra all --group dev --group release

Quality gate:

uv lock --check
uv run --locked ruff check .
uv run --locked ruff format --check .
uv run --locked mypy src/carnopy
uv run --locked pytest
uv run --locked python scripts/preflight.py
uv pip check --python .venv/bin/python

Keep changes small and explicit. Public configuration names, semantic property names, SI dataset columns, failure codes, metadata fields, and identity rules are compatibility contracts. Tests use temporary output directories and do not commit generated datasets or figures.

The test count is not a quality target. The suite separates configuration, sampling, three thermodynamic modes, diagnostics, provenance, visualization, CLI behavior, packaging, and release automation. New tests should protect a distinct contract or regression and use parametrization instead of duplicating equivalent cases.

Contributor and coding-agent rules, architecture constraints, commit conventions, and release-maintainer safeguards are in AGENTS.md.

Alpha release procedure

Carnopy 0.1.0a1 is intended to be a functional alpha, not a placeholder package. The PyPI name is claimed only after production PyPI accepts the distribution.

Before release:

  1. Make gcalpay/carnopy public.
  2. Enable GitHub secret scanning and push protection.
  3. Create a protected GitHub environment named pypi with a required human reviewer and a deployment tag rule matching v*.
  4. Register a pending Trusted Publisher on production PyPI:
     Project:      carnopy
     Owner:        gcalpay
     Repository:   carnopy
     Workflow:     publish.yml
     Environment:  pypi

Pending publishers do not reserve the project name. Confirm production name availability immediately before tagging.

Release verification:

uv sync --locked --extra all --group dev --group release
uv lock --check
uv run --locked ruff check .
uv run --locked ruff format --check .
uv run --locked mypy src/carnopy
uv run --locked pytest
uv run --locked python scripts/preflight.py
uv run --locked --group release python -m build
uv run --locked --group release python -m twine check dist/*
uv run --locked python scripts/check_distribution.py dist/*
uv pip check --python .venv/bin/python

The build command uses an isolated build environment by default and installs the declared build backend there. The development environment therefore does not need to be changed solely to run a build. Use the ignored repository-local prerelease/ directory for a non-destructive rehearsal when an existing dist/ must be preserved:

uv run --locked --group release python -m build --outdir prerelease
uv run --locked --group release python -m twine check prerelease/*
uv run --locked python scripts/check_distribution.py prerelease/*

Final approved artifacts are built into dist/ for inspection, hashing, and publication. Carnopy build artifacts should not be written outside the repository.

The human creates and pushes the release tag:

git tag -a v0.1.0a1 -m "Release carnopy 0.1.0a1"
git push origin v0.1.0a1

The publishing workflow tests Python 3.10–3.13, builds one wheel/sdist pair once, verifies and hashes it, waits for production approval, publishes the verified files to PyPI, then downloads and smoke-tests the published release. Only the production publish job receives id-token: write; no long-lived index token or skip-existing behavior is used.

Never rebuild or republish changed files under an uploaded version. Any payload change requires 0.1.0a2 or later. Never move a pushed release tag or delete a release to reuse its version.

License

Carnopy is distributed under the MIT License. See LICENSE.

Download files

Source Distribution

carnopy-0.1.0a1.tar.gz (253.8 kB)

Uploaded Source

Built Distribution

carnopy-0.1.0a1-py3-none-any.whl (90.3 kB)

Uploaded Python 3

File details

Details for the file carnopy-0.1.0a1.tar.gz.

File metadata

  • Download URL: carnopy-0.1.0a1.tar.gz
  • Upload date:
  • Size: 253.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for carnopy-0.1.0a1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 6c1a310b49f70c29a5bfed25aa65d4cdfd29a295cb557931fc385a1f6083ec91 |
| MD5 | a60295a12d675af871a6f9befbdb5f99 |
| BLAKE2b-256 | 222fd562967a3b6a609bd82cfc61f08b8261aa9a9f7e21a72e511e345b138034 |

Provenance

The following attestation bundles were made for carnopy-0.1.0a1.tar.gz:

Publisher: publish.yml on gcalpay/carnopy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file carnopy-0.1.0a1-py3-none-any.whl.

File metadata

  • Download URL: carnopy-0.1.0a1-py3-none-any.whl
  • Upload date:
  • Size: 90.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for carnopy-0.1.0a1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 60c50daf5719f9c59bebf1db30ed87d6836da71e2791af4e2c9ada79be6220c7 |
| MD5 | 052006cfe400a1a6e13d6416abbb4849 |
| BLAKE2b-256 | 4fea6450ed3259f7e783ee190b2c992eb0995ef8212e455b6b4a3d0c4ab3d92b |

Provenance

The following attestation bundles were made for carnopy-0.1.0a1-py3-none-any.whl:

Publisher: publish.yml on gcalpay/carnopy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
