Nuclear data as Parquet — queryable with DuckDB

nucl-parquet

Nuclear data as Parquet files — cross-sections, stopping powers, decay data, and isotopic abundances from all major evaluated libraries. Queryable with DuckDB, Polars, Pandas, or any Arrow-compatible tool.

Installation

pip install nucl-parquet

The pip package is a thin loader (~50 KB). Data files are either cloned from the git repo or downloaded from GitHub Releases:

import nucl_parquet

# Download data to ~/.nucl-parquet/ (first time only)
nucl_parquet.download()

Or clone the repo directly for the full dataset:

git clone https://github.com/exoma-ch/nucl-parquet.git
export NUCL_PARQUET_DATA=/path/to/nucl-parquet/data

Usage

import nucl_parquet

db = nucl_parquet.connect()

# Cross-section query
db.sql("SELECT * FROM tendl_2024 WHERE target_A=63 AND residual_Z=30")

# Compare all libraries
db.sql("SELECT library, energy_MeV, xs_mb FROM xs WHERE target_A=63 AND residual_Z=30")

# Decay chain
db.sql(nucl_parquet.DECAY_CHAIN_SQL, params={"parent_z": 92, "parent_a": 238})

# Stopping power — light ions (NIST PSTAR/ASTAR/ESTAR)
nucl_parquet.elemental_dedx(db, "p", 29, 10.0)     # protons in Cu at 10 MeV
nucl_parquet.elemental_dedx(db, "e", 29, 1.0)      # electrons in Cu at 1 MeV
nucl_parquet.compound_dedx(db, "p", [(29, 0.5), (30, 0.5)], 10.0)

# Stopping power — heavy ions (CatIMA, any isotope of Z=1-92)
nucl_parquet.elemental_dedx(db, "c12",  6, 12 * 100.0)   # C-12 in C at 100 MeV/u
nucl_parquet.elemental_dedx(db, "pb208", 82, 208 * 50.0)  # Pb-208 in Pb at 50 MeV/u
nucl_parquet.elemental_dedx(db, "xe132", 14, 132 * 50.0)  # Xe-132 in Si at 50 MeV/u

# Heavy-ion total reaction cross-sections (Tripathi 1997)
db.sql("SELECT * FROM hi_xs WHERE target_Z=29 ORDER BY energy_MeV")  # c12 on Cu
db.sql("""
    SELECT energy_MeV, energy_MeV/12 AS energy_MeV_u, xs_mb
    FROM hi_xs WHERE target_Z=6
""")  # c12 on C — typical carbon therapy channel

# Heavy-ion production cross-sections (Geant4 INCL++/ABLA07)
# σ(Zf, Af, E) per residual isotope — 6 projectiles × 92 targets × ~60 energies
db.sql("""
    SELECT residual_Z, residual_A, energy_MeV, xs_mb
    FROM hi_xs_prod
    WHERE target_Z=29 AND library='hi-xs-prod'
    ORDER BY residual_Z, residual_A, energy_MeV
""")  # all C-12 fragmentation products on Cu
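
The compound_dedx() call above takes a list of (Z, weight) pairs. A minimal sketch of how such a combination could work, assuming the weights are mass fractions and the elemental values are combined by Bragg additivity (the helper name and the placeholder numbers are illustrative, not part of the package):

```python
def bragg_mass_stopping_power(components):
    """Combine elemental mass stopping powers by Bragg additivity.

    components: iterable of (mass_fraction, dedx) pairs, with dedx in MeV cm²/g.
    Assumption: weights are mass fractions; they are renormalized to sum to 1.
    """
    components = list(components)
    total_weight = sum(weight for weight, _ in components)
    if total_weight <= 0:
        raise ValueError("mass fractions must sum to a positive value")
    return sum(weight * dedx for weight, dedx in components) / total_weight


# Placeholder stopping powers, not real data: equal mass fractions of two elements.
mixture = bragg_mass_stopping_power([(0.5, 10.0), (0.5, 20.0)])
```

Bragg additivity ignores chemical-binding effects, so it is only an approximation for real compounds.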

Data resolution

connect() finds data in this order:

  1. Explicit data_dir argument
  2. $NUCL_PARQUET_DATA environment variable
  3. Sibling repo checkout (when running from source)
  4. ~/.nucl-parquet/ (downloaded via nucl_parquet.download())
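
The lookup order above can be sketched as a simple first-match search. This is an illustration only, not the package's actual implementation: the function name, the sibling_checkout parameter, and the choice to skip candidates that do not exist are all assumptions.

```python
import os
from pathlib import Path


def resolve_data_dir(data_dir=None, sibling_checkout=None):
    """Return the first existing candidate directory, in the documented order."""
    candidates = [
        data_dir,                             # 1. explicit argument
        os.environ.get("NUCL_PARQUET_DATA"),  # 2. environment variable
        sibling_checkout,                     # 3. repo checkout next to the source tree
        Path.home() / ".nucl-parquet",        # 4. download cache
    ]
    for candidate in candidates:
        # Skip unset entries and paths that are not directories.
        if candidate and Path(candidate).is_dir():
            return Path(candidate)
    raise FileNotFoundError("no nucl-parquet data directory found")
```

With this ordering, an explicit argument always wins over the environment variable, which in turn wins over any checkout or cached download.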

Why Parquet instead of ENDF-6?

The ENDF-6 format inherits its layout from the original ENDF format of the 1960s, which was designed for Fortran on punch cards: 80-character fixed-width records, implicit column positions, and a cryptic MF/MT numbering system.

| | ENDF-6 | Parquet |
|---|---|---|
| Format | Fixed-width Fortran text, 80-char cards | Columnar binary, self-describing schema |
| Parsers needed | Specialized (NJOY, PREPRO, FUDGE, endf pkg) | Any language: Python, R, Julia, Rust, JS, SQL |
| Random access | Sequential parse from start | Predicate pushdown, skip irrelevant row groups |
| Compression | None (or gzip'd text) | zstd columnar compression (5-10x smaller) |
| Cross-library comparison | Convert each library separately first | SELECT * FROM '*/xs/p_Cu.parquet' |
| Browser/WASM | Not feasible | Works natively (DuckDB-WASM, Pyodide) |

Size comparison for the same data:

| Library | ENDF-6 (zipped) | Parquet (zstd) | Reduction |
|---|---|---|---|
| TENDL-2025 neutron | ~800 MB (2850 zip files) | 25 MB | 32x |
| ENDF/B-VIII.1 (all) | ~120 MB | 4.3 MB | 28x |
| JENDL-5 (all) | ~200 MB | 8.6 MB | 23x |

Libraries included

| Library | Projectiles | Source |
|---|---|---|
| TENDL-2024 | n, p, d, t, ³He, α | IAEA/PSI |
| TENDL-2025 | n, p, d, t, ³He, α | PSI |
| ENDF/B-VIII.1 | n, p, d, t, ³He, α | NNDC/BNL |
| JEFF-4.0 | n, p | NEA |
| JENDL-5 | n, p, d, α | JAEA |
| CENDL-3.2 | n | CIAE |
| BROND-3.1 | n | IPPE |
| FENDL-3.2 | n | IAEA |
| EAF-2010 | n | CCFE |
| IRDFF-II | n | IAEA |
| IAEA-Medical | p, d | IAEA |
| EXFOR | n, p, d, t, ³He, α | IAEA NDS (experimental) |
| HI-XS (Tripathi 1997) | p, ⁴He, ¹²C, ¹⁶O, ²⁰Ne, ²⁸Si, ⁴⁰Ar, ⁴⁰Ca, ⁵⁶Fe, ⁵⁸Ni, ¹³²Xe, ²⁰⁸Pb | semi-empirical (Tripathi 1997) |
| HI-XS Production (Geant4 INCL++/ABLA07) | ¹²C, ¹⁶O, ²⁰Ne, ²⁸Si, ⁴⁰Ar, ⁵⁶Fe | Geant4 11.3.2 Monte Carlo |

Parquet schemas

Evaluated cross-sections ({library}/xs/*.parquet):

| Column | Type | Description |
|---|---|---|
| target_A | Int32 | Target mass number |
| residual_Z | Int32 | Product atomic number |
| residual_A | Int32 | Product mass number |
| state | Utf8 | Isomer state: "", "g", "m" |
| energy_MeV | Float64 | Projectile energy in MeV |
| xs_mb | Float64 | Cross-section in millibarn |

EXFOR experimental (exfor/*.parquet):

| Column | Type | Description |
|---|---|---|
| exfor_entry | Utf8 | EXFOR accession number |
| target_Z | Int32 | Target atomic number |
| target_A | Int32 | Target mass number (0 = natural) |
| residual_Z | Int32 | Product atomic number |
| residual_A | Int32 | Product mass number |
| state | Utf8 | Isomer state |
| energy_MeV | Float64 | Projectile energy in MeV |
| energy_err_MeV | Float64 | Energy uncertainty (nullable) |
| xs_mb | Float64 | Cross-section in millibarn |
| xs_err_mb | Float64 | Cross-section uncertainty (nullable) |
| author | Utf8 | First author |
| year | Int32 | Publication year |

Stopping powers (stopping/{source}.parquet — one file per source):

| Column | Type | Description |
|---|---|---|
| source | Utf8 | PSTAR, ASTAR, ESTAR, dSTAR, tSTAR, He3STAR, catima_C12, … |
| target_Z | Int32 | Target element Z (1–92) |
| energy_MeV | Float64 | Projectile kinetic energy (MeV, total) |
| dedx | Float64 | Mass stopping power (MeV cm²/g) |

Files: PSTAR.parquet, ASTAR.parquet, ESTAR.parquet, dSTAR.parquet, tSTAR.parquet, He3STAR.parquet, and catima_{beam}.parquet for C12/O16/Ne20/Si28/Ar40/Fe56. The full 92×92 CatIMA matrix (energies in MeV/u) lives separately at stopping/catima/catima.parquet.

dSTAR, tSTAR, and He3STAR are velocity-scaled from PSTAR/ASTAR. This scaling is exact for electronic stopping, because the projectile charge Z_proj and velocity fully determine dE/dx. For elements not in the NIST table (e.g. Ra, Rn, Ac, Po, Fr, At, Tc, Pm), elemental_dedx() automatically falls back to CatIMA (Bethe-Bloch), which covers all Z=1–92.
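
To make the velocity scaling concrete: two projectiles have the same velocity when their kinetic energy per unit rest mass is equal, so a deuteron at energy E is looked up in the proton table at E·m_p/m_d. The sketch below illustrates that idea; the function names and the reference_dedx callable are hypothetical and do not reflect how the package builds its tables.

```python
# Rest masses in MeV/c² (rounded values).
MASS_MEV = {
    "p": 938.272,
    "d": 1875.613,
    "t": 2808.921,
    "he3": 2808.391,
    "a": 3727.379,
}


def equal_velocity_energy(energy_mev, projectile, reference):
    """Kinetic energy at which `reference` has the same velocity as `projectile`."""
    return energy_mev * MASS_MEV[reference] / MASS_MEV[projectile]


def scaled_dedx(reference_dedx, energy_mev, projectile, reference):
    """Evaluate a same-charge reference table at the equal-velocity energy.

    reference_dedx: callable mapping reference-particle energy (MeV) to dE/dx.
    """
    return reference_dedx(equal_velocity_energy(energy_mev, projectile, reference))


# A 20 MeV deuteron moves as fast as a proton of roughly 10 MeV.
proton_energy = equal_velocity_energy(20.0, "d", "p")
```

The reference particle must carry the same charge as the projectile: protons for d and t, alphas for ³He.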

Heavy-ion total reaction cross-sections (hi-xs/xs/{proj}_{target}.parquet):

Tripathi (1997) semi-empirical parameterization — total reaction cross-sections for all 12 projectiles against all 92 target elements. Energy stored as total MeV for the projectile; 1–1000 MeV/u range, 60 log-spaced points.

| Column | Type | Description |
|---|---|---|
| target_Z | Int32 | Target atomic number (1–92) |
| target_A | Int32 | Target mass number (most-abundant stable isotope) |
| energy_MeV | Float64 | Total projectile kinetic energy (MeV) |
| xs_mb | Float64 | Total reaction cross-section (mb) |
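
Because energy_MeV holds the total projectile energy while the grid is defined per nucleon, it can help to see both side by side. The following sketch rebuilds a 60-point log-spaced grid from 1 to 1000 MeV/u; it assumes both endpoints are included and exactly log-uniform spacing, which the files themselves may not follow to the last digit.

```python
import math


def log_energy_grid(mass_number, lo_mev_u=1.0, hi_mev_u=1000.0, n_points=60):
    """Return (MeV/u, total MeV) pairs on a log-spaced per-nucleon grid."""
    step = math.log10(hi_mev_u / lo_mev_u) / (n_points - 1)
    per_nucleon = [lo_mev_u * 10 ** (i * step) for i in range(n_points)]
    # Total kinetic energy is the per-nucleon energy times the mass number A.
    return [(energy, energy * mass_number) for energy in per_nucleon]


grid_c12 = log_energy_grid(12)  # C-12: 1 MeV/u corresponds to 12 MeV total
```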

Heavy-ion production cross-sections (hi-xs-prod/xs/{proj}_{target}.parquet):

Geant4 11.3.2 FTFP_INCLXX physics list (INCL++ cascade + ABLA07 de-excitation) — per-isotope fragment production cross-sections σ(Zf,Af,E) in mb, normalized to Tripathi (1997) σ_R. Covers C-12, O-16, Ne-20, Si-28, Ar-40, Fe-56 projectiles against all 92 target elements, ~60 log-spaced energy points from 1–1000 MeV/u.

| Column | Type | Description |
|---|---|---|
| proj_Z | Int32 | Projectile atomic number |
| proj_A | Int32 | Projectile mass number |
| target_Z | Int32 | Target atomic number (1–92) |
| target_A | Int32 | Target mass number (most-abundant stable isotope) |
| residual_Z | Int32 | Fragment atomic number |
| residual_A | Int32 | Fragment mass number |
| energy_MeV | Float64 | Mean actual reaction vertex energy (MeV total) |
| xs_mb | Float64 | Production cross-section (mb) |
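
One plausible reading of "normalized to Tripathi (1997) σ_R" is that each fragment's Monte Carlo yield per reaction is multiplied by the total reaction cross-section. The sketch below shows that arithmetic under this assumption; the function name and the toy event list are illustrative, and the actual generation pipeline may differ.

```python
from collections import Counter


def production_xs(fragments, n_reactions, sigma_r_mb):
    """Convert fragment counts into production cross-sections in mb.

    fragments: iterable of (Z, A) tuples, one per fragment produced.
    n_reactions: number of simulated inelastic reactions.
    sigma_r_mb: total reaction cross-section (mb) used for normalization.
    """
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    counts = Counter(fragments)
    return {za: sigma_r_mb * n / n_reactions for za, n in counts.items()}


# Toy numbers, not real data: 4 reactions yielding three fragments in total.
xs = production_xs([(5, 11), (5, 11), (4, 9)], n_reactions=4, sigma_r_mb=1000.0)
```

Since one reaction can yield several fragments, the per-isotope values need not sum to σ_R.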

Heavy-ion stopping powers (stopping/catima.parquet):

Full 92×92 matrix — all projectile elements Z=1–92 against all target elements Z=1–92, computed with CatIMA. Energy stored in MeV/u; isotope-independent (divide total MeV by A to look up).

| Column | Type | Description |
|---|---|---|
| proj_Z | Int32 | Projectile atomic number (1–92) |
| target_Z | Int32 | Target atomic number (1–92) |
| energy_MeV_u | Float64 | Kinetic energy per nucleon (MeV/u) |
| dedx | Float64 | Mass stopping power (MeV cm²/g) |
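
Since this table is keyed on energy per nucleon, a total kinetic energy must be divided by the projectile mass number before lookup. Below is a minimal sketch of preparing such a lookup. The table name catima in the SQL text and both helper names are assumptions, and the query picks the nearest grid point rather than interpolating.

```python
# Hypothetical query: assumes the matrix is exposed as a table or view named "catima".
CATIMA_NEAREST_SQL = """
    SELECT energy_MeV_u, dedx
    FROM catima
    WHERE proj_Z = ? AND target_Z = ?
    ORDER BY abs(energy_MeV_u - ?)
    LIMIT 1
"""


def per_nucleon_energy(total_mev, mass_number):
    """Convert total kinetic energy (MeV) to energy per nucleon (MeV/u)."""
    if mass_number <= 0:
        raise ValueError("mass number must be positive")
    return total_mev / mass_number


def catima_lookup_params(proj_z, proj_a, target_z, total_mev):
    """Positional parameters for CATIMA_NEAREST_SQL."""
    return [proj_z, target_z, per_nucleon_energy(total_mev, proj_a)]


# C-12 in carbon at 1200 MeV total, i.e. 100 MeV/u.
params = catima_lookup_params(6, 12, 6, 1200.0)
```

The resulting list can be passed as positional parameters to any DB-API style execute call.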

Development

# Install dev dependencies
uv sync --dev

# Run unit tests (no data needed)
uv run pytest tests/test_loader.py -v

# Run full test suite (requires data)
uv run pytest tests/ -v

License

MIT
