Dendros

A Python toolkit for analyzing the results of the Galacticus semi-analytic model: both HDF5 model outputs and posterior-sample ("MCMC") chain logs.


Installation

pip install dendros

To also enable pandas and tabulate table output:

pip install 'dendros[pandas,tabulate]'

To enable plotting of Galacticus /analyses results (requires matplotlib):

pip install 'dendros[plot]'

Install the latest development version directly from GitHub:

pip install git+https://github.com/galacticusorg/dendros.git

Quickstart

Opening files

from dendros import open_outputs

# Single file
c = open_outputs("galacticus.hdf5")

# Auto-detect MPI-split outputs (given any one rank's file)
c = open_outputs("galacticus_MPI:0000.hdf5")

# Explicit list of files
c = open_outputs(["rank0.hdf5", "rank1.hdf5"])

# Glob pattern
c = open_outputs("run001/galacticus*.hdf5")

# Lightcone run (different top-level group)
c = open_outputs("lightcone.hdf5", output_root="Lightcone")

open_outputs returns a Collection. Use it as a context manager to ensure the files are closed:

with open_outputs("galacticus.hdf5") as c:
    ...

Checking completion status

Galacticus writes a statusCompletion attribute when a run finishes. validate_completion raises an error if any file is incomplete:

with open_outputs("galacticus.hdf5") as c:
    c.validate_completion()           # raises RuntimeError if incomplete
    c.validate_completion(mode="warn")    # emit warning instead
    c.validate_completion(mode="ignore")  # do nothing
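
The three modes follow a familiar raise/warn/ignore pattern. The sketch below is not dendros's implementation: `check_completion`, its `statuses` mapping, and the name "error" for the default mode are all hypothetical, chosen only to illustrate the behaviour described above.

```python
import warnings


def check_completion(statuses, mode="error"):
    """Return the names of incomplete files and react according to `mode`.

    `statuses` maps a file name to True (run finished) or False (incomplete).
    Hypothetical helper: a real check would read the statusCompletion
    attribute from each HDF5 file instead of taking booleans.
    """
    if mode not in ("error", "warn", "ignore"):
        raise ValueError(f"unknown mode: {mode!r}")
    incomplete = sorted(name for name, done in statuses.items() if not done)
    if incomplete and mode != "ignore":
        message = "incomplete Galacticus output(s): " + ", ".join(incomplete)
        if mode == "error":
            raise RuntimeError(message)
        warnings.warn(message)
    return incomplete
```

Returning the list of incomplete files in every mode lets a caller log or skip them even when errors are suppressed.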

Listing available outputs

with open_outputs("galacticus.hdf5") as c:
    tbl = c.list_outputs()          # astropy Table by default
    print(tbl)

    # or as a pandas DataFrame:
    df = c.list_outputs(format="pandas")

    # or as a tabulate-formatted string:
    txt = c.list_outputs(format="tabulate")

Example output:

index  name     time   scale_factor  redshift
----- ------- -------- ------------ ---------
    1 Output1  13.8        1.0          0.0
    2 Output2   6.0        0.5          1.0
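
The scale_factor and redshift columns are tied together by the standard cosmological identity a = 1/(1+z), which is why the example row with scale factor 0.5 has redshift 1.0. A minimal sketch of the conversion (the function names are illustrative and not part of dendros):

```python
def redshift_from_scale_factor(a):
    """Convert an expansion scale factor a (a = 1 today) to redshift z."""
    if a <= 0:
        raise ValueError("scale factor must be positive")
    return 1.0 / a - 1.0


def scale_factor_from_redshift(z):
    """Convert redshift z to the expansion scale factor a = 1 / (1 + z)."""
    if z <= -1:
        raise ValueError("redshift must be greater than -1")
    return 1.0 / (1.0 + z)
```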

You can also access the index object directly:

with open_outputs("galacticus.hdf5") as c:
    for meta in c.outputs:
        print(meta.name, meta.redshift)

Listing available properties

with open_outputs("galacticus.hdf5") as c:
    tbl = c.list_properties("Output1")   # by name
    tbl = c.list_properties(1)           # by 1-based integer index
    print(tbl)

Example output:

name        dtype   shape   description          unitsInSI
----------- ------- ------- -------------------- ---------
haloMass    float64 (1000,) Halo virial mass     1.989e+30
stellarMass float64 (1000,) Stellar mass of disk 1.989e+30
...
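
The unitsInSI column is read here as a multiplicative factor that converts a stored value to SI units. The value 1.989e+30 is the mass of the Sun in kilograms, which suggests both example properties are stored in solar masses. Under that assumption, a conversion could look like this (`to_si` is a hypothetical helper, not a dendros function):

```python
SOLAR_MASS_KG = 1.989e30  # the value shown in the unitsInSI column above


def to_si(values, units_in_si):
    """Scale values in model units to SI by the unitsInSI factor."""
    return [value * units_in_si for value in values]


# Two illustrative halo masses in solar masses, converted to kilograms.
halo_masses_msun = [1.0e12, 2.5e11]
halo_masses_kg = to_si(halo_masses_msun, SOLAR_MASS_KG)
```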

Reading datasets

with open_outputs("galacticus.hdf5") as c:
    # List of dataset paths → same strings used as dict keys
    data = c.read("Output1", ["nodeData/basicMass", "nodeData/diskMassStellar"])
    print(data["nodeData/basicMass"])   # numpy array

    # Dict → custom labels
    data = c.read(
        "Output1",
        {"Mhalo": "nodeData/basicMass", "Mstar": "nodeData/diskMassStellar"},
    )
    print(data["Mhalo"])
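
The two call forms differ only in how the result keys are chosen: a list behaves like a dict whose labels equal the dataset paths. The helper below is a hypothetical illustration of that equivalence, not dendros code. It rejects a bare string so that the string is not iterated character by character; whether dendros itself accepts a single string is not covered here.

```python
def normalize_datasets(datasets):
    """Return a {label: path} dict for either a list of paths or a dict."""
    if isinstance(datasets, dict):
        return dict(datasets)  # labels were supplied explicitly
    if isinstance(datasets, str):
        raise TypeError("pass a list of paths or a {label: path} dict")
    return {path: path for path in datasets}  # each path labels itself
```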

Filtering galaxies

Pass a boolean mask or integer index array as where:

with open_outputs("galacticus.hdf5") as c:
    # First read to build a mask
    masses = c.read("Output1", ["nodeData/basicMass"])["nodeData/basicMass"]
    mask = masses > 1e12

    # Then read everything for the selected galaxies only
    data = c.read(
        "Output1",
        {"Mhalo": "nodeData/basicMass", "Mstar": "nodeData/diskMassStellar"},
        where=mask,
    )

h5py-like browsing

with open_outputs("galacticus.hdf5") as c:
    print(c.keys())                        # top-level groups
    grp = c["Outputs/Output1"]
    print(grp.keys())                      # subgroups / datasets
    print(grp.attrs)                       # group attributes
    ds = c["Outputs/Output1/nodeData/basicMass"]
    print(ds.dtype, ds.shape)

Plotting analyses

If a Galacticus run was configured to write reduced analysis results, the HDF5 file will contain a top-level /analyses group with one subgroup per analysis. Dendros can list those analyses and plot each model curve with its observational/target overlay. Requires the [plot] extra.

For MPI runs, the /analyses data is reduced over all ranks and is identical in every rank's file, so dendros reads only the primary file.

with open_outputs("galacticus.hdf5") as c:
    print(c.list_analyses())                     # tabulate available analyses

    figs = c.plot_analyses()                     # one matplotlib Figure per analysis
    figs = c.plot_analyses(name="stellarMassFunction",
                           output_directory="figs",
                           file_format="pdf")    # also save to disk

MPI outputs

When Galacticus runs with MPI, it writes one file per rank with the suffix _MPI:NNNN (e.g. galacticus_MPI:0000.hdf5, galacticus_MPI:0001.hdf5, …). All ranks contain identical metadata groups; galaxy datasets are split across ranks.

open_outputs handles this automatically:

# Any single-rank file → auto-detects all peers
c = open_outputs("galacticus_MPI:0000.hdf5")

# Or pass a glob pattern (an explicit list of files also works)
c = open_outputs("galacticus_MPI:????.hdf5")

c.read(...) transparently concatenates arrays across all ranks along axis 0.
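
As a rough picture of the auto-detection, assume peers are found by swapping the four-digit rank for a wildcard. `mpi_peer_pattern` below is hypothetical, and dendros's actual discovery logic may differ.

```python
import fnmatch
import glob
import re

# Matches the rank suffix only when it is followed by an extension or the end.
_MPI_SUFFIX = re.compile(r"_MPI:\d{4}(?=\.|$)")


def mpi_peer_pattern(path):
    """Return a glob pattern covering every rank's file, or None if `path`
    carries no _MPI:NNNN suffix."""
    match = _MPI_SUFFIX.search(path)
    if match is None:
        return None
    # Escape the rest of the path so only the rank digits act as wildcards.
    prefix = glob.escape(path[:match.start()])
    suffix = glob.escape(path[match.end():])
    return prefix + "_MPI:????" + suffix


pattern = mpi_peer_pattern("galacticus_MPI:0000.hdf5")
is_peer = fnmatch.fnmatchcase("galacticus_MPI:0001.hdf5", pattern)
```

Note that `?` matches any single character, not only digits, so a stricter implementation would re-check candidate names against the regular expression.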


Lightcone outputs

For lightcone runs the top-level group is typically Lightcone rather than Outputs. Pass output_root to override the default:

c = open_outputs("lightcone.hdf5", output_root="Lightcone")

MCMC analysis

Dendros also reads Galacticus posterior-sample ("MCMC") chain logs given the config XML used to drive the run, and provides convergence diagnostics, post-burn analyses, parameter-file emission, and corner plots:

from dendros import open_mcmc

with open_mcmc("mcmcConfig.xml") as run:
    outliers = run.outlier_chains()
    step = run.convergence_step(threshold=1.1, drop_chains=outliers)

    ess = run.effective_sample_size(post_burn=step)
    fit = run.multivariate_normal_fit(post_burn=step, drop_chains=outliers)
    fit.write_reparameterization_config("reparam.xml")

    map_ = run.maximum_posterior(drop_chains=outliers)
    run.write_parameter_files(map_.state, "max_posterior")

    fig = run.corner_plot(post_burn=step, drop_chains=outliers)

Brooks-Gelman corrected Rhat (with the non-parametric R_interval companion), Geweke z-scores, an iterative Grubbs outlier test on chain final states, Sokal-windowed autocorrelation times, effective sample sizes, sliding-window acceptance rates, projection-pursuit PCA, multivariate-normal fits with reparameterization-config emission, posterior sampling, and base-parameter file generation are all supported. Corner plots require the optional extra: pip install 'dendros[mcmc]'. See the MCMC docs page for details.
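
For intuition about what an Rhat threshold such as 1.1 measures, here is a sketch of the basic Gelman-Rubin statistic for a single parameter. It is deliberately the simple, uncorrected form, not the Brooks-Gelman corrected variant named above, and it is not taken from dendros; `gelman_rubin_rhat` is an illustrative name.

```python
import math
import random
import statistics


def gelman_rubin_rhat(chains):
    """Basic Gelman-Rubin Rhat for equal-length chains of one parameter."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two equal-length chains of length >= 2")
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: mean within-chain sample variance.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    if within == 0:
        raise ValueError("chains have zero within-chain variance")
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Offset each chain differently to mimic chains stuck in separate regions.
stuck = [[x + 3.0 * k for x in chain] for k, chain in enumerate(mixed)]

rhat_mixed = gelman_rubin_rhat(mixed)  # close to 1: chains agree
rhat_stuck = gelman_rubin_rhat(stuck)  # far above 1.1: chains disagree
```

Values near 1 indicate that the between-chain spread is consistent with the within-chain spread, and a cutoff such as 1.1 flags chains that have not yet mixed.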


Documentation

Full API reference and more examples are available at dendros.readthedocs.io.


Contributing

See CONTRIBUTING.md for development setup, coding style, and how to propose changes.


License

Dendros is released under the GNU General Public License v3.0 or later.
