A Python toolkit for analyzing Galacticus semi-analytic model outputs.

Dendros

A Python toolkit for analyzing Galacticus semi-analytic model outputs — both HDF5 model outputs and posterior-sample ("MCMC") chain logs.


Installation

pip install dendros

To also enable pandas and tabulate table output:

pip install 'dendros[pandas,tabulate]'

To enable plotting of Galacticus /analyses results (requires matplotlib):

pip install 'dendros[plot]'

Install the latest development version directly from GitHub:

pip install git+https://github.com/galacticusorg/dendros.git

Quickstart

Opening files

from dendros import open_outputs

# Single file
c = open_outputs("galacticus.hdf5")

# Auto-detect MPI-split outputs (given any one rank's file)
c = open_outputs("galacticus_MPI:0000.hdf5")

# Explicit list of files
c = open_outputs(["rank0.hdf5", "rank1.hdf5"])

# Glob pattern
c = open_outputs("run001/galacticus*.hdf5")

# Lightcone run (different top-level group)
c = open_outputs("lightcone.hdf5", output_root="Lightcone")

open_outputs returns a Collection. Use it as a context manager to ensure the underlying files are closed:

with open_outputs("galacticus.hdf5") as c:
    ...

Checking completion status

Galacticus writes a statusCompletion attribute when a run finishes. validate_completion raises an error if any file is incomplete:

with open_outputs("galacticus.hdf5") as c:
    c.validate_completion()           # raises RuntimeError if incomplete
    c.validate_completion(mode="warn")    # emit warning instead
    c.validate_completion(mode="ignore")  # do nothing
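
The three modes follow a common raise/warn/ignore pattern. The sketch below illustrates that pattern only and is not dendros's actual implementation; the helper name `check_completion`, its `is_complete` argument, and the default mode name `"raise"` are assumptions made for illustration:

```python
import warnings


def check_completion(is_complete, mode="raise"):
    """Hypothetical helper: react to an incomplete run according to `mode`."""
    if mode not in ("raise", "warn", "ignore"):
        raise ValueError(f"unknown mode: {mode!r}")
    if is_complete or mode == "ignore":
        return
    message = "Galacticus output is incomplete"
    if mode == "raise":
        raise RuntimeError(message)
    # mode == "warn": report the problem but let the caller continue
    warnings.warn(message, RuntimeWarning)
```

The warn mode is useful when inspecting a run that is still in progress, where a hard failure would get in the way.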

Listing available outputs

with open_outputs("galacticus.hdf5") as c:
    tbl = c.list_outputs()          # astropy Table by default
    print(tbl)

    # or as a pandas DataFrame:
    df = c.list_outputs(format="pandas")

    # or as a tabulate-formatted string:
    text = c.list_outputs(format="tabulate")

Example output:

index name    time scale_factor redshift
----- ------- ---- ------------ --------
    1 Output1 13.8          1.0      0.0
    2 Output2  6.0          0.5      1.0
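
The scale_factor and redshift columns are tied together by the standard cosmological identity a = 1 / (1 + z), which is why the example rows pair 1.0 with 0.0 and 0.5 with 1.0. A short check of that relation (the function names are illustrative and not part of dendros):

```python
def redshift_from_scale_factor(a):
    """Standard relation z = 1/a - 1, with a = 1 at the present day."""
    return 1.0 / a - 1.0


def scale_factor_from_redshift(z):
    """Inverse relation a = 1 / (1 + z)."""
    return 1.0 / (1.0 + z)


for a in (1.0, 0.5):
    print(a, redshift_from_scale_factor(a))
```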

You can also access the index object directly:

with open_outputs("galacticus.hdf5") as c:
    for meta in c.outputs:
        print(meta.name, meta.redshift)
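
The per-output metadata makes it easy to pick an output by redshift. The sketch below uses stand-in records so that it runs without a Galacticus file; `OutputMeta` and `nearest_output` are hypothetical names, and only the `name` and `redshift` attributes are taken from the example above:

```python
from collections import namedtuple

# Stand-in for the objects yielded by `c.outputs` (hypothetical type).
OutputMeta = namedtuple("OutputMeta", ["name", "redshift"])


def nearest_output(outputs, target_redshift):
    """Return the record whose redshift is closest to `target_redshift`."""
    return min(outputs, key=lambda meta: abs(meta.redshift - target_redshift))


outputs = [OutputMeta("Output1", 0.0), OutputMeta("Output2", 1.0)]
print(nearest_output(outputs, 0.8).name)
```

With a real file, passing `c.outputs` in place of the stand-in list should behave the same way, provided each record exposes `redshift` as shown above.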

Listing available properties

with open_outputs("galacticus.hdf5") as c:
    tbl = c.list_properties("Output1")   # by name
    tbl = c.list_properties(1)           # by 1-based integer index
    print(tbl)

Example output:

name        dtype   shape   description          unitsInSI
----------- ------- ------- -------------------- ---------
haloMass    float64 (1000,) Halo virial mass     1.989e+30
stellarMass float64 (1000,) Stellar mass of disk 1.989e+30
...
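
The unitsInSI column appears to hold the multiplicative factor that converts a stored value to SI units. The value 1.989e+30 is the solar mass in kilograms, which suggests these masses are stored in solar masses. A minimal sketch of the conversion under that interpretation (the function name is illustrative):

```python
SOLAR_MASS_KG = 1.989e30  # value shown in the unitsInSI column above


def to_si(value, units_in_si):
    """Convert a stored value to SI by multiplying by its unitsInSI factor."""
    return value * units_in_si


halo_mass_kg = to_si(1.0e12, SOLAR_MASS_KG)  # a 10^12 solar-mass halo
print(f"{halo_mass_kg:.3e} kg")
```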

Reading datasets

with open_outputs("galacticus.hdf5") as c:
    # List of dataset paths → same strings used as dict keys
    data = c.read("Output1", ["nodeData/basicMass", "nodeData/diskMassStellar"])
    print(data["nodeData/basicMass"])   # numpy array

    # Dict → custom labels
    data = c.read(
        "Output1",
        {"Mhalo": "nodeData/basicMass", "Mstar": "nodeData/diskMassStellar"},
    )
    print(data["Mhalo"])
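
The two call styles differ only in how the returned dictionary is keyed. One way to picture this is that a list is treated as a mapping from each path to itself. The sketch below illustrates that idea and is not dendros's internal code; `normalize_spec` is a hypothetical helper:

```python
def normalize_spec(spec):
    """Return a {label: dataset_path} mapping for a list or dict of paths."""
    if isinstance(spec, dict):
        return dict(spec)
    if isinstance(spec, str):
        # Guard added for this sketch: avoid iterating over characters.
        spec = [spec]
    # A list of paths: each path doubles as its own label.
    return {path: path for path in spec}


print(normalize_spec(["nodeData/basicMass"]))
print(normalize_spec({"Mhalo": "nodeData/basicMass"}))
```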

Filtering galaxies

Pass a boolean mask or an integer index array as the where argument:

with open_outputs("galacticus.hdf5") as c:
    # First read to build a mask
    masses = c.read("Output1", ["nodeData/basicMass"])["nodeData/basicMass"]
    mask = masses > 1e12

    # Then read everything for the selected galaxies only
    data = c.read(
        "Output1",
        {"Mhalo": "nodeData/basicMass", "Mstar": "nodeData/diskMassStellar"},
        where=mask,
    )
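
A boolean mask and an integer index array select the same rows when the indices are the positions at which the mask is true. A small NumPy illustration of that equivalence, using invented masses rather than real Galacticus data:

```python
import numpy as np

masses = np.array([3.0e11, 2.5e12, 8.0e12, 9.0e10])  # invented halo masses

mask = masses > 1e12             # boolean mask, one entry per galaxy
indices = np.flatnonzero(mask)   # equivalent integer index array

# Both forms pick out the same galaxies.
assert np.array_equal(masses[mask], masses[indices])
print(indices, masses[mask])
```

An index array is handy when the selection comes from a sort or a cross-match rather than from a threshold.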

h5py-like browsing

with open_outputs("galacticus.hdf5") as c:
    print(c.keys())                        # top-level groups
    grp = c["Outputs/Output1"]
    print(grp.keys())                      # subgroups / datasets
    print(grp.attrs)                       # group attributes
    ds = c["Outputs/Output1/nodeData/basicMass"]
    print(ds.dtype, ds.shape)

Plotting analyses

If a Galacticus run was configured to write reduced analysis results, the HDF5 file will contain a top-level /analyses group with one subgroup per analysis. Dendros can list those analyses and plot each model curve with its observational/target overlay. Requires the [plot] extra.

For MPI runs, the /analyses data is reduced over all ranks and is identical in every rank's file, so dendros reads only the primary file.

with open_outputs("galacticus.hdf5") as c:
    print(c.list_analyses())                     # tabulate available analyses

    figs = c.plot_analyses()                     # one matplotlib Figure per analysis
    figs = c.plot_analyses(name="stellarMassFunction",
                           output_directory="figs",
                           file_format="pdf")    # also save to disk

MPI outputs

When Galacticus runs with MPI, it writes one file per rank with the suffix _MPI:NNNN (e.g. galacticus_MPI:0000.hdf5, galacticus_MPI:0001.hdf5, …). All ranks contain identical metadata groups; galaxy datasets are split across ranks.

open_outputs handles this automatically:

# Any single-rank file → auto-detects all peers
c = open_outputs("galacticus_MPI:0000.hdf5")

# Or pass a glob pattern (an explicit list of files also works)
c = open_outputs("galacticus_MPI:????.hdf5")

c.read(...) transparently concatenates arrays across all ranks along axis 0.
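
Peer discovery of this kind can be done from the file name alone. The sketch below shows one possible approach and is not necessarily how dendros does it; `mpi_peer_pattern` is a hypothetical helper, and the four-digit rank width is taken from the examples above:

```python
import re

# Matches the rank suffix in names such as "galacticus_MPI:0000.hdf5".
_RANK_SUFFIX = re.compile(r"_MPI:\d{4}(?=\.)")


def mpi_peer_pattern(filename):
    """Return a glob pattern matching every rank's file, or None if not MPI."""
    if _RANK_SUFFIX.search(filename) is None:
        return None
    return _RANK_SUFFIX.sub("_MPI:[0-9][0-9][0-9][0-9]", filename, count=1)


print(mpi_peer_pattern("galacticus_MPI:0003.hdf5"))
print(mpi_peer_pattern("galacticus.hdf5"))
```

The resulting pattern can be passed to `glob.glob`, and sorting the matches gives the files in rank order. Directory names containing glob metacharacters would need `glob.escape` first.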


Lightcone outputs

For lightcone runs the top-level group is typically Lightcone rather than Outputs. Pass output_root to override the default:

c = open_outputs("lightcone.hdf5", output_root="Lightcone")

MCMC analysis

Dendros also reads Galacticus posterior-sample ("MCMC") chain logs, given the config XML used to drive the run. It provides convergence diagnostics, post-burn-in analyses, parameter-file emission, and corner plots:

from dendros import open_mcmc

with open_mcmc("mcmcConfig.xml") as run:
    outliers = run.outlier_chains()
    step = run.convergence_step(threshold=1.1, drop_chains=outliers)

    ess = run.effective_sample_size(post_burn=step)
    fit = run.multivariate_normal_fit(post_burn=step, drop_chains=outliers)
    fit.write_reparameterization_config("reparam.xml")

    map_ = run.maximum_posterior(drop_chains=outliers)
    run.write_parameter_files(map_.state, "max_posterior")

    fig = run.corner_plot(post_burn=step, drop_chains=outliers)

Supported diagnostics and tools include Brooks-Gelman corrected Rhat (with the non-parametric R_interval companion), Geweke z-scores, an iterative Grubbs outlier test on chain final states, Sokal-windowed autocorrelation times, effective sample sizes, sliding-window acceptance rates, projection-pursuit PCA, multivariate-normal fits with reparameterization-config emission, posterior sampling, and base-parameter file generation.

Corner plots require the optional extra: pip install 'dendros[mcmc]'. See the MCMC docs page for details.
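
For orientation, Rhat-type diagnostics build on the basic Gelman-Rubin statistic, which compares between-chain and within-chain variance. The sketch below is the plain, uncorrected form for a single parameter, not the Brooks-Gelman corrected version that dendros reports; `gelman_rubin` is an illustrative name:

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Plain Gelman-Rubin Rhat for one parameter.

    `chains` is a list of equal-length sequences of samples, one per chain.
    """
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    within = mean(variance(c) for c in chains)         # W: mean within-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero")
    between = n * variance([mean(c) for c in chains])  # B: n * variance of chain means
    pooled = (n - 1) / n * within + between / n        # pooled variance estimate
    return (pooled / within) ** 0.5


print(gelman_rubin([[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]))      # overlapping chains
print(gelman_rubin([[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]))  # well-separated chains
```

Values close to 1 suggest the chains agree with one another. The quickstart above uses threshold=1.1 as its convergence criterion.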


Documentation

Full API reference and more examples are available at dendros.readthedocs.io.


Contributing

See CONTRIBUTING.md for development setup, coding style, and how to propose changes.


License

Dendros is released under the GNU General Public License v3.0 or later.
