Skip to main content

NuGraph2: A Graph Neural Network for neutrino physics event reconstruction

Project description

NuGraph: a Graph Neural Network (GNN) for neutrino physics event reconstruction

This repository contains a GNN architecture for reconstructing particle interactions in neutrino physics detector environments. Its primary function is the classification of detector hit particle type through semantic segmentation, with additional secondary functions such as background hit rejection, event classification, clustering and vertex reconstruction.

Installation

This repository can be installed in Python via pip, although using Anaconda to install dependencies is strongly recommended. Detailed instructions on how to easily install all necessary dependencies are available here.

Once dependencies are installed, you can simply clone this repository and installing it via pip – if you intend to carry out any development on the code, installing in editable mode is recommended:

git clone git@github.com:exatrkx/NuGraph
pip install --no-deps -e ./NuGraph

Training a model

You can train the model using a processed graph dataset as input by executing the train.py script in the scripts subdirectory. This script accepts many arguments to configure your training – for a complete summary of all available arguments, you can simply run

scripts/train.py --help

As an example, to train the network for semantic segmentation on the Heimdall cluster, one might run

scripts/train.py --data-path /raid/uboone/NuGraph2/NG2-paper.gnn.h5 \
                 --logdir /raid/$USER/logs --name default --version semantic-filter \
                 --semantic --filter

This command would start a network training using the requested input dataset, training with the semantic head enabled, and writing network parameters and metrics to the directory /raid/$USER/logs/default/semantic-filter.

Training on SLURM clusters

If you're working on a cluster that uses the SLURM batch submission system, such as the Wilson cluster at Fermilab, then you'll need to submit training via a batch script instead. An example batch script train_batch.sh is included in the scripts subdirectory. If you're training on the Wilson cluster, you can submit a training job by running

sbatch scripts/train_batch.sh <args>

where <args> are the same argument you'd pass if you were executing the training script locally.

If you're training on a SLURM environment other than the Wilson cluster, you'll need to edit the SLURM directives in the script appropriately for the cluster you're working on before submitting.

Metric logging

In the above example, training outputs including logging metrics would be written to a subdirectory of /raid/$USER/logs. We can use the Tensorboard interface to visualise these metrics and track the network's training progress. You can start Tensorboard using the following command:

tensorboard --port XXXX --bind_all --logdir /raid/$USER/logs --samples_per_plugin 'images=200'

In the above example, you should replace XXXX with a unique port number of your choosing. Provided you're forwarding that port when working over SSH, you can then access the interface in a local browser at localhost:XXXX.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nugraph-24.4.0.tar.gz (37.5 kB view hashes)

Uploaded Source

Built Distribution

nugraph-24.4.0-py3-none-any.whl (28.6 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page