particleflow

Machine Learning for Particle Flow Reconstruction

These details have not been verified by PyPI

Project description

Summary

ML-based particle flow (MLPF) focuses on developing full event reconstruction for particle detectors using computationally scalable and flexible machine learning models. The project aims to improve particle flow reconstruction across various detector environments, including CMS, as well as future detectors via Key4HEP. We build on existing, open-source simulation software by the experimental collaborations.

[Figure: High-level overview]

TL;DR: I just want to run the code

You can use uv to set up the repo and test that everything works:

git clone --recurse-submodules https://github.com/jpata/particleflow.git
cd particleflow
uv sync
uv run ./scripts/local_test_cld.sh
uv run ./scripts/local_test_cms.sh

Alternatively, you can use a prepared container:

apptainer exec --nv https://jpata.web.cern.ch/jpata/pytorch-20260305-08d6950.sif ./scripts/local_test_cld.sh
apptainer exec --nv https://jpata.web.cern.ch/jpata/pytorch-20260305-08d6950.sif ./scripts/local_test_cms.sh

Datasets

If you wish to train on pre-made datasets, you can download them from the Hugging Face Hub. To download a specific dataset and split (e.g., CLD, PF setup, configuration split 1):

uv run hf download jpata/particleflow \
  --include "tensorflow_datasets/cld/cld_edm_*_pf/1/*" \
  --local-dir data/tfds \
  --repo-type dataset

This will download the requested files into data/tfds/tensorflow_datasets/cld/cld_edm_*_pf/1/.
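To preview which repository paths a pattern like the one above would select, the matching can be sketched with Python's standard `fnmatch` module. This is only an illustration: the file names below are hypothetical, and it assumes that `--include` follows `fnmatch`-style rules (where `*` also matches `/`), which should be checked against the Hugging Face Hub documentation.

```python
from fnmatch import fnmatchcase

# Include pattern taken from the download command above.
PATTERN = "tensorflow_datasets/cld/cld_edm_*_pf/1/*"

# Hypothetical repository paths, for illustration only.
candidate_paths = [
    "tensorflow_datasets/cld/cld_edm_ttbar_pf/1/3.0.0/dataset_info.json",
    "tensorflow_datasets/cld/cld_edm_ttbar_pf/2/3.0.0/dataset_info.json",
    "tensorflow_datasets/clic/clic_edm_ttbar_pf/1/3.0.0/dataset_info.json",
]


def select(paths, pattern):
    """Return the paths matched by an fnmatch-style pattern (case-sensitive)."""
    return [p for p in paths if fnmatchcase(p, pattern)]


selected = select(candidate_paths, PATTERN)
for path in selected:
    print(path)
```

Under these assumptions only the first path is selected: the second belongs to configuration split 2 and the third to a different detector.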

Dataset Upload

To upload a generated dataset to the Hugging Face Hub:

uv run python3 scripts/upload_hf.py --repo jpata/particleflow --spec particleflow_spec.yaml clic 1

Training

Run the training on the downloaded dataset configuration split:

uv run \
    python mlpf/pipeline.py \
    --spec-file particleflow_spec.yaml \
    --production cld \
    --model-name pyg-cld-v1 \
    --data-dir data/tfds/tensorflow_datasets/cld \
    train \
    --data_config 1 \
    --gpu_batch_multiplier 4 \
    --gpus 1

Model Upload

To upload a trained model to the Hugging Face Hub:

uv run python3 scripts/upload_model_hf.py experiments/pyg-clic-hits-v1_clic_20260328_144021_479374 --version v3.1.0

Model Download & Evaluation

To download a specific model (e.g., CLD, cluster-based, version v3.1.0) and run evaluation on a sample ROOT file:

Download the model files from the Hugging Face Hub:

uv run hf download jpata/particleflow \
  --include "cld/clusters/v3.1.0/pyg-cld-v1_cld_20260328_101206_533260/*" \
  --local-dir models \
  --repo-type model

Run the evaluation script:

mkdir -p local_test_data/cld/p8_ee_ttbar_ecm365/root
cd local_test_data/cld/p8_ee_ttbar_ecm365/root
wget -q --no-check-certificate -nc https://jpata.web.cern.ch/jpata/mlpf/cld/v1.2.3_key4hep_2025-05-29_CLD_f1e8f9/gen/root/reco_p8_ee_ttbar_ecm365_300000.root
# return to the repository root (four levels up)
cd ../../../..

uv run python3 mlpf/standalone_eval/key4hep/evaluator.py \
  --input local_test_data/cld/p8_ee_ttbar_ecm365/root/reco_p8_ee_ttbar_ecm365_300000.root \
  --checkpoint models/cld/clusters/v3.1.0/pyg-cld-v1_cld_20260328_101206_533260/checkpoints/best_weights.pth \
  --detector cld \
  --outpath eval_results.parquet

The input ROOT file should be in the EDM4hep format.
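The `--checkpoint` argument above points inside the directory fetched by the download step. A minimal sketch of how the two paths relate, assuming the layout seen in this single example (detector, input type, version, run name) holds for other models as well; the function and parameter names are hypothetical and not part of the package:

```python
from pathlib import PurePosixPath


def checkpoint_path(models_dir, detector, input_type, version, run_name):
    """Build the path to best_weights.pth, following the layout of the example above.

    The directory layout is inferred from one example and may differ for other models.
    """
    return (
        PurePosixPath(models_dir)
        / detector
        / input_type
        / version
        / run_name
        / "checkpoints"
        / "best_weights.pth"
    )


path = checkpoint_path(
    models_dir="models",
    detector="cld",
    input_type="clusters",
    version="v3.1.0",
    run_name="pyg-cld-v1_cld_20260328_101206_533260",
)
print(path)
```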

End-to-end workflow: dataset generation and model training

The full data generation, model training, and validation workflow is managed using Pixi for environment management and Snakemake for job orchestration. Apptainer images provide the software for each step for the different detectors.

# ensure all gen configs are downloaded
git submodule update --init --recursive

# install pixi, then restart your shell or source your .bashrc. only do this once.
curl -fsSL https://pixi.sh/install.sh | bash

# link the configuration for your site. only do this once.
# the braces list the available options: substitute one of them, do not paste them literally.
ln -s configs/{local,tallinn,lxplus}/pixi.toml pixi.toml

# initialize the orchestrator python environment. only do this once.
pixi run init

# generate the snakefile (will overwrite the defaults)
# here too, set PROD to one of the listed productions.
PROD={cms_run3,clic,cld} pixi run snakefile

# run the steps inside screen or tmux (this will take many days and thousands of jobs)
PROD={cms_run3,clic,cld} pixi run gen
PROD={cms_run3,clic,cld} pixi run post
PROD={cms_run3,clic,cld} pixi run tfds
PROD={cms_run3,clic,cld} pixi run train
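The per-production steps above can be sketched as a small driver that sets `PROD` once and runs the tasks in order, stopping at the first failure. This is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes each task is independent of the shell state apart from the `PROD` variable. By default it only prints the commands.

```python
import os
import subprocess

# Task names in the order listed above (the one-off `pixi run init` is excluded).
STEPS = ["snakefile", "gen", "post", "tfds", "train"]
# Productions named in the instructions above.
PRODUCTIONS = {"cms_run3", "clic", "cld"}


def build_commands(prod):
    """Return the pixi commands for one production, in execution order."""
    if prod not in PRODUCTIONS:
        raise ValueError(f"unknown production: {prod!r}")
    return [["pixi", "run", step] for step in STEPS]


def run_all(prod, dry_run=True):
    """Print each command and, unless dry_run is set, execute it with PROD in the environment."""
    env = dict(os.environ, PROD=prod)
    for argv in build_commands(prod):
        print(f"PROD={prod} " + " ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError, so later steps do not run after a failure.
            subprocess.run(argv, env=env, check=True)


run_all("cld")  # dry run: prints the five commands without executing them
```

Calling `run_all("cld", dry_run=False)` would execute the steps, which, as noted above, can take many days.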

Publications

The following publications trace the development of MLPF from early proofs of concept to full detector simulations and fine-tuning studies across detectors.

[2021] First full-event GNN demonstration of MLPF: Paper Code Dataset
[2021] First demonstration in CMS Run 3: Paper CMS-DP
[2022] Improved performance in CMS Run 3: CMS-DP
[2024] Improved performance with full simulation for future colliders: Paper Code Results
[2025] Fine-tuning across detectors: Paper Code
[2026] CMS Run 3 full results: Paper CMS-DP Code

Citations and Reuse

You are welcome to reuse the code in accordance with the LICENSE.

How to Cite

Academic Work: Please cite the specific papers listed in the Publications section above relevant to the method you are using (e.g., initial GNN idea, fine-tuning, or specific detector studies).
Code Usage: If you use the code significantly for research, please cite the specific tagged version from Zenodo.
Dataset Usage: Cite the appropriate dataset via the Zenodo link and the corresponding paper.

Contact

For collaboration ideas that do not fit into the categories above, please get in touch via GitHub Discussions.

Release history

3.1.0 (this version): Apr 7, 2026
0.0.1 (yanked): May 30, 2018. Reason this release was yanked: "Yanking unmaintained code"

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

particleflow-3.1.0.tar.gz (216.9 kB)

Uploaded Apr 7, 2026 Source

Built Distribution

particleflow-3.1.0-py3-none-any.whl (249.4 kB)

Uploaded Apr 7, 2026 Python 3

File details

Details for the file particleflow-3.1.0.tar.gz.

File metadata

Download URL: particleflow-3.1.0.tar.gz
Upload date: Apr 7, 2026
Size: 216.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for particleflow-3.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2dc182f07b645c8840ca141183bdab4a5432a9539b0945b0656160bb298e6d80`
MD5	`56e1fc89739b1299558ef81ad2a1445f`
BLAKE2b-256	`792890e7d409b2d13bb436b545b28efd272df405313626d59bfc9bf860d1a76e`

See more details on using hashes here.
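A downloaded file can be checked against the SHA256 digest listed above using Python's standard `hashlib` module. The sketch below is one possible way to do this; the function names are hypothetical, and the file is assumed to have been downloaded to the given path beforehand.

```python
import hashlib

# SHA256 digest listed above for particleflow-3.1.0.tar.gz.
EXPECTED_SHA256 = "2dc182f07b645c8840ca141183bdab4a5432a9539b0945b0656160bb298e6d80"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("particleflow-3.1.0.tar.gz")` should return `True` for an intact copy of the source distribution.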

Provenance

The following attestation bundles were made for particleflow-3.1.0.tar.gz:

Publisher: pypi-publish.yml on jpata/particleflow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: particleflow-3.1.0.tar.gz
- Subject digest: 2dc182f07b645c8840ca141183bdab4a5432a9539b0945b0656160bb298e6d80
- Sigstore transparency entry: 1247049098
- Sigstore integration time: Apr 7, 2026
Source repository:
- Permalink: jpata/particleflow@a8525f1b6a2ee3de8c5a629ea0347c8fe8edaee0
- Branch / Tag: refs/tags/v3.1.0
- Owner: https://github.com/jpata
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@a8525f1b6a2ee3de8c5a629ea0347c8fe8edaee0
- Trigger Event: push

File details

Details for the file particleflow-3.1.0-py3-none-any.whl.

File metadata

Download URL: particleflow-3.1.0-py3-none-any.whl
Upload date: Apr 7, 2026
Size: 249.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for particleflow-3.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`15390cccf8c0590cb8f719f458245fc3b17d1bce510e86bf6e31841c37a946aa`
MD5	`00467ce7dcba3718d4741e07827b4ea9`
BLAKE2b-256	`c1278cd2ba8ef18417294e9ac648133984cc415b84e3812542e942a9dd9484c0`

Provenance

The following attestation bundles were made for particleflow-3.1.0-py3-none-any.whl:

Publisher: pypi-publish.yml on jpata/particleflow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: particleflow-3.1.0-py3-none-any.whl
- Subject digest: 15390cccf8c0590cb8f719f458245fc3b17d1bce510e86bf6e31841c37a946aa
- Sigstore transparency entry: 1247049102
- Sigstore integration time: Apr 7, 2026
Source repository:
- Permalink: jpata/particleflow@a8525f1b6a2ee3de8c5a629ea0347c8fe8edaee0
- Branch / Tag: refs/tags/v3.1.0
- Owner: https://github.com/jpata
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@a8525f1b6a2ee3de8c5a629ea0347c8fe8edaee0
- Trigger Event: push
