Mutual Information techniques for decorrelation of physics data
HEPINFO is a framework featuring models designed for decorrelating physics data using axol1tl with mutual information. The repository is under active development, and the name "miaxol1tl" is currently a working title.
Introduction
This repository contains the code supporting the poster "Adaptive Machine Learning on FPGAs: Bridging Simulated and Real-World Data in High-Energy Physics", presented at the EuCAIFCon24 conference. The code is designed to facilitate machine learning-based decorrelation techniques in high-energy physics, focusing on simulated and real-world datasets.
Data
To use the repository, follow these steps to set up the required datasets:
Flavours of Physics Dataset
- Download the data from the Flavours of Physics: Finding τ → μμμ competition.
ttbar / Dijet Data Generation
- Navigate to the data directory:
  cd data
- Build the Docker image (adapted from FastSimulation):
  - Linux:
    docker build -f Dockerfile -t fastsim:latest .
  - TODO: M1/2/3 Mac (the image builds but running madgraph does not work):
    docker build --platform linux/x86_64 -f Dockerfile -t fastsim:latest .
- Start the Docker container:
  docker run -it --rm -v $(pwd):/scratch fastsim:latest
  cd scratch
- Generate enough pileup events:
  - Note: You will need N pileup events for each real event, where N can range from 20 to 200, depending on the target LHC run conditions. More details can be found in the pileup documentation.
  - Update the pileup command file:
    cp /opt/delphes/examples/Pythia8/generatePileUp.cmnd generatePileup100k.cmnd
    Edit Main:numberOfEvents to a higher value (e.g., 100k or more).
  - Run the pileup generation:
    DelphesPythia8 /opt/delphes/cards/converter_card.tcl generatePileup100k.cmnd MinBias100k.root
  - Convert the ROOT file to a pileup format:
    root2pileup MinBias.pileup MinBias100k.root
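The edit to Main:numberOfEvents in the pileup step can also be scripted. The following is a minimal sketch and not part of this repository: the helper names set_number_of_events and update_cmnd_file are hypothetical, and it assumes the setting appears in the command file as a line of the form `Main:numberOfEvents = <integer>`, optionally followed by a `!` comment.

```python
import re
from pathlib import Path

# Matches the setting name and "=" (group 1) followed by the current integer value.
_PATTERN = re.compile(r"^(\s*Main:numberOfEvents\s*=\s*)\d+", flags=re.MULTILINE)


def set_number_of_events(cmnd_text: str, n_events: int) -> str:
    """Return cmnd_text with the Main:numberOfEvents value replaced by n_events."""
    new_text, n_subs = _PATTERN.subn(lambda m: f"{m.group(1)}{n_events}", cmnd_text)
    if n_subs == 0:
        raise ValueError("no Main:numberOfEvents setting found")
    return new_text


def update_cmnd_file(path: str, n_events: int) -> None:
    """Rewrite the command file at `path` in place (hypothetical helper)."""
    cmnd = Path(path)
    cmnd.write_text(set_number_of_events(cmnd.read_text(), n_events))


# Example on an in-memory line; any trailing comment is preserved.
example = "Main:numberOfEvents = 1000 ! number of events to generate\n"
updated = set_number_of_events(example, 100_000)
print(updated, end="")
```

Inside the container, `update_cmnd_file("generatePileup100k.cmnd", 100_000)` would apply the same change to the copied file before running DelphesPythia8.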
- Generate the processes:
  After the previous stage you have a pileup file with 100k events. Next, generate the actual physics events. You need both dijet and ttbar; a script for a Higgs process is given as well:
madgraph_dijet.script:
import model sm
define p = g u c d s u~ c~ d~ s~
define l+ = e+ mu+ ta+
define l- = e- mu- ta-
define vl = ve vm vt
define vl~ = ve~ vm~ vt~
define l = l+ l-
define ln=vl vl~
generate p p > j j
output dijet_process_42
launch dijet_process_42
done
set nevents 100000
set gseed 42
done
madgraph_ttbar.script:
import model sm
define p = g u c d s u~ c~ d~ s~
define l+ = e+ mu+ ta+
define l- = e- mu- ta-
define vl = ve vm vt
define vl~ = ve~ vm~ vt~
define l = l+ l-
define ln=vl vl~
generate p p > t t~
output ttbar_process_42
launch ttbar_process_42
done
set nevents 100000
set gseed 42
done
madgraph_higgs.script:
import model heft
define v = z w+ w-
generate p p > h j j $$v QCD=0, h > l+ l- vl vl~
output higgs_process_42
launch higgs_process_42
done
set nevents 100000
set gseed 42
done
Create these files inside the Docker container and run MadGraph on each of them. For the dijet script, for example:
/opt/MG5_aMC_v2_7_2/bin/mg5_aMC madgraph_dijet.script
This produces an LHE file, which is needed as input for the next step.
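The three MadGraph scripts above differ only in the model, the multiparticle definitions, and the generate line, so they can be produced from one template. The sketch below is an illustration rather than code from this repository: make_script, PROCESSES, and SM_DEFINES are hypothetical names, and tying the `_42` directory suffix to the seed is an assumption based on the scripts shown.

```python
# Multiparticle definitions shared by the dijet and ttbar scripts above.
SM_DEFINES = [
    "define p = g u c d s u~ c~ d~ s~",
    "define l+ = e+ mu+ ta+",
    "define l- = e- mu- ta-",
    "define vl = ve vm vt",
    "define vl~ = ve~ vm~ vt~",
    "define l = l+ l-",
    "define ln=vl vl~",
]

# name -> (model, define lines, process string), copied from the scripts above.
PROCESSES = {
    "dijet": ("sm", SM_DEFINES, "p p > j j"),
    "ttbar": ("sm", SM_DEFINES, "p p > t t~"),
    "higgs": ("heft", ["define v = z w+ w-"],
              "p p > h j j $$v QCD=0, h > l+ l- vl vl~"),
}


def make_script(name: str, nevents: int = 100_000, seed: int = 42) -> str:
    """Return the text of madgraph_<name>.script for the given settings."""
    model, defines, process = PROCESSES[name]
    out_dir = f"{name}_process_{seed}"  # assumption: suffix equals the seed
    lines = [
        f"import model {model}",
        *defines,
        f"generate {process}",
        f"output {out_dir}",
        f"launch {out_dir}",
        "done",
        f"set nevents {nevents}",
        f"set gseed {seed}",
        "done",
    ]
    return "\n".join(lines) + "\n"


dijet_script = make_script("dijet")
print(dijet_script)
```

Writing each result to madgraph_<name>.script and passing it to mg5_aMC as shown above keeps the event count and seed consistent across processes.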
- Running Delphes:
To produce the final ROOT files, first copy the Delphes card:
cp /opt/delphes/cards/delphes_card_CMS_PileUp.tcl delphes_card_CMS_PileUp.tcl
In the copied card, adjust the path to the previously generated pileup file. An ATLAS card can be used instead of the CMS one.
Next, set the path to the LHE file in the file pythia_card, and set the number of events there to match the number of events in the LHE file.
pythia_card (has to be created inside the Docker container):
! 1) Settings used in the main program.
Main:numberOfEvents = 10000 ! number of events to generate
Main:timesAllowErrors = 3 ! how many aborts before run stops
! 2) Settings related to output in init(), next() and stat().
Init:showChangedSettings = on ! list changed settings
Init:showChangedParticleData = off ! list changed particle data
Next:numberCount = 10000 ! print message every n events
Next:numberShowInfo = 1 ! print event information n times
Next:numberShowProcess = 1 ! print process record n times
Next:numberShowEvent = 1 ! print event record n times
! Adjust tau decays
15:onMode = off
15:onIfAny = 11 13
! 3) Set the input LHE file
Beams:frameType = 4
Beams:LHEF = /scratch/WZ_process/Events/run_01/unweighted_events.lhe.gz
In the dijet case, the LHE file is dijet_process_42/Events/run_01/unweighted_events.lhe.gz (the directory name comes from the output line of madgraph_dijet.script). Then run:
DelphesPythia8 delphes_card_CMS_PileUp.tcl pythia_card output.root
With Main:numberOfEvents set to 100000, this produces an output file with 100k dijet events. Repeat this step for ttbar and higgs, and store the resulting files in the data folder as dijet.root, ttbar.root and higgs.root.
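Because the Pythia card changes only in the event count and the LHE path from one process to the next, it can be filled in from a template. The following is a sketch under stated assumptions, not code from this repository: the function names, the per-process card names (pythia_card_dijet and so on), and the /scratch/<name>_process_42 directory layout are hypothetical choices made for illustration.

```python
# Template copied from the pythia_card above, with two placeholders.
PYTHIA_CARD_TEMPLATE = """\
! 1) Settings used in the main program.
Main:numberOfEvents = {n_events} ! number of events to generate
Main:timesAllowErrors = 3 ! how many aborts before run stops
! 2) Settings related to output in init(), next() and stat().
Init:showChangedSettings = on ! list changed settings
Init:showChangedParticleData = off ! list changed particle data
Next:numberCount = 10000 ! print message every n events
Next:numberShowInfo = 1 ! print event information n times
Next:numberShowProcess = 1 ! print process record n times
Next:numberShowEvent = 1 ! print event record n times
! Adjust tau decays
15:onMode = off
15:onIfAny = 11 13
! 3) Set the input LHE file
Beams:frameType = 4
Beams:LHEF = {lhe_path}
"""


def lhe_path(process: str, seed: int = 42, base: str = "/scratch") -> str:
    """Assumed location of the MadGraph output for a process."""
    return f"{base}/{process}_process_{seed}/Events/run_01/unweighted_events.lhe.gz"


def make_pythia_card(process: str, n_events: int = 100_000) -> str:
    """Return the card text for one process."""
    return PYTHIA_CARD_TEMPLATE.format(n_events=n_events, lhe_path=lhe_path(process))


def delphes_command(process: str) -> list[str]:
    """Argument list for DelphesPythia8 (card and output names are assumptions)."""
    return ["DelphesPythia8", "delphes_card_CMS_PileUp.tcl",
            f"pythia_card_{process}", f"{process}.root"]


for name in ("dijet", "ttbar", "higgs"):
    card_text = make_pythia_card(name)  # would be written to pythia_card_<name>
    print(" ".join(delphes_command(name)))
```

Naming the output files dijet.root, ttbar.root and higgs.root directly avoids a separate renaming step after each run.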
- Generate the dataset from the root files:
cd data
python extract_data.py
There are 57 variables: MET, 4 electrons, 4 muons, and 10 jets make 19 objects, each with 3 parameters, giving 19 × 3 = 57 variables. In addition, the pileup is read as "Vertex_size". The generated files therefore have the following columns: vertex_size,misMET,misEta,misPhi,e0PT,e0Eta,e0Phi,...,m0PT,m0Eta,m0Phi,...,j0PT,j0Eta,j0Phi,...
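The column layout can be written out explicitly to check the count. This sketch is not taken from extract_data.py: the function name column_names is hypothetical, and it assumes the elided columns follow the same prefix, index, and PT/Eta/Phi pattern as the ones listed.

```python
def column_names() -> list[str]:
    """Build the assumed column order: pileup, MET, electrons, muons, jets."""
    cols = ["vertex_size", "misMET", "misEta", "misPhi"]
    # (prefix, number of objects): 4 electrons, 4 muons, 10 jets.
    for prefix, count in (("e", 4), ("m", 4), ("j", 10)):
        for i in range(count):
            cols += [f"{prefix}{i}{param}" for param in ("PT", "Eta", "Phi")]
    return cols


cols = column_names()
n_physics = len(cols) - 1  # everything except vertex_size
print(n_physics, len(cols))  # 57 58
```

The 57 physics variables plus vertex_size give 58 columns in total.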
Code for τ → 3μ
The primary code for analyzing the τ → 3μ process is detailed in the notebooks/tau_3mu.ipynb notebook. For the VHDL simulation of the Bernoulli layer, an example testbench is available in the tb folder.
Contributions: Contributions and collaborations are welcome. Please open an issue or submit a pull request to suggest improvements or report issues.
License: This project is licensed under the MIT License. See the LICENSE file for details.
We hope this repository aids in advancing decorrelation techniques and adaptive machine learning in high-energy physics.